Multi-processor system - Patent 0231526

(19)

(11)	EP 0 231 526 B1

(12)	EUROPEAN PATENT SPECIFICATION

(45)	Mention of the grant of the patent:
	11.03.1992 Bulletin 1992/11

(21)	Application number: 86118136.0

(22)	Date of filing: 30.12.1986

(51)	International Patent Classification (IPC)⁵: G06F 9/38, G06F 15/16

(54)	Multi-processor system
	Vielfachprozessorsystem
	Système multiprocesseur

(84)	Designated Contracting States:
	DE FR GB

(30)	Priority: 08.01.1986 JP 563/86

(43)	Date of publication of application:
	12.08.1987 Bulletin 1987/33

(73)	Proprietor: HITACHI, LTD.
	Chiyoda-ku, Tokyo 100 (JP)

(72)	Inventors:
	Inagami, Yasuhiro, 1473 Josuihoncho, Kodaira-shi (JP)
	Nakagawa, Takayuki, Hitachi Daiyon Kyoshinryo, Kokubunji-shi (JP)
	Nagashima, Shigeo, Hachioji-shi (JP)

(74)	Representative: Strehl Schübel-Hopf Groening & Partner
	Maximilianstrasse 54 80538 München (DE)

(56)	References cited:
	EP-A- 0 042 442
	EP-A- 0 123 337
	NEW ELECTRONICS, vol. 18, no. 19, October 1985, pages 66-72, London, GB; J.C. MATHON: "Interfacing the 32081 as a floating point peripheral"
	IEEE Micro, vol. 4, no. 3, June 1984, pages 7-19, IEEE, New York, US; B. FURHT et al.: "An efficient software driver for Am9511 arithmetic processor implementation"
	PATENT ABSTRACTS OF JAPAN, vol. 6, no. 195 (P-146)[1073], 5th October 1982; & JP-A-57 105 070 (FUJITSU K.K.) 30-06-1982

Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).

Description

BACKGROUND OF THE INVENTION

FIELD OF THE INVENTION

The present invention relates to a multi-processor system provided with a facility for allowing synchronous communications between processors arranged in a master and slave relationship.

DESCRIPTION OF THE PRIOR ART

[0001] For the purpose of accomplishing scientific computations or calculations at an increased speed, there has been developed a high-speed processor for executing at a high speed the arithmetic operations for those arrays which occur at a high frequency in the scientific calculation. The system for processing the arithmetic operations for the arrays at a high speed may be generally classified into two categories, i.e., a vector processor designed for processing one-dimensional vectors through pipeline at a high speed and a parallel processing system including a plurality of processors arranged in parallel with one another for executing processings in parallel. Although the application of the present invention is not restricted to the vector processor or the parallel processor, it seems convenient to elucidate the problems of the hitherto known systems in conjunction with the vector processor for facilitating the understanding of the underlying concept of the present invention.

[0002] The vector processor includes a vector processing mechanism for processing through pipeline at a high speed a series of array data (vector data) ordered in a sequence. However, it is not possible to process all the vector data with a single program. There exist those data which can not but be processed through sequential processing (referred to as the scalar processing) as in the case of conventional general purpose computer. Under the circumstances, the vector processor includes in addition to the vector processing mechanism for pipeline-processing of the vector data at a high speed a scalar processing mechanism for realizing the function analogous to that of the hitherto known general purpose computer. Concerning the relationship to be established between the vector processing mechanism and the scalar processing mechanism incorporated in the vector processor, several approaches may be conceived. In many vector processors, however, the vector processing mechanism is physically separated from the scalar processing mechanism.

[0003] As an example of the processor incorporating the vector processing mechanism and the scalar processing mechanism described above, there can be mentioned a processor disclosed in GB-A- 2 113 878. The vector processor disclosed in this publication is composed of a scalar processing unit corresponding to the aforementioned scalar processing mechanism and a vector processing unit corresponding to the vector processing mechanism mentioned above.

[0004] More specifically, in the case of the processor system disclosed in GB-A- 2 113 878, the vector processor is activated only after a previous or preparatory setting procedure such as loading of address data required for the vector processing in registers incorporated in the vector processor has been executed by the scalar processor. Upon completion of the vector processing, the vector processor informs the scalar processor of the completion of vector processing by issuing an interrupt to the scalar processor or by taking advantage of the test performed by the scalar processor. On the other hand, the scalar processor executes predetermined scalar processing by utilizing the results of the vector processing. In this manner, in the case of this known system, all the data required for the vector processing are placed in the vector processor before activation of the latter. It is however noted that not every vector instruction commanding the vector processing requires all the data to be supplied from the scalar processor. Thus, execution of a vector instruction which requires only a part of the data supplied from the scalar processor involves a problem of wasteful loss of time (dead time), because the execution of such a vector instruction is allowed only after all the data have been set.

[0005] As described above, the scalar processor can perform the scalar processing after completion of the vector processing in the vector processor by utilizing the results of the vector processing. In this connection, it is also noted that each of the scalar instructions commanding the scalar processing does not require all the results of the vector processing. In other words, execution of such scalar instruction which requires only a part of the results of the vector processing has to wait for completed execution of all the vector processings, which in turn means that wasteful loss of time is involved, giving rise to an additional problem.

[0006] In EP-A- 0 042 442 is described an information processing system, comprising a main storage and a data processor. The data processor contains therein an instruction controller and a plurality of arithmetic units, the instruction controller functions to receive the instructions from the main storage and distribute the same to the respective arithmetic units for parallel execution.

[0007] A synchronization instruction (WAIT) is inserted, in advance, into the sequential instruction stream supplied from the main storage, such that the instructions which follow said synchronization instruction (WAIT) are not executed until the execution of the preceding instructions has been completed.

SUMMARY OF THE INVENTION

[0008] It is therefore an object of the present invention to provide a multi-processor system in which individual processors are imparted with capability of performing parallel or multiple processings with improved efficiency by providing each processor with means for accomplishing fine synchronous communication control among a number of the processors.

[0009] In a multi-processor system including a master processor and at least one slave processor which requires for executing the processing operation thereof data to be made available by the master processor, when the slave processor starts the execution of an instruction which requires only a part of data available from the master processor in response to the setting of that part of data in the slave processor, erroneous operation will take place if an instruction which is to be executed in succession to the above mentioned instruction and which requires the setting of other data than the above mentioned partial data is started before the setting of the other data. For excluding such erroneous operation, it is necessary to establish, so to say, a synchronism or synchronization between the master processor and the slave processor in such a manner that only after completed execution of a certain processing, e.g., a particular instruction in the slave processor, the processing by the slave processor is interrupted until a certain processing by the master processor, e.g., loading of data required for execution of a next instruction in the slave processor has been executed, and then the processing in the slave processor is allowed to be restarted. However, the processing steps which require such synchronization will differ from one to another program. Accordingly, the synchronization has to be established on the instruction basis (i.e. instruction by instruction).

[0010] In the case of the multi-processor system according to the present invention as claimed in claim 1, the slave processor is imparted with a function capable of executing such an instruction that upon execution of an instruction for stopping temporarily the processing, a stop or pause indication is produced in the slave processor, whereby activation of the slave processor for executing a next instruction is inhibited until the stop or pause indication is reset by the master processor. On the other hand, the master processor is imparted with a function or capability to execute such an instruction with which it is checked whether the stop or pause indication is issued in the slave processor and resets the stop indication when it is issued, while generation of the stop indication is awaited if it is not issued. With this arrangement, synchronization can be established between the master processor and the slave processor on the instruction basis. More specifically, it is assumed, by way of example, that the master processor activates the slave processor by setting only the data that are required for executing a certain instruction by the slave processor, which in its turn stops the processing after execution of the aforementioned instruction. The master processor can reset the stop indication issued by the slave processor after having set the data required for execution of a next instruction to be executed by the slave processor, which can then be activated for executing the next instruction in response. In this way, the time taken for processing can be reduced significantly. It is again assumed that the master processor serves as a scalar processor with the vector processor serving as the slave processor. In that case, even if many processing steps (e.g. 100 steps) are involved in the processing for setting data in the vector processor by the scalar processor, significant reduction in the time taken for the processing can be accomplished by virtue of such arrangement that the vector processor is allowed to start the processing without waiting for the setting of all data from the scalar processor.

[0011] In order to make it possible for the master processor to perform a processing by utilizing the interim result of operation executed by the slave processor on the way of the execution, while assuring that the slave processor can continue the operation by using the data initially set by the master processor to thereby complete the arithmetic operation with high speed or within a short time, it is necessary that the slave processor informs the master processor of execution (partial completion) of the processing performed to a particular step while continuing the processing under execution so that the master processor can utilize as early as possible the result of the processing executed subsequently by the slave processor in continuation. Also in this case, since the particular processing step at which the master processor requires the interim result of the arithmetic operation performed by the slave processor differs from one to another program, it is necessary for the slave processor to inform the master processor of the completion of execution of instruction on the instruction basis.

[0012] In the multi-processor system according to the present invention as claimed in claim 3, the slave processor is imparted with such a function to execute an instruction for indicating completion of execution of a succeeding instruction by discriminatively detecting the completed execution of the succeeding one, while the master processor is imparted with a function or capability to execute an instruction for checking whether the completed execution is indicated by the slave processor to thereby reset the indication of completed execution if it is issued and otherwise wait for generation of the indication of completed execution.

[0013] When the slave processor is constituted by the vector processor, the arithmetic unit, vector register and others participating in the execution of instructions may differ from one to another instruction. To deal with such situation, the slave processor may be provided with means for storing the decoded result of a succeeding instruction (i.e. information associated with the execution of that instruction) in response to an instruction for identifying the completion of execution of that succeeding instruction, wherein the decoded result is comparatively collated with information produced by the slave processor as the result of completion of execution of that instruction, to thereby identify the completion of execution of the instruction.
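By way of illustration only, the collation of the stored decoded result with the completion information may be sketched in Python as follows. The names Decoded, CompletionWatcher, unit and dest are hypothetical and do not appear in the specification; the sketch further assumes that the decoded result consists merely of the arithmetic unit used and the vector register written.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decoded:
    """Hypothetical decoded information for one vector instruction."""
    unit: str   # arithmetic unit taking part in the execution (assumed field)
    dest: str   # vector register written by the instruction (assumed field)


class CompletionWatcher:
    """Toy model: remember the instruction that follows a signal instruction
    and raise an indication once exactly that instruction has completed."""

    def __init__(self) -> None:
        self.armed = False                    # set by the signal instruction
        self.watched: Optional[Decoded] = None
        self.signal = False                   # indication of completed execution

    def arm(self) -> None:
        """Signal instruction: watch the next instruction to be issued."""
        self.armed = True

    def issue(self, decoded: Decoded) -> None:
        """An instruction is issued; store its decoded result if armed."""
        if self.armed:
            self.watched = decoded
            self.armed = False

    def complete(self, decoded: Decoded) -> None:
        """Completion information produced by the slave is collated here."""
        if self.watched is not None and decoded == self.watched:
            self.signal = True
            self.watched = None
```

In this sketch, completion of an instruction other than the watched one leaves the indication unchanged, which reflects the discriminative detection described above.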

[0014] With the arrangements described above, the slave processor can perform arithmetic operation under the command of the master processor and issue indication of completion of execution of a particular instruction, as occasion requires, on the way of the operation, while the master processor can detect discriminatively completion of execution of a particular instruction performed by the slave processor to thereby carry out arithmetic operation by utilizing the results of the arithmetic operation performed by the slave processor up to that time point, whereby the overall processing time can be shortened significantly. When the master processor is the scalar processor with the slave processor being the vector processor, the scalar processor can perform processing by utilizing the interim result available from the vector processor in the course of an operation in which the execution of a vector instruction requires a relatively long time. Thus, significant reduction in the processing time can be attained.

[0015] Implementation of the aforementioned functions in the master processor and the slave processor can be realized by addition and modification of logic circuits of the conventional processor on a relatively small scale. The synchronization control according to the invention exerts scarcely any influence on the instruction sequence adopted heretofore. Accordingly, the burden to be borne by language compilers and others due to application of the present invention can be very small.

[0016] In summary, in a system including a plurality of processors which are interconnected in master and slave relation, the slave processor can stop the instruction activation processing at any given time point while the master processor can clear or remove the stop or pause. Further, indication of the occurrence of event in the slave processor can be made on the instruction basis. On the other hand, the master processor can stop temporarily the processing until the occurrence of event is indicated. Besides, fine synchronization control can be effectuated between the master processor and the slave processor on the instruction basis. These are main advantages attendant on the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] These and other objects and advantages of the present invention will become more apparent upon reading the following detailed description taken in conjunction with the drawings, in which:

Fig. 1 is a view showing a general arrangement of a vector processor;

Fig. 2 is a view for illustrating synchronous communication means in a hitherto known multi-processor system;

Fig. 3 is a view showing a FORTRAN program used in conjunction with description of a multi-processor system according to an exemplary embodiment of the invention;

Figs. 4a and 4b are views for illustrating scalar object codes and vector object codes corresponding to the FORTRAN program shown in Fig. 3 employed in the hitherto known multi-processor system;

Figs. 5a and 5b are views for illustrating, respectively, scalar object codes and vector object codes corresponding to the FORTRAN program shown in Fig. 3 to be employed in the multi-processor system according to the invention;

Fig. 6 is a view showing a time chart for illustrating execution of the object codes shown in Figs. 4a and 4b;

Fig. 7 is a view showing a time chart for illustrating execution of the object codes shown in Figs. 5a and 5b; and

Fig. 8 is a view showing a circuit for carrying out the synchronization control according to the invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0018] Before entering into detailed description of an exemplary embodiment of the present invention, an arrangement of a hitherto known vector processor will be considered.

[0019] Fig. 1 is a view showing an arrangement of a vector processor such as disclosed in GB-A-2 113 878. In this figure, there are shown those portions of the vector processor which are relevant to the invention. It should further be added that the general arrangement of the vector processor to which the invention can be applied is substantially similar to that shown in Fig. 1 and differs from the latter in the respect that a circuit described later on by reference to Fig. 8 is additionally incorporated. Now, referring to Fig. 1, a reference numeral 1 denotes a main storage, 2 denotes a main storage controller, 3 denotes a scalar processing unit. A numeral 31 denotes a cache or a high-speed buffer memory for storing a map of a segment of the main storage. A numeral 32 denotes a group of registers which may include, for example, sixteen general purpose registers and sixteen floating point registers. A numeral 33 denotes a group of functional units for performing operations in the scalar processing unit. A numeral 34 denotes a scalar instruction controller for performing reading, decoding and controlling the execution of scalar instructions which correspond to those employed in the hitherto known general purpose computer. A numeral 41 denotes a group of registers incorporated in the vector processing unit which may include, for example, a group of vector registers and a group of scalar registers. The group of vector registers may include, for example, thirty-two vector registers each of which may hold vector data consisting of 256 elements, by way of example. The scalar register group may include, for example, thirty-two scalar registers each of which is destined to hold scalar data as in the case of the general purpose register and the floating point register incorporated in the scalar processing unit.
A reference numeral 42 denotes a group of vector arithmetic units for processing by pipeline the data read out from the vector register or the scalar register, the results of the processing being stored in the vector register or scalar register. As the vector operation units, there can be mentioned adders and multipliers. A numeral 43 denotes a group of vector address registers used for indicating location of vector data in the main storage when the vector processing unit 4 reads or writes the vector data from or to the main storage 1. The vector address register is composed of a vector base register (VBR) used for holding the base address of the vector data and a vector increment register (VIR) for holding inter-element space of the vector data. A numeral 44 denotes a vector instruction execution controller for reading and decoding vector instruction and controlling the execution thereof.
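The role of the vector base register and the vector increment register may be illustrated by the following minimal Python sketch. It assumes, without this being stated in the specification, that the address of the i-th element is obtained as the base address plus i times the inter-element space, both expressed in the same address unit; the function name element_addresses is hypothetical.

```python
def element_addresses(vbr: int, vir: int, count: int) -> list:
    """Return the assumed main-storage addresses of `count` vector elements.

    vbr   -- content of the vector base register (base address)
    vir   -- content of the vector increment register (inter-element space)
    count -- number of vector elements to be accessed
    """
    # Assumption: element i is located at base + i * increment.
    return [vbr + i * vir for i in range(count)]


# Four elements spaced 8 address units apart, starting at address 0x1000.
print(element_addresses(0x1000, 8, 4))  # [4096, 4104, 4112, 4120]
```

Under this assumption, an increment equal to the element size describes a contiguous array, while a larger increment describes, for example, every n-th element of an array.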

[0020] Next, description will be made concerning operations of the scalar processing unit and the vector processing unit upon execution of a program.

[0021] For performing the vector processing, preprocessing such as previous loading of such data or values to the vector address registers which are used when the vector data are read out from the main storage is required. In the hitherto known vector processor shown in Fig. 1, the vector processing is executed in accordance with the procedure described below.

PROCEDURE 1

[0022] In precedence to the start of the vector processing, predetermined values requisite for executing the vector processing are loaded in the vector address registers and the scalar registers in the scalar processing unit.

PROCEDURE 2

[0023] Information concerning the base addresses of the main storage where the vector instruction string is stored, the number of elements of the vector data to be processed and the like is sent to the vector processing unit from the scalar processing unit to thereby activate the vector processing unit.

PROCEDURE 3

[0024] The activated vector processing unit reads out and executes the vector instructions sequentially in accordance with information sent from the scalar processing unit to perform the vector processing.

PROCEDURE 4

[0025] After vector processing unit has been activated, the scalar processing unit can perform independently other scalar processing such as, for example, preparation for the succeeding vector processing in parallel with execution of the vector processing by the vector processing unit.

PROCEDURE 5

[0026] Completion or end of execution of the vector processing in the vector processing unit is dealt with by testing the status of the vector processing unit by the scalar processing unit or by issuing an interrupt to the scalar processing unit from the vector processing unit.

[0027] As will be appreciated from the above, the relation between the scalar processing unit and the vector processing unit is such that the former is a master with the latter being a slave, wherein the processing proceeds in such a manner that the vector processing unit executes the vector processing under the command issued by the scalar processing unit.

[0028] Fig. 2 shows instructions prepared for allowing synchronous communication between the scalar processing unit and the vector processing unit in the hitherto known processor shown in Fig. 1. All of these instructions are decoded and executed by the scalar processing unit serving as the master processing unit.

[0029] Next, taking as the example a simple FORTRAN-program processing, description will be made in what manner the synchronous communication is carried out between the scalar processing unit and the vector processing unit for proceeding with execution of the processings, while making clear the problems as involved.

[0030] Fig. 3 is a view showing an example of the FORTRAN program, in which a DO-loop including statements indicated by the statement identifying numbers 2 to 6 is processed in the vector processing unit, while the other statements are processed in the scalar processing unit.

[0031] Figs. 4a and 4b are views showing object programs corresponding to the FORTRAN program shown in Fig. 3. The object programs include scalar object codes to be executed by the scalar processing unit and vector object codes to be executed by the vector processing unit, the scalar object codes being shown in Fig. 4a while the vector object codes are shown in Fig. 4b. In the scalar object codes shown in Fig. 4a, eleven scalar instructions whose IDs (identifications) are labelled S1 to S11, respectively, are for the preparation processing which is executed in precedence to the vector processing. Among them, ten instructions S2 to S11 are used for loading the address information for arrays A, B, C, P and Q in the program shown in Fig. 3 in the vector base register (VBR) and the vector increment register (VIR) incorporated in the vector processing unit. The instruction S1 is used for placing the initial value 0.0 of a variable S contained in the program shown in Fig. 3 in the scalar register provided in the vector processing unit. The scalar instruction ID designated by S12 (hereinafter expressed in the form such as ID-S12) is used for activating the vector processing unit by informing the latter of the addresses of the main storage where the vector object codes shown in Fig. 4b are stored (detailed description in this respect is omitted). In response thereto, the vector processing unit executes sequentially the instructions shown in Fig. 4b as vector object codes. The scalar instruction ID-S13 is used for testing whether the vector processing unit is in the operating state or in the idle state, the result of this test being reflected in the condition code. (This instruction is referred to as the vector processor test instruction.) When the vector processing unit is in the operating state, this means that execution of the activated vector processing is not completed yet.
In this case, a BC instruction (branch-on-condition instruction) designated by S14 branches back to the instruction ID-S13, whereby the completion of the vector processing is waited for. At the end of the processing performed by the vector processing unit, the result of the summing operation (the variable S in the program shown in Fig. 3) placed in the scalar register at the zeroth address in the vector processing unit is transferred to the floating point register at the zeroth address in the scalar processing unit to be utilized in the succeeding operation (refer to the statement identified by the number 7 in the program shown in Fig. 3).
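The waiting loop formed by the instructions ID-S13 and ID-S14 may be pictured by the following Python sketch, which is a toy model only. The class VectorUnit, the function wait_and_fetch and the condition-code values BUSY and IDLE are hypothetical; the vector processing unit is assumed to finish after a fixed number of steps, and one step is assumed to elapse between successive tests.

```python
BUSY, IDLE = 1, 0  # hypothetical condition-code values


class VectorUnit:
    """Toy vector processing unit that stays busy for a fixed number of steps."""

    def __init__(self, steps: int, result: float) -> None:
        self.remaining = steps
        self.result = result
        self.scalar_register_0 = None   # receives the sum when processing ends

    def step(self) -> None:
        """Advance the vector processing by one step."""
        if self.remaining > 0:
            self.remaining -= 1
            if self.remaining == 0:
                self.scalar_register_0 = self.result

    def test(self) -> int:
        """Model of the vector processor test instruction (ID-S13)."""
        return BUSY if self.remaining > 0 else IDLE


def wait_and_fetch(vu: VectorUnit):
    """Scalar side: loop on the test until the vector unit is idle."""
    polls = 0
    while True:
        polls += 1
        condition_code = vu.test()      # ID-S13 sets the condition code
        if condition_code != BUSY:      # ID-S14 branches back while busy
            break
        vu.step()                       # time passes between two tests
    # Only now may the result be moved to the scalar processing unit.
    return vu.scalar_register_0, polls
```

The sketch makes visible that the scalar side learns nothing except "operating" or "idle", so that no result can be fetched before the whole vector processing has ended.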

[0032] The processing performed with the aid of the synchronous communication unit effected between the scalar processing unit and the vector processing unit as described above suffers problems mentioned below.

(1) The address information for all the array data used in the vector processing has to be loaded in the address registers incorporated in the vector processing unit in precedence to the start of execution of the vector processing.
However, in order to execute the vector load instruction corresponding to the instruction V1 in the vector object codes shown in Fig. 4b, it is sufficient that the processing of two instructions S2 and S3 in the scalar objects has been completed. It is unnecessary to load completely all the address information.
Since the scalar processing unit and the vector processing unit can be operated in parallel, it will be obvious that the processing can be carried out with high efficiency, if the scalar instruction for loading the address information can be processed in a proper synchronism with the vector instruction which utilizes the address information.

(2) When the result of computation performed by the vector processing unit is to be referred to by the scalar processing unit, it is necessary for the scalar processing unit to check whether the result has been loaded. However, the scalar processing unit is capable of carrying out only the check as to whether the vector processing unit is operating or in the idle state. Accordingly, in the case of the hitherto known system shown in Figs. 4a and 4b, even when the result of the summing operation for the array A has been determined through execution of the vector instruction V4, the scalar processing unit is not allowed to refer to the result so long as all the vector instructions V5 to V8 have not been completely executed.

[0033] As will be appreciated from the above elucidation, the synchronous communication means effective between the scalar processing unit and the vector processing unit shown in Fig. 2 can neither establish the synchronization nor perform the communication between both the units during the period from the start of the vector processing to the complete end thereof.

[0034] With the present invention, it is contemplated to provide means for controlling finely the synchronization and communication among a plurality of processors bearing the master and slave relation to one another, to thereby realize the parallel processings with enhanced efficiency.

[0035] The hitherto known synchronizing means is so arranged that the master processor activates the processing in the slave processor and checks whether execution of the activated processing has been completed or not in the slave processor. In contrast, according to the present invention, there can be realized the function for temporarily stopping the activation of a new instruction on the side of the slave processor until a certain event occurs in the master processor, the function of issuing an indication of occurrence of a certain event in the slave processor, and the function of testing the indication by the master processor. Further, by providing the indication of activation of instruction being temporarily stopped in the slave processor as well as the indication of the occurrence of an event in the slave processor in the program status word (PSW) on the side of the slave processor, the synchronizing means can be implemented in a convenient manner. Besides, at the time of task switching, the synchronization control information is saved and restored together with the PSW.

[0036] Now, the invention will be described in detail in conjunction with exemplary embodiments thereof. The processor system referred to in the following description of the embodiment is the same as the one shown in Fig. 1. In the processor system shown in Fig. 1, the scalar processing unit corresponds to the master processor, and the vector processing unit corresponds to the slave processor.

[0037] According to the illustrated embodiment, two information bits are added to a vector processing program status word (hereinafter abbreviated to VPPSW), two instructions are added to those dealt with by the scalar processing unit, and two instructions are added to those executed by the vector processing unit. Accordingly, the VPPSW and the added instructions will be briefly explained, followed by the description of an example of the synchronization control with the aid of the status word and the instructions.

[0038] The program status word or PSW is used in most of the conventional processors for holding concentratedly the important information concerning the operation state of the processor, the address of the succeeding instruction and others. In this connection, it is noted that in many instances, the PSW usually includes unused bits. Accordingly, in many cases, these unused bits may be used for the two bits which are added according to the teaching of the invention. In the case of the vector processing unit shown in Fig. 1, the PSW is present. The PSW for the vector processing unit is referred to as VPPSW. Since the details of the format for the VPPSW bears no direct relation to the present invention, description thereof will be unnecessary. According to the teaching of the invention, the VPPSW is added with two bits mentioned below.

(1) Pause Bit (referred to as P-bit in abbreviation)
When this bit is "1", initiation of the processing of a new instruction is temporarily stopped. When this bit becomes "0", the temporary stop or pause state is cleared. It should be mentioned that this bit only inhibits temporarily the start of execution of a new instruction and exerts no influence on any instruction whose execution has already been started.

(2) Signal Bit (referred to as S-bit in abbreviation)
This bit assumes "1" when the processing of a designated instruction has been completed in the vector processing unit. Usage of this bit will be described later on.

[0039] Next, two instructions additionally employed in the vector processing unit will be elucidated below.

(1) VP Pause Instruction (referred to as VPPAU instruction)
When this instruction is executed, the P-bit of the VPPSW (vector processing program status word) assumes "1", whereupon the processing for issuing the instruction succeeding to this VPPAU instruction is stopped temporarily.

(2) Vector Signal Instruction (referred to as VSIG instruction)
Upon completion of the processing of an instruction succeeding to this VSIG instruction, the S-bit of the VPPSW assumes "1".

[0040] Next, two instructions additionally employed in the scalar processing unit will be described.

(1) Resume Vector Processing Instruction (referred to as RSMVP)
This instruction tests the value of the P-bit of the VPPSW. When the value of the P-bit is "1", the P-bit is reset to "0", whereupon activation of instructions in the vector processing unit is released from the pause state (temporary stop state). When the value of the P-bit is "0", execution of this instruction is held until the P-bit assumes "1". Processing of this instruction is terminated by resetting the P-bit from "1" to "0".

(2) Test and Reset S-Bit Instruction (referred to as TRS)
With this instruction, the S-bit of the VPPSW is tested. When the value of the S-bit is "1", the processing of this TRS instruction is terminated by resetting the S-bit to "0". When the S-bit is "0", execution of this TRS instruction is held until the S-bit assumes the value "1". Then, by resetting the S-bit to "0", execution of this instruction is terminated.
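The blocking behaviour of the four added instructions can be sketched in software. The following Python model is only an illustration under stated assumptions: the class `VPPSW`, its method names and the use of `threading.Condition` are hypothetical stand-ins and are not part of the disclosed hardware. It shows that VPPAU paired with RSMVP behaves like a rendezvous on the P-bit, while VSIG paired with the test-and-reset of the S-bit behaves like a one-shot completion flag.

```python
import threading


class VPPSW:
    """Toy model of the two bits added to the VPPSW (illustrative only)."""

    def __init__(self):
        self.p = 0  # pause bit
        self.s = 0  # signal bit
        self._cond = threading.Condition()

    # Vector-side operations
    def vppau(self):
        """VPPAU: set P to 1, then hold issue of the next vector instruction until P is 0."""
        with self._cond:
            self.p = 1
            self._cond.notify_all()
            self._cond.wait_for(lambda: self.p == 0)

    def signal_completion(self):
        """Effect of VSIG once the instruction following it has completed: set S to 1."""
        with self._cond:
            self.s = 1
            self._cond.notify_all()

    # Scalar-side operations
    def rsmvp(self):
        """RSMVP: wait until P is 1, then reset it to 0."""
        with self._cond:
            self._cond.wait_for(lambda: self.p == 1)
            self.p = 0
            self._cond.notify_all()

    def test_and_reset_s(self):
        """Test-and-reset of the S-bit: wait until S is 1, then reset it to 0."""
        with self._cond:
            self._cond.wait_for(lambda: self.s == 1)
            self.s = 0


def run_demo():
    """Run one vector thread and one scalar thread and return the event log."""
    psw = VPPSW()
    log = []

    def vector_unit():
        psw.vppau()                 # pause until the scalar side is ready
        log.append("vector: load")  # can only happen after rsmvp()
        psw.signal_completion()     # the awaited result is now available

    def scalar_unit():
        log.append("scalar: address set")
        psw.rsmvp()                 # release the paused vector unit
        psw.test_and_reset_s()      # wait for the signalled completion
        log.append("scalar: read result")

    threads = [threading.Thread(target=vector_unit),
               threading.Thread(target=scalar_unit)]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    return log, psw.p, psw.s
```

Whichever thread happens to be scheduled first, the log order is the same, because each side blocks until its counterpart has arrived.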

[0041] Next, a description will be given of the manner in which the FORTRAN program illustrated in Fig. 3 is processed by making use of the synchronization means described above.

[0042] Figs. 5a and 5b illustrate object codes used in connection with the FORTRAN program shown in Fig. 3 when the synchronization means described above is employed according to the invention. The object codes include the scalar object codes and the vector object codes, as in the case of the hitherto known object code system illustrated in Figs. 4a and 4b. As will be seen from comparison of Figs. 4a and 4b with Figs. 5a and 5b, the object codes employed in association with the synchronization control means according to the invention bear close resemblance to those known heretofore. In particular, instructions which have the same scalar instruction ID or the same vector instruction ID added at the left are identical in both cases. In the case of the scalar object codes illustrated in Fig. 5a, the RSMVP instruction for the scalar instruction ID-S101, the RSMVP instruction for the scalar instruction ID-S103 and the TRS instruction for the scalar instruction ID-S102 are added, while the TVP instruction and the BC instruction for the scalar instructions ID-S13 and ID-S14, respectively, are deleted. On the other hand, in the case of the vector object codes shown in Fig. 5b, the VPPAU instruction for the vector instruction ID-V101, the VSIG instruction for the vector instruction ID-V102 and the VPPAU instruction for the vector instruction ID-V103 are added. In the following, functions realized by these added instructions will be described in detail.

(1) EXVP instruction for the scalar instruction ID-S12
This instruction is not issued after completion of all the vector processing preparations such as data loading in the address registers and others, but issued at the time point when the setting of the address information for the array B has been completed and when execution of the instruction VL for the vector instruction ID-V1 can be initiated.

(2) RSMVP instruction for the scalar instruction ID-S101
This instruction commands clearing of the temporary pause of instruction activation in the vector processing unit at the time point when the setting of address information for the array C has been completed in succession to the activation of vector processing in response to the EXVP instruction for the scalar instruction ID-S12. In the case of the vector object codes, the VPPAU instruction for the vector instruction ID-V101 is issued in precedence to the VL instruction for the vector instruction ID-V2 which uses the address information of array C, whereby the initiation of execution of the VL instruction for the vector instruction ID-V2 is temporarily stopped. The RSMVP instruction for the scalar instruction ID-S101 corresponds to the VPPAU instruction for the vector instruction ID-V101 and functions to clear the temporary stop of activation of the VL instruction for the vector instruction ID-V2 upon completed setting of the address information for the array C.
It should be noted that regardless of which of the RSMVP instruction for the scalar instruction ID-S101 and the VPPAU instruction for the vector instruction ID-V101 is executed in precedence, there arises no problem since the scalar processing unit and the vector processing unit operate in parallel independent of each other. More specifically, when the VPPAU instruction for the vector instruction ID-V101 is executed in precedence, the vector processing unit is set to the stand-by state until the RSMVP instruction for the scalar instruction ID-S101 is issued in the scalar processing unit. Conversely, when the RSMVP instruction for the scalar instruction ID-S101 is executed earlier, the scalar processing unit is set to the stand-by state until the VPPAU instruction for the vector instruction ID-V101 is issued in the vector processing unit.

(3) RSMVP instruction for the scalar instruction ID-S103
This instruction commands the clearing of the temporary stop or pause of instruction activation in the vector processing unit at the time point when the setting of address information for the arrays A, Q and P has been completed after the RSMVP instruction for the scalar instruction ID-S101 was issued. In the case of the vector object codes, the VPPAU instruction for the vector instruction ID-V103 is issued in precedence to the VST instruction for the vector instruction ID-V5 which uses the address information for the array A, whereby initiation of execution of the VST instruction for the vector instruction ID-V5 is temporarily stopped. The RSMVP instruction for the scalar instruction ID-S103 corresponds to the VPPAU instruction for the vector instruction ID-V103 and serves to clear the temporary stop of instruction succeeding to the vector instruction ID-V5.
It should be mentioned here that several RSMVP instructions may be inserted between the RSMVP instruction for the scalar instruction ID-S101 and the RSMVP instruction for the scalar instruction ID-S103 (with corresponding VPPAU instructions being inserted between the vector object codes) for realizing finer synchronization between the scalar processing unit and the vector processing unit. However, since processing of the vector instructions V2, V3 and V4, which require a longer time for execution than the scalar instructions, is started in response to the clearing of the pause in activation of the vector instructions by the RSMVP instruction for the scalar instruction ID-S101, it is considered that enough time is available for processing the scalar instructions S6 to S11 in the meantime. Accordingly, the arrangement illustrated in Fig. 5a is adopted.

(4) TRS instruction for scalar instruction ID-S102
This TRS instruction waits until completion of writing of the result of the vector summing operation into the scalar register at the zeroth address is indicated, in order to allow the result of the vector summing operation placed in the scalar register at the zeroth address to be referred to after execution of the MVFS instruction for the scalar instruction ID-S15. This TRS instruction corresponds to the VSIG instruction for the vector instruction ID-V102 among the vector object codes. Indication of completion of the writing operation is made with the aid of the S-bit of the VPPSW, as described hereinbefore.

(5) VPPAU instruction for vector instruction ID-V101
This instruction serves to stop temporarily the processing for activating the vector instructions until the address information for the array C has been set in the VBR and VIR at the respective zeroth addresses, for allowing the execution of the VL instruction for the vector instruction ID-V2. This temporary stop of activation is indicated by setting the P-bit by the VPPAU instruction to "1". This instruction VPPAU corresponds to the RSMVP instruction for the scalar instruction ID-S101 among the scalar object codes.

(6) VSIG instruction for vector instruction ID-V102
This instruction VSIG serves to set the S-bit of the VPPSW upon completed execution of the VSM instruction (vector summing operation) for the vector instruction ID-V4 which succeeds to this VSIG instruction. Due to provision of this instruction, the result of execution of the VSM instruction can be referred to as early as possible without being subjected to the influence of the other instructions.

(7) VPPAU instruction for vector instruction ID-V103
This VPPAU instruction serves to stop temporarily activation of the vector instruction until address information for the arrays A, Q and P has been completely set, for allowing the VST instruction or VL instruction succeeding to the vector instruction ID-V103 to be executed. This VPPAU instruction corresponds to the RSMVP instruction for the scalar instruction ID-S103 among the scalar object codes.

[0043] Now, description will be made concerning the efficiency attained with the object codes employed in association with the synchronization control means according to the invention (illustrated in Figs. 5a and 5b) in comparison with the hitherto known object codes illustrated in Figs. 4a and 4b with the aid of time charts.

[0044] Fig. 6 shows a time chart corresponding to the hitherto known object codes illustrated in Figs. 4a and 4b, and Fig. 7 shows a time chart corresponding to the object codes utilizing the synchronization control means according to the present invention (shown in Figs. 5a and 5b). In the time charts shown in Figs. 6 and 7, the order or sequence in which instructions are decoded or issued is taken along the ordinate while the time base is taken along the abscissa in terms of the number of machine cycles. The order in which instructions are decoded is shown in terms of the scalar instruction IDs and the vector instruction IDs illustrated in Figs. 4a and 4b or Figs. 5a and 5b, wherein the upper half is allocated to the scalar object codes and the lower half to the vector object codes. Preparation of the time charts is based on the assumptions listed below.

(1) The pipeline processing pitch in the vector processing unit is one cycle.

(2) Although the time taken for the first data to pass through the pipe (often referred to as travel time) in the vector processing varies in dependence on the types of operations, it is assumed that the travel time is ten cycles uniformly.

(3) The number of times the DO-loop in the FORTRAN program shown in Fig. 3 is executed is one hundred. With a single vector operation, one hundred elements are processed. Accordingly, from the assumptions (1) and (2), the time taken for processing one vector instruction amounts to 110 cycles in total, which is the sum of the 10 cycles taken for obtaining the first result and the subsequent 100 cycles taken for obtaining the one hundred results successively.

(4) The time taken for executing the scalar instruction is assumed to be two cycles uniformly.

(5) The time pitch in decoding the scalar instruction or the vector instruction as well as the time pitch in issuing these instructions, respectively, is assumed to be two cycles.

(6) Many vector processors adopt a speeding-up technique referred to as chaining. For the details of this technique, reference may be made to GB-A-2 113 878. Among the individual instructions of the vector object codes shown in Figs. 4b and 5b, the chaining can be realized between the vector instruction ID-V1 or ID-V2 and the vector instruction ID-V3, between the vector instructions ID-V6 and ID-V7, and between the vector instructions ID-V7 and ID-V8.
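The per-instruction timing implied by assumptions (1) to (5) can be written out as a small calculation. The function and constant names below are hypothetical and chosen only for illustration; chaining under assumption (6) is deliberately left out, so the figure applies to a single unchained vector instruction.

```python
def vector_instruction_cycles(elements, travel_time=10, pitch=1):
    """Cycles for one unchained vector instruction.

    travel_time: cycles until the first result leaves the pipe (assumption 2)
    pitch:       cycles between successive results (assumption 1)
    elements:    number of vector elements processed (assumption 3)
    """
    return travel_time + elements * pitch


SCALAR_INSTRUCTION_CYCLES = 2  # assumption (4)
ISSUE_PITCH_CYCLES = 2         # assumption (5), decode and issue pitch

cycles_per_vector_instruction = vector_instruction_cycles(100)
print(cycles_per_vector_instruction)  # 110
```

With one hundred elements, a vector instruction occupies the time of 55 scalar instructions, which is why overlapping scalar preparation with vector execution pays off in the comparison that follows.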

[0045] Although the assumptions enumerated above do not reflect with accuracy the actual parameters of the processor concerned, they reflect the characteristics of vector processors in general and are reasonably adequate for explaining the effects accomplished with the present invention.

[0046] From the comparison of the time chart shown in Fig. 6 with the one shown in Fig. 7, the following differences can be seen.

(1) The total processing time is 272 cycles in the case of the conventional technique illustrated in Fig. 6. In contrast, according to the invention, the processing time is shortened to 248 cycles as can be seen in Fig. 7. This difference in the processing time can be explained by the fact that in contrast to the prior art technique in which the vector processing can be initiated only after all the preparatory processings for the vector processing have been completed, the present invention allows the vector processing to be initiated at an earlier time-point when only a part of the preparatory processing for the vector processing has been completed.

(2) According to the prior art technique illustrated in Fig. 6, the scalar processing unit must wait for completion of the vector processing for a period which amounts to as long as 239 cycles. This period includes the time taken for processing the vector instructions V5 to V8 for which the scalar processing unit need not await the completion of processing in actuality. In contrast, the corresponding stand-by time of the scalar processing unit is reduced down to 105 cycles in the case of the technique illustrated in Fig. 7.

(3) In the case of the technique illustrated in Fig. 6, the result of the summing operation performed by the vector processing unit can be derived at the 272nd cycle by having the scalar processing unit execute the scalar instruction ID-S15. In contrast, according to the invention, the corresponding result can be extracted as early as the 140th cycle.
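The three differences can be put side by side as plain arithmetic on the cycle counts quoted above. The dictionary keys and the helper `savings` are hypothetical names used only for this illustration; the cycle counts themselves are the ones stated for Figs. 6 and 7.

```python
# Cycle counts quoted for the conventional technique (Fig. 6) and the invention (Fig. 7)
FIG6 = {"total": 272, "scalar_wait": 239, "sum_available": 272}
FIG7 = {"total": 248, "scalar_wait": 105, "sum_available": 140}


def savings(before, after):
    """Cycles saved for each quantity, as before minus after."""
    return {key: before[key] - after[key] for key in before}


saved = savings(FIG6, FIG7)
for key, cycles in saved.items():
    percent = 100.0 * cycles / FIG6[key]
    print(f"{key}: {cycles} cycles saved ({percent:.1f}% of the Fig. 6 value)")
```

The total run time shrinks by 24 cycles, whereas the scalar stand-by time and the availability of the summing result improve by 134 and 132 cycles respectively, so most of the benefit lies in freeing the scalar processing unit earlier.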

[0047] Next, control logic for implementing the synchronization control means according to the present invention will be described. The control logic is not of a large scale but can be realized by some logic circuits added to the scalar instruction controller 34 and the vector instruction execution controller 44 in the vector processor shown in Fig. 1.

[0048] Fig. 8 is a view showing the control logic configuration for implementing the synchronization control means according to the present invention. In Fig. 8, reference numerals 34 and 44 denote a scalar instruction controller and a vector instruction execution controller which are equivalent to those shown in Fig. 1, respectively. At first, description will be directed to the internal structure of the scalar instruction controller 34. A numeral 301 denotes a scalar instruction register for fetching and holding the incoming scalar instruction transmitted over a signal line 310. A numeral 302 denotes a scalar instruction decoding circuit for decoding the scalar instruction held by the scalar instruction register 301. A numeral 303 denotes a scalar instruction activating logic for supplying an activating signal on the basis of the decoded result of the scalar instruction decoding circuit 302 to the functional unit, registers and others participating in the processing of the instruction through a group of signal lines 318. A signal of logic "1" appears on the signal line 311 when the RSMVP instruction is decoded, while a signal of logic "1" is produced on the signal line 312 upon decoding of the TRS instruction. The portion performing the instruction executing processing inclusive of the processing mentioned above on the basis of the decoded information supplied from the instruction decoding circuit 302 constitutes an executing portion.

[0049] Next, the internal structure of the vector instruction execution controller 44 will be described. A reference numeral 401 denotes a vector instruction register for fetching and holding the incoming vector instruction transmitted over a signal line 410. A numeral 402 denotes a vector instruction decoding circuit for decoding the vector instruction held in the vector instruction register 401. A numeral 403 denotes a vector instruction activation deciding logic having functions mentioned below. The logic 403 centrally manages the states of the use of the functional units, vector registers and others in the vector processing unit, checks the decoded information of instruction inclusive of information identifying the vector register used for executing the instruction fed from the vector instruction decoding circuit 402 as well as other information, and makes decision as to whether the vector instruction in concern can be activated or not. When the decision results in that the activation is permitted, an activation signal is produced to be supplied through the signal lines 411 to the functional unit, the vector registers and others which participate in the processing of the above mentioned instruction. The signal lines 412 serve to transmit a message that the vector instruction under execution has been ended. Information including the data identifying the vector register used for executing the instruction is transmitted through these signal lines 412 to be inputted to the vector instruction activation deciding logic 403, whereby the information concerning the functional unit, vector register and others used in executing the completed vector instruction is altered. There is produced on the signal line 413 a signal of logic "1" upon decoding of the VPPAU instruction, while the signal of logic "1" is produced on the signal line 414 upon decoding of the VSIG instruction. Numerals 450 and 451 denote registers, respectively. 
The register 450 is set when the VSIG instruction is decoded to thereby produce logic "1" on the signal line 414. The register 450 in the ON state commands that the information outputted from the vector instruction decoding circuit 402 concerning the instruction decoded subsequently is to be placed in the register 451. Thus, the information concerning the instruction succeeding to the VSIG instruction is held by the register 451. A numeral 452 denotes a comparison circuit for comparing the information concerning the instruction succeeding to the VSIG instruction and held by the register 451 with the information supplied through the signal lines 412 concerning the instruction of which execution has been completed. When the information supplied through the signal lines 412 concerns the instruction held in the register 451, the comparison circuit 452 produces logic "1" on the signal line 415. More specifically, upon completion of execution of the instruction succeeding to the VSIG instruction, the logic "1" signal is produced on the signal line 415 to signal this fact. A numeral 453 denotes a vector processing program status word (VPPSW in abbreviation) for holding the status of the program being processed in the vector processing unit. According to the invention, the P-bit and S-bit mentioned hereinbefore are added. The outputted P-bit is inputted to the vector instruction activation deciding logic 403. When the P-bit is "1", this is informed to the vector instruction activation deciding logic 403, whereby activation of the vector instruction is inhibited. A numeral 454 denotes a P-bit registration control logic, and a numeral 455 denotes an S-bit registration control logic. The portion for performing the instruction executing processing inclusive of the above mentioned processing on the basis of the decoded information transferred from the instruction decoding circuit 402 constitutes an executing section.
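The interplay of the registers 450 and 451 with the comparison circuit 452 can be illustrated by a behavioural sketch. Everything in the following Python model is an assumption made for illustration: the class name `VsigTracker`, the use of a simple instruction identifier in place of the decoded information (which in the text includes the vector register identification), and the clearing of the registers 450 and 451 after use, which the description does not spell out.

```python
class VsigTracker:
    """Behavioural sketch of registers 450/451 and comparison circuit 452."""

    def __init__(self):
        self.reg450 = False  # set when a VSIG instruction has been decoded
        self.reg451 = None   # information on the instruction following VSIG

    def decode(self, instr_id, is_vsig=False):
        """Model the decoding of one vector instruction."""
        if is_vsig:
            self.reg450 = True        # line 414 at logic "1" sets register 450
        elif self.reg450:
            self.reg451 = instr_id    # latch the instruction following VSIG
            self.reg450 = False       # assumption: cleared once the latch is done

    def complete(self, instr_id):
        """Model a completion report on lines 412; return the level of line 415."""
        hit = self.reg451 is not None and self.reg451 == instr_id
        if hit:
            self.reg451 = None        # assumption: one signal per VSIG
        return hit


tracker = VsigTracker()
tracker.decode("V3")
tracker.decode("V102", is_vsig=True)  # the VSIG instruction itself
tracker.decode("V4")                  # the instruction whose completion is awaited
line415_levels = [tracker.complete("V3"), tracker.complete("V4"), tracker.complete("V4")]
```

Only the completion of the instruction decoded directly after VSIG raises line 415, so completions of earlier instructions, here `V3`, do not set the S-bit.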

[0050] Next, functions of the control logics described above will be elucidated in conjunction with the processing of the scalar instructions RSMVP and TRS as well as the vector instructions VPPAU and VSIG.

(1) Function of the P-bit registration control logic 454
When the VPPAU instruction is decoded by the vector processing unit, this is informed to the P-bit registration control logic 454 by way of the signal line 413. In response, the P-bit is set to "1".
When the RSMVP instruction is decoded by the scalar processing unit, this is informed to the logic 454 through the signal line 311. If the P-bit has the value of "1" at that time, the P-bit is reset to "0". In that case, the signal line 416 is held in the OFF state. Consequently, the subsequent instruction is not suspended from activation in the instruction activating logic 303. The instruction activation deciding logic 403 releases the vector instruction from the activation inhibiting state in response to the resetting of the P-bit to "0". On the other hand, when the value of the P-bit is "0", the signal line 416 is set to the ON state, and this is informed to the scalar instruction activating logic 303 for suspending temporarily the activation of the scalar instruction (i.e. the RSMVP instruction). Subsequently, when the VPPAU instruction is decoded in the vector processing unit and the corresponding message is issued, the signal line 416 is set to the OFF state to release the scalar instruction from the suspended state. At this time, the value of P-bit is left to be "0".
When both the RSMVP instruction and the VPPAU instruction are simultaneously decoded, the P-bit is reset to "0". Neither the scalar instruction activating logic 303 nor the vector instruction activation deciding logic 403 suspends the instruction from activation.

(2) Function of the S-bit registration control logic 455
Upon completion of the processing of the instruction succeeding to the VSIG instruction having been processed in the vector processing unit, this is informed to the registration control logic 455 by way of the signal line 415. In response, the registration control logic 455 sets the S-bit to "1". When the TRS instruction is decoded in the scalar processing unit, corresponding information is given to the logic 455 through the signal line 312. When the value of S-bit is "1" at that time, the S-bit is reset to "0". In that case, the signal line 417 is held in the OFF state, whereby the activation of the succeeding instruction is not suspended in the scalar instruction activating logic 303. On the other hand, the vector instruction activation deciding logic 403 performs operation for activating the succeeding instruction regardless of the value assumed by the S-bit. When the value of S-bit is "0", the signal line 417 is turned on, and this status is informed to the scalar instruction activating logic 303 to thereby temporarily suspend the activation of the scalar instruction (i.e. the TRS instruction). Subsequently, the processing of the instruction succeeding to the VSIG instruction comes to an end, in response to which the signal line 417 is set to the OFF state, whereby the scalar instruction is released from the activation suspended state. At this time, the value of S-bit is left to be "0".
When the TRS instruction occurs simultaneously with the completion of execution of the instruction succeeding to the VSIG instruction, the S-bit is reset to "0". In the scalar instruction activating logic 303, the activation of the succeeding instruction is prevented from being suspended.
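Since the two registration control logics behave symmetrically, their three cases (setter first, tester first, both at once) can be captured in one small state machine. The following Python sketch is an illustrative assumption, not the disclosed circuit: the class `BitRegistrationLogic` and its `clock` method are hypothetical, `set_event` stands for the line 413 (P-bit) or 415 (S-bit), `test_event` for the line 311 or 312, and `suspend` for the line 416 or 417.

```python
class BitRegistrationLogic:
    """Sketch of logic 454 (P-bit, line 416) or logic 455 (S-bit, line 417)."""

    def __init__(self):
        self.bit = 0          # P-bit or S-bit of the VPPSW
        self.suspend = False  # ON while the scalar instruction is held

    def clock(self, set_event=False, test_event=False):
        """Apply the events of one cycle and return (bit, suspend)."""
        if set_event and (test_event or self.suspend):
            # The setter meets a simultaneous or already waiting tester:
            # the bit ends at 0 and the scalar side is not (or no longer) held.
            # A set arriving while the bit is already 1 is not covered by the text.
            self.bit = 0
            self.suspend = False
        elif set_event:
            self.bit = 1              # nobody is waiting yet: record the event
        elif test_event:
            if self.bit == 1:
                self.bit = 0          # consume the event, no suspension
            else:
                self.suspend = True   # hold the scalar instruction
        return self.bit, self.suspend


def trace(events):
    """Feed a list of (set_event, test_event) pairs into a fresh logic block."""
    logic = BitRegistrationLogic()
    return [logic.clock(set_event, test_event) for set_event, test_event in events]


setter_first = trace([(True, False), (False, True)])
tester_first = trace([(False, True), (True, False)])
simultaneous = trace([(True, True)])
```

In all three orderings the bit finishes at "0" with the suspension line OFF; the orderings differ only in which unit, if any, had to wait in between.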

[0051] With the circuit arrangement described above with reference to Fig. 8, the concept of synchronization control according to the invention can be realized.

[0052] In the case of the illustrated embodiments, discussion has been made in conjunction with the vector processor. However, it should be understood that the application of the present invention is not restricted to the vector processor; the invention can be applied to synchronization control between processors of any type so far as they are in the master-slave relation, such as between the scalar processing unit and an array processing unit, by way of example.

Claims

1. A multi-processor system comprising a main storage (1) for storing instructions and data, a master processor (3) for supplying to a slave processor (4) data required for the processing to be executed by said slave processor and having a function to test operation state of said slave processor, and at least one slave processor (4) for initiating processing in response to a command issued by said master processor (3) and having a function to inform said master processor of completion of the processing:
said slave processor (4) including: an instruction register (401) for holding an instruction read out from said main storage (1); decoding means (402) for decoding the instruction held by said instruction register (401); executing means (41, 42) for executing the instruction in accordance with the result of the decoding performed by said decoding means (402); and indication means (453, 454) for indicating operation state of said slave processor; wherein said decoding means (402) is provided with means (413) for supplying a decoded output setting at said indication means a pause indication (P) indicating that said slave processor is in the pause state when a pause instruction (VPPAU) commanding said pause is held in said instruction register (401), and wherein said slave processor is inhibited from performing activation processing for a succeeding instruction so long as said pause indication is issued by said indication means; and
said master processor including: an instruction register (301) for holding an instruction read from said main storage (1); decoding means (302) for decoding the instruction held by said instruction register (301); and executing means (32, 33) for executing the instruction in accordance with the result of the decoding performed by said decoding means (302); wherein said decoding means (302) is provided with means (311) for supplying a decoded output when a pause clearing instruction (RSMVP) is held by said instruction register (301), for supplying to said slave processor a command for releasing said slave processor from the pause state, said slave processor including means (454) for responding to said command for releasing from the pause state supplied from said master processor for resetting said pause indication (P) when a pause indication has been set in said indication means (311, 312), for supplying to said master processor a completion signal indicating that said pause indication is reset and for executing the processing of the subsequent instruction, while supplying to said master processor an incompletion signal when said pause indication is not set, said master processor responding to said incompletion signal to thereby suspend activation of the succeeding instruction until said completion signal is received.

2. A multi-processor system according to Claim 1, wherein said indication means of said slave processor uses a program status word (453) thereof for indication.

3. A multi-processor system comprising a main storage (1) for storing instructions and data, a master processor (3) for supplying to a slave processor (4) data required for the processing to be executed by said slave processor, said master processor (3) having a function to test the operation state of said slave processor (4) and executing processing by utilizing the result of the processing executed by said slave processor, and at least one slave processor (4) for initiating processing in response to a command issued by said master processor, said slave processor having a function to inform said master processor of completion of the processing:
said slave processor (4, 44) including: an instruction register (401) for holding an instruction read out from said main storage (1); decoding means (402) for decoding the instruction held by said instruction register (401); executing means (41, 42) for executing the instruction in accordance with the result of the decoding performed by said decoding means (402); and indication means (453, 455, 417) for indicating an operation state of said slave processor; wherein said decoding means (402) is provided with means (414) for supplying a decoded output when an indication instruction (VSIG) is held by said instruction register (401), said executing means being provided with storage means (450) for storing temporarily an indication command in response to said decoded output and execution completion identifying means (455, 451, 452) enabled by said indication command in said storage means for identifying completion of execution of the instruction succeeding to said indication instruction to thereby set an indication (S) of execution completion in said indication means (453); and
said master processor (3, 34) including an instruction register (301) for holding an instruction read out from said main storage (1); decoding means (302) for decoding the instruction held by said instruction register (301); and executing means (32, 33) for executing the instruction in accordance with the result of the decoding performed by said decoding means; wherein said decoding means (302) is provided with means (312) for supplying a decoded output when an indication resetting instruction (TRS) is held in said instruction register, for sending to said slave processor an indication clearing command, said slave processor being provided with means (455) responsive to said indication clearing command supplied from said master processor for resetting said indication of execution completion (S) when it is set in said indication means (453) to thereby supply to said master processor a completion signal (417) indicating that said indication of execution completion has been reset, while supplying to said master processor an incompletion signal when said indication of execution completion is not set, said master processor being responsive to said incompletion signal for suspending activation of the succeeding instruction until said completed signal is received.

4. A multi-processor system according to Claim 3, wherein said slave processor is a vector processor (4), said execution completion identifying means including second storing means (451) for storing the result of decoding of said succeeding instruction, and comparison means (452) for comparing the content stored in said second storing means with the information concerning the execution completion of said succeeding instruction to thereby identify the execution completion of said succeeding instruction.

5. A multi-processor system according to Claim 3, wherein said indication means of said slave processor uses a program status word (453) thereof for indication.

Revendications

1. A multi-processor system comprising a main storage (1) for storing instructions and data, a master processor (3) for sending to a slave processor (4) data required for the processing to be executed by said slave processor and having a function of testing the operating state of said slave processor, and at least one slave processor (4) for initiating processing in response to a command issued by said master processor (3) and having a function of informing said master processor of the completion of the processing:
said slave processor (4) comprising: an instruction register (401) for holding an instruction read out from said main storage (1); decoding means (402) for decoding the instruction held in said instruction register (401); execution means (41, 42) for executing the instruction in accordance with the result of the decoding performed by said decoding means (402); and indication means (453, 454) for indicating the operating state of said slave processor; said decoding means (402) being provided with means (413) for sending a decoded output signal which sets, in said indication means, a pause indication (P) indicating that said slave processor is in the pause state, when a pause instruction (VPPAU) commanding said pause is held in said instruction register (401), and said slave processor being inhibited from executing the activation processing for a succeeding instruction as long as said pause indication is output by said indication means; and
said master processor including: an instruction register (301) for holding an instruction read out from said main storage (1); decoding means (302) for decoding the instruction held in said instruction register (301); and execution means (32, 33) for executing the instruction in accordance with the result of the decoding performed by said decoding means (302); said decoding means (302) being provided with means (311) for sending a decoded output signal when a pause clear instruction (RSMVP) is held in said instruction register (301), to send to said slave processor a command for releasing said slave processor from the pause state, said slave processor comprising means (454) responsive to said command for release from the pause state, sent by said master processor, to cancel said pause indication (P) when a pause indication has been set in said indication means (311, 312), to send to said master processor a completion signal indicating that said pause indication has been cleared, and to execute the processing of the succeeding instruction, and to send to said master processor a non-completion signal when said pause indication is not set, said master processor responding to said non-completion signal by suspending the activation of the succeeding instruction until said completion signal is received.

2. A multi-processor system according to Claim 1, wherein said indication means of said slave processor uses a program status word (453) of that processor for indication.

3. A multi-processor system comprising a main storage (1) for storing instructions and data, a master processor (3) for sending to a slave processor (4) data required for the processing to be executed by said slave processor, said master processor (3) having a function of testing the operating state of said slave processor (4) and executing processing by using the result of the processing executed by said slave processor, and at least one slave processor (4) for initiating processing in response to a command issued by said master processor, said slave processor having a function of informing said master processor of the completion of the processing:
said slave processor (4, 44) comprising: an instruction register (401) for holding an instruction read out from said main storage (1); decoding means (402) for decoding the instruction held in said instruction register (401); execution means (41, 42) for executing the instruction in accordance with the result of the decoding performed by said decoding means (402); and indication means (453, 455, 417) for indicating an operating state of said slave processor; said decoding means (402) being provided with means (414) for sending a decoded output signal when an indication instruction (VSIG) is held in said instruction register (401), said execution means being provided with storing means (450) for temporarily storing an indication command in response to said decoded output signal; and execution completion identifying means (455, 451, 452), enabled by said indication command in said storing means, for identifying the execution completion of the instruction succeeding said indication instruction to thereby set an indication (S) of execution completion in said indication means (453); and
said master processor (3, 34) comprising an instruction register (301) for holding an instruction read out from said main storage (1); decoding means (302) for decoding the instruction held in said instruction register (301); and execution means (32, 33) for executing the instruction in accordance with the result of the decoding performed by said decoding means; said decoding means (302) including means (312) for sending a decoded output signal when an indication reset instruction (TRS) is held in said instruction register, to send to said slave processor an indication clear command, said slave processor being provided with means (455) responsive to said indication clear command sent by said master processor to cancel said indication (S) of execution completion when that indication is set in said indication means (453), thereby sending to said master processor a completion signal (417) indicating that said indication of execution completion has been cleared, and to send to said master processor a non-completion signal when said indication of execution completion is not set, said master processor being adapted to respond to said non-completion signal by suspending the activation of the succeeding instruction until said completion signal is received.

4. A multi-processor system according to Claim 3, wherein said slave processor is a vector processor (4), said execution completion identifying means including second storing means (451) for storing the result of decoding of said succeeding instruction, and comparison means (452) for comparing the content stored in said second storing means with the information concerning the execution completion of said succeeding instruction so as to identify the execution completion of said succeeding instruction.

5. A multi-processor system according to Claim 3, wherein said indication means of said slave processor uses a program status word (453) for indication.
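
The indication mechanism of Claims 3 and 4 (VSIG on the slave side, TRS on the master side) can be sketched in the same spirit. The model below is an illustration under assumptions, not the claimed circuit: the class `SignallingSlave`, the function `master_trs`, the per-instruction latencies, the use of issue-order tags as the stored decode result, and the instruction names other than VSIG are hypothetical.

```python
class SignallingSlave:
    """Toy model: VSIG arms a latch; completion of the next instruction sets S."""

    def __init__(self, program):
        self.program = list(program)   # list of (name, latency) pairs
        self.next_tag = 0              # hypothetical issue-order tag
        self.vsig_latch = False        # models storing means (450)
        self.watched_tag = None        # models second storing means (451)
        self.s_flag = False            # models indication (S)
        self.in_flight = {}            # tag -> remaining cycles

    def tick(self):
        """Advance one cycle: retire finished work, then issue one instruction."""
        for tag in list(self.in_flight):
            self.in_flight[tag] -= 1
            if self.in_flight[tag] <= 0:
                del self.in_flight[tag]
                if tag == self.watched_tag:    # models comparison means (452)
                    self.s_flag = True
                    self.watched_tag = None
        if not self.program:
            return
        name, latency = self.program.pop(0)
        if name == "VSIG":
            self.vsig_latch = True             # remember the indication order
            return
        tag = self.next_tag
        self.next_tag += 1
        if self.vsig_latch:                    # instruction succeeding VSIG
            self.watched_tag = tag
            self.vsig_latch = False
        self.in_flight[tag] = latency

    def test_and_reset(self):
        """Answer an indication clear order: True = completion signal."""
        if self.s_flag:
            self.s_flag = False
            return True
        return False


def master_trs(slave, max_cycles=1000):
    """Model TRS: the master stalls until S has been set and then cleared.

    Returns the number of non-completion replies; the polling loop is an
    assumption made for simulation purposes.
    """
    stalls = 0
    while not slave.test_and_reset():
        stalls += 1
        if stalls > max_cycles:
            raise RuntimeError("indication S was never set")
        slave.tick()
    return stalls


if __name__ == "__main__":
    slave = SignallingSlave([("VLOAD", 2), ("VSIG", 0), ("VMUL", 3), ("VADD", 1)])
    print(master_trs(slave))
```

Here the shorter `VADD` finishes before the watched `VMUL`, yet S is set only when the instruction that followed VSIG completes, which is the point of comparing completion information against the stored tag.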

Ansprüche (claims, German text)

1. A multi-processor system with a main storage (1) for storing instructions and data, a master processor (3) for supplying to a slave processor (4) data required for the processing to be performed by the latter, the master processor (3) having a function of testing the operating status of the slave processor, and with at least one slave processor (4) for initiating processing in response to a command given by said master processor (3), the slave processor having a function of notifying the master processor of the completion of the processing,
wherein the slave processor (4) comprises:
an instruction register (401) for storing an instruction read out from the main storage (1);
a decoder (402) for decoding the instruction stored in the instruction register (401);
execution means (41, 42) for executing the instruction in accordance with the result of the decoding performed by the decoder (402); and
indication means (453, 454) for indicating the operating status of the slave processor;
wherein the decoder (402) is provided with means (413) for supplying a decoded output signal which, when a pause instruction (VPPAU) commanding a pause is stored in the instruction register (401), sets in said indication means a pause indication (P) indicating that the slave processor is in the pause status, and
wherein the slave processor is inhibited from performing the activation processing for a following instruction as long as the pause indication is output by the indication means; and
wherein the master processor comprises:
an instruction register (301) for storing an instruction read out from the main storage (1);
a decoder (302) for decoding the instruction stored in the instruction register (301); and
execution means (32, 33) for executing the instruction in accordance with the result of the decoding performed by the decoder (302);
wherein the decoder (302) is provided with means (311) for supplying a decoded output signal when a pause clear instruction (RSMVP) is stored in the instruction register (301), in order to supply to the slave processor a command for releasing the slave processor from the pause status;
wherein the slave processor has means (454) for responding to the command for release from the pause status supplied by the master processor so as, when a pause indication is set in the indication means (311, 312), to reset this pause indication (P) and to supply to the master processor a completion signal indicating that the pause indication has been reset, and so as, when the pause indication is not set, to perform the processing of the following instruction while a non-completion signal is supplied to the master processor,
wherein the master processor responds to the non-completion signal by suspending the activation of the following instruction until the completion signal is received.

2. A multi-processor system according to Claim 1, wherein the indication means of the slave processor uses a program status word (453) as the indication.

3. A multi-processor system with a main storage (1) for storing instructions and data, a master processor (3) for supplying to a slave processor (4) data required for the processing to be performed by the latter, the master processor (3) having a function of testing the operating status of the slave processor (4) and performing processing using the result of the processing performed by the slave processor, and with at least one slave processor (4) for initiating processing in response to a command given by the master processor, the slave processor having a function of notifying the master processor of the completion of the processing,
wherein the slave processor (4, 44) comprises:
an instruction register (401) for storing an instruction read out from the main storage (1);
a decoder (402) for decoding the instruction stored in the instruction register (401);
execution means (41, 42) for executing the instruction in accordance with the result of the decoding performed by the decoder (402); and
indication means (453, 455, 417) for indicating an operating status of the slave processor;
wherein the decoder (402) is provided with means (414) for supplying a decoded output signal when an indication instruction (VSIG) is stored in the instruction register (401),
wherein the execution means is provided with storing means (450) for temporarily storing an indication command in response to the decoded output signal, and with execution completion identifying means (451, 452, 455) which is enabled by the indication command in said storing means in order to identify the completion of execution of the instruction following the indication instruction and to set an indication (S) of execution completion in the indication means (453); and
wherein the master processor (3, 34) comprises:
an instruction register (301) for storing an instruction read from the main storage (1);
a decoder (302) for decoding the instruction stored in the instruction register (301); and
execution means (32, 33) for executing the instruction in accordance with the result of the decoding performed by the decoder;
wherein the decoder (302) is provided with means (312) for supplying a decoded output signal when an indication reset instruction (TRS) is stored in the instruction register, in order to send an indication clear command to the slave processor,
wherein the slave processor is provided with means (455) responsive to said indication clear command supplied by the master processor so as, when the indication (S) of execution completion is set in the indication means (453), to reset it and thereby supply to the master processor a completion signal (417) indicating that the indication of execution completion has been reset, and so as, when the indication of execution completion is not set, to supply a non-completion signal to the master processor, wherein the master processor responds to the non-completion signal by suspending the activation of the following instruction until the completion signal is received.

4. A multi-processor system according to Claim 3, wherein the slave processor is a vector processor (4), and the execution completion identifying means has second storing means (451) for storing the result of the decoding of the following instruction and comparison means (452) for comparing the content stored in the second storing means with the information concerning the execution completion of the following instruction, in order thereby to identify the execution completion of the following instruction.

5. A multi-processor system according to Claim 3, wherein the indication means of the slave processor uses a program status word (453) of the slave processor as the indication.

Drawing