BACKGROUND OF THE INVENTION
TECHNICAL FIELD
[0001] This invention relates in general to multi-processor architectures and, more particularly,
to a shared translation lookaside buffer in a multi-processor architecture.
DESCRIPTION OF THE RELATED ART
[0002] Many new electronic devices make use of a multi-processor environment that includes
DSPs (digital signal processors), MPUs (microprocessor units), DMA (direct memory
access) processors, and shared memories.
[0003] The types of tasks performed by a device often have specific real-time constraints
due to the signals that they are processing. For example, DSPs are commonly used in
devices where video and audio processing and voice recognition are supported. These
functions can be significantly degraded if part of the multi-processor system must
suspend processing while waiting for an event to occur. Performing memory address
translations from the virtual addresses used by a task to the physical addresses necessary to
access the physical memory can be time-consuming and degrade performance for a real-time
task. To reduce the latencies caused by address translation, a TLB (translation lookaside
buffer) is commonly provided as part of a MMU (memory management unit). The translation
lookaside buffer caches recently accessed memory locations. At the beginning of a
memory access, the TLB is accessed. When the TLB does not contain the information
corresponding to the current access (i.e., a TLB "miss" or "page fault"), the
information must be retrieved from tables ("table walking"),
located in main memory. This operation takes tens to hundreds of microprocessor cycles.
While the MMU is walking the tables, the operation of the core is blocked, resulting
in degraded or errant performance of the processor.
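The translation path described above can be sketched as follows (an illustrative Python model only, not part of the disclosure; the 4 KB page size, the dictionary-based TLB, and the table names are assumptions):

```python
PAGE_SHIFT = 12  # assume 4 KB pages for illustration

def translate(vaddr, tlb, page_table):
    """Return the physical address for vaddr, filling the TLB on a miss."""
    vpn = vaddr >> PAGE_SHIFT                 # virtual page number
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)  # offset within the page
    if vpn in tlb:                            # TLB hit: fast path
        pfn = tlb[vpn]
    else:                                     # TLB miss: slow "table walk",
        pfn = page_table[vpn]                 # tens to hundreds of cycles in hardware
        tlb[vpn] = pfn                        # cache the translation for next time
    return (pfn << PAGE_SHIFT) | offset
```

In hardware the table walk blocks the core for the duration noted above; the model captures only the hit/miss control flow.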
[0004] In a multiprocessor system, several separate processing devices may be performing
virtual address translation in order to access the physical memory. In one solution
shown in Figure 1, of the type used by the ARM MEMC2, a multiprocessor device 10 uses
a shared TLB 12 accessible by multiple processing devices 14 (individually referenced
as processing devices 14a-c). Each processing device 14 has a unique requester identifier
that is concatenated to a virtual address to form a modified virtual address. The
concatenation is performed in order to present unique virtual addresses to the shared
TLB 12, since the virtual address range used by the various processors that access
the shared TLB 12 may otherwise overlap, thereby presenting a possibility that the
wrong physical address may be retrieved from the TLB 12.
[0005] If there is a miss in the shared TLB 12, the virtual address is translated to a physical
address through translation tables 16 (individually referenced as translation tables
16a-c in Figure 1) in the physical memory 18. The requester identifier in the concatenated
address provides a starting base address in the external memory's translation table
section. Thus, each potential requester has its own translation table under this approach,
which is an inefficient use of memory and provides no flexibility in translating a
virtual address.
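The prior-art scheme of Figure 1 can be sketched as follows (an illustrative Python model; the 32-bit address width and all names are assumptions, not part of the referenced design):

```python
VA_BITS = 32  # assumed virtual address width

def modified_virtual_address(requester_id, vaddr):
    """Concatenate the unique requester identifier onto the virtual address to
    form the key presented to the shared TLB, keeping requesters' ranges disjoint."""
    return (requester_id << VA_BITS) | vaddr

def table_base(requester_id, bases):
    """On a miss, the requester identifier selects a per-requester translation
    table base address in external memory (one table per potential requester)."""
    return bases[requester_id]
```

The per-requester tables selected by `table_base` are precisely the memory-inefficient aspect criticized above.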
[0006] United States Patent No. 4481573 describes a virtual storage data processing system having an address translation
unit shared by a plurality of processors, located in a memory control unit connected
to a main memory. One of the processors is a job processor which accesses the main memory
with a virtual address to execute an instruction and includes a cache memory which
is accessed with a virtual address. Another processor is a file processor which accesses
the main memory with a virtual address to transfer data between the main memory and
an external memory. The cache memory receives the virtual address when the file processor
writes to the main memory and if it contains a data block corresponding to the virtual
address, it invalidates the corresponding data block. The address translation unit
translates the address differently for the access from the file processor and the
accesses from other processors.
[0007] European Patent Application No. 0215544 describes a hybrid software and hardware mechanism for updating a TLB. The TLB is
segmented into two hardware portions: a high speed primary TLB located at or near
the processor, and a slower secondary TLB located within the processor or in physically
addressed main memory. Virtual address translation is first attempted by the processor
via the primary TLB, and if the virtual address translation is not contained within
the primary TLB, hardware is used to fetch a new TLB entry from the secondary TLB.
If the virtual address translation is not contained within the secondary TLB, a processor
trap occurs to interrupt or halt the operation of the processor and software located
within the processor is used to transfer the desired entry to the secondary TLB or
to both the primary and secondary TLB.
[0008] European Patent Application No. 0382237 describes a multiprocessing system comprising a single translation lookaside buffer
partitioned into a plurality of buffer areas, and a plurality of directory registers
respectively corresponding to the buffer areas. Each register is partitioned into
a first field and a second field having bit positions respectively assigned to processors.
A directory controller selects one of the registers and writes a segmented virtual
space identifier of a requesting processor into the first field of the selected register
and a bit "1" into its second field in a position assigned to the requesting processor.
During a nonshared-access mode, the directory controller accesses a buffer area corresponding
to the selected register to load a copy of the translation tables from a main memory
into the buffer area. During a shared-access mode, the directory controller searches
all registers to detect a first one whose first field contains the segmented virtual
space identifier of the requesting processor and whose second field contains a bit
"1" in a position assigned to that processor and a second register whose first field
contains the same identifier as the first and whose second field contains a bit "1"
in a position assigned to a master processor to indicate the denial of access right.
The bit "1" of the first register is reset to zero and a bit "1" is set in the second
field of the second register in a position assigned to the requesting processor to
indicate the grant of access right. In a subsequent nonshared-access mode, one of
the bits "1" in the second register is reset in a position assigned to the requesting
processor and a copy of the translation tables is loaded into the buffer area corresponding
to the register containing instructions issued from the requesting processor.
[0009] European Patent Application No. 0642086 describes a technique for translating a virtual address to a physical address. The
virtual address to be translated has a virtual page offset and a virtual page number
and addresses a page of memory of unknown size. There are L different possible page
sizes where L is a positive integer greater than one. Each of the L different page
sizes is selected to be a test page size and a test is performed. During the test,
a pointer into a translation storage buffer is calculated. The pointer is calculated
from the virtual address to be translated by assuming that the virtual address to
be translated corresponds to a mapping of the test page size. The pointer points to
a candidate translation table entry of the translation storage buffer. The candidate
translation table entry has a candidate tag and candidate data. The candidate tag
identifies a particular virtual address and the candidate data identifies a particular
physical address corresponding to the particular virtual address. A virtual address
target tag is extracted from the virtual address to be translated. The virtual address
target tag is calculated by assuming that the virtual address to be translated corresponds
to a mapping of the test page size. The target tag and the candidate tag are then
compared. If the target tag matches the candidate tag, the candidate data is provided
as the physical address translation corresponding to the virtual address to be translated.
[0010] Accordingly, there is a need for a flexible method and circuit for translating virtual
addresses to physical addresses in a multiprocessor device.
BRIEF SUMMARY OF THE INVENTION
[0011] In the present invention, there is provided a multiprocessor system and a method
as set out in the appended claims.
[0012] The present invention provides significant advantages over the prior art. First,
the invention provides for flexible determination of a physical address. Second, the
invention does not require separate translation tables for each processing device
to be stored in the external memory.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING
[0013] For a more complete understanding of the present invention, and the advantages thereof,
reference is now made to the following descriptions taken in conjunction with the
accompanying drawings, in which:
Figure 1 is a general block diagram of a prior art multi-processor device;
Figure 2 illustrates a multiprocessor device that does not form part of the invention
but is useful for understanding thereof, using multiple operating systems, with enhanced
shared TLB capabilities;
Figure 3 illustrates a shared TLB subsystem for a multiple operating system device
that does not form part of the invention but is useful for understanding thereof;
Figure 4 illustrates a hardware translation control circuit that does not form part
of the invention but is useful for understanding thereof;
Figure 5 illustrates a software translation control system that does not form part
of the invention but is useful for understanding thereof;
Figure 6 illustrates a multiprocessor device that does not form part of the invention
but is useful for understanding thereof, using multiple operating systems, with a
shared TLB external to the L2 traffic controller;
Figure 7 illustrates a multiprocessor device, using a single operating system, with
enhanced shared TLB capabilities;
Figure 8 illustrates a shared TLB subsystem for a single operating system device;
and
Figure 9 illustrates a block diagram of an embodiment of a shared TLB subsystem with
selectable servicing of TLB miss conditions.
DETAILED DESCRIPTION OF THE INVENTION
[0014] The present invention is best understood in relation to Figures 1 - 9 of the drawings,
like numerals being used for like elements of the various drawings.
[0015] Figure 2 illustrates a block diagram of an example of a general multiprocessor system
20. The multiprocessor system 20 includes a plurality of processors 23 coupled to
a level two (L2) traffic control circuit 22. The processors 23 coupled to the L2 traffic
control circuit 22 include a DMA processor 24, a main microprocessor 26, which includes
a coprocessor 28 and peripherals 30 and operates under a first operating system (OS1),
a processor 32 (such as a DSP) and a coprocessor 34, coupled to processor 32. Processor
32 and coprocessor 34 operate under control of their own operating system, OSk. There
may be any number of processors coupled to the L2 traffic control circuit 22, and
each processor 23 may have its own operating system or it may be controlled by an
operating system shared by a number of processors 23.
[0016] Each processor 23 is shown having a micro-TLB ("µTLB") 36, which stores a small number
of entries, typically on the order of two to eight, used for translations
from logical to physical addresses. The µTLBs 36 may be used in some or all of the
processors coupled to the L2 traffic control circuit 22. While a single µTLB 36 is
shown in each processor 23, multiple µTLBs 36 may be used in one or more of the processors
23. This would allow, for example, a processor to have separate µTLBs 36 for data
and instructions.
[0017] Additionally, processor 26 and processor 32 are shown with core logic 38, local memory
40, and local caches 42. Naturally, the design of each processor will vary depending upon
its purpose, and some processors may have significantly different designs from those shown
in Figure 2. One or more of the processors shown in Figure 2 may be digital signal
processors, and may include specialized circuitry for specific tasks.
[0018] The L2 traffic control circuit is also coupled to shared peripherals 44, an L2 cache
subsystem 46, a shared TLB subsystem 48, and to level three (L3) traffic control circuit
50. L3 traffic control circuit 50 provides an interface to other system components,
such as an external DMA processor 52, external memory 54, and display control 56.
Display control 56 is coupled to frame memory 58 and an external display 60.
[0019] In operation, each processor operates using "virtual" addresses, which must be translated
when accessing the main memory, or any other memory that uses physical addresses,
such as a L1 or L2 cache. When a processor 23 having a µTLB 36 needs to access the
external memory 54 or the L2 cache 46, it first checks its own µTLB 36 to see if
the translation is currently stored in the µTLB 36. If so, the processor 23 retrieves
the physical address information from the µTLB 36 and uses the physical address information
to access the memory.
[0020] If there is a miss in the µTLB 36, the memory access request is forwarded to the
L2 shared TLB subsystem 48. The L2 shared TLB subsystem 48 maintains entries for multiple
processors 23, as shown in greater detail in Figure 3, discussed below. The L2 traffic
control 22 determines priority from multiple requests to the shared TLB subsystem
48 from the various processors 23.
[0021] If there is a hit in the L2 shared TLB subsystem 48, the physical address information
is retrieved from the L2 shared TLB subsystem 48 and is used to access the external
memory 54. If there is a miss in the L2 shared TLB subsystem 48, the translation tables
in the external memory are searched, as described in greater detail herein below.
[0022] Figure 3 illustrates the L2 shared TLB subsystem 48. The L2 shared TLB subsystem
48 includes a TLB control circuit 70, and a TLB memory 72. The TLB memory 72 stores
virtual address records 74 and corresponding descriptor records 76. The virtual address
records include a resource identifier (R_ID) field 78, a task identifier (Task_ID)
field 80, a lock bit (L) field 82, a virtual address field 84, a section/page (S/P)
field 86, and a valid bit (V) field 88. The descriptor records include a physical
address field 90 and an attributes field 92.
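The entry layout of Figure 3 can be modeled as follows (an illustrative Python sketch; the field types, the size encoding strings, and the attribute dictionary are assumptions, not part of the disclosure):

```python
from dataclasses import dataclass, field

@dataclass
class SharedTlbEntry:
    # Virtual address record (fields 78-88 of Figure 3)
    r_id: int          # resource identifier field 78
    task_id: int       # task identifier field 80
    lock: bool         # lock bit field 82: keeps the entry from being replaced
    vaddr: int         # virtual address field 84
    sp: str            # section/page size field 86, e.g. "1MB", "64KB", "4KB", "1KB"
    valid: bool        # valid bit field 88
    # Descriptor record (fields 90-92)
    paddr: int         # physical address field 90
    attributes: dict = field(default_factory=dict)  # attributes field 92
```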
[0023] Each entry in the TLB memory 72 has a resource identifier 78 along with task identifier
80. Resource identifiers 78 and task identifiers 80 are not extension fields of the
virtual address (VA) but simply address qualifiers. A task identifier can be provided
by a task-ID register associated with each processor 23. Similarly, resource identifiers
can be provided by a resource-ID register associated with each processor 23. The task
identifier identifies all entries in a TLB belonging to a specific task. It can
be used, for instance, to invalidate all entries associated with a specific task without
affecting entries associated with other tasks. The resource ID is used because the
task identifier number on the different processors might not be related; therefore,
task related operations must be, in some cases, restricted to a resource-ID.
[0024] The resource identifier and task identifier registers are not necessarily part of
the core of each resource and can be located elsewhere in the system, such as a memory
mapped register for example, and associated with a resource bus. The only constraint
is that a task identifier register must be under the associated OS control and updated
during a context switch. The resource identifier registers must be set during the
system initialization.
[0025] Referring still to Figure 3, each TLB entry also includes a lock bit 82 to keep an
entry from being replaced. Examples of address attributes 92 are described in Table
1.
Table 1 - Memory Management Descriptors
| Execute Never |
provides access permission to protect data memory area from being executed. This information
can be combined with the access permission described above or kept separate. |
| Shared |
indicates that this page may be shared by multiple tasks across multiple processors. |
| Cacheability |
Various memory entities, such as an individual processor's cache and write buffer and the
shared cache and write buffer, are managed through the MMU descriptor. The options
included in the present example are as follows: Inner cacheable, Outer cacheable,
Inner write-through/write-back, Outer write-through/write-back, and Outer write allocate.
The terms inner and outer refer to levels of caches that are built in the system.
The boundary between inner and outer depends on the system, but inner will always include the L1
cache. In a system with three levels of caches, the inner corresponds to the L1 and L2 caches
and the outer to the L3 cache in existing processor systems. In the present example,
inner is the L1 cache and outer is the L2 cache. |
| Bufferability |
Describes activation or behavior of write buffer for write accesses in this page. |
| Endianism |
determines on a page basis the endianness of the transfer. |
[0026] An S/P field 86 specifies a section or page size. For example, a section may be 1
Mbyte in size, and an encoding allows page sizes of 64 Kbytes, 4 Kbytes and 1 Kbyte to be specified.
Naturally, the page size determines how many most significant (ms) address bits are
included in a check for an entry.
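The relationship between the S/P field and the compared address bits can be sketched as follows (an illustrative Python model; the offset-bit counts for the smaller sizes are assumptions consistent with the sizes named above):

```python
# Number of offset (least-significant) bits for each section/page size;
# the remaining most-significant bits take part in the match.
OFFSET_BITS = {"1MB": 20, "64KB": 16, "4KB": 12, "1KB": 10}

def tag(vaddr, sp):
    """Most-significant bits of vaddr compared for an entry of size sp.
    A larger page means fewer bits are checked."""
    return vaddr >> OFFSET_BITS[sp]
```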
[0027] A V field 88 indicates if an associated TLB cache entry is valid. V field 88 includes
several V-bits that are respectively associated with resource identifier field 78
to indicate if a valid resource identifier entry is present, task identifier field
80 to indicate if a valid task-ID entry is present, and virtual address field 84 to
indicate if a valid address entry is present.
[0028] The TLB control logic 70 may allocate the number of entries in the TLB memory 72
as desired. For example, it may be desirable to allocate a number of entries for each
processor 23. However, it is desirable that the TLB memory 72 is not segmented for
each processor 23, which would limit its flexibility. The TLB control logic, for example,
could dynamically allocate the TLB memory during operation of the system 20 for optimum
results. Alternatively, the TLB memory 72 could be allocated by task, or other criteria.
[0029] Each processor 23 accesses the TLB memory 72 by its resource identifier, a task identifier,
and a virtual address. If these three fields match a record in the TLB memory 72,
and the entry is valid, then there is a hit. In this case the physical address of
the page associated with the request is specified by the physical address field 90
of the matching record. The corresponding attributes field may specify attributes
of the page - for example, a page may be defined as read-only. In some applications,
it may be desirable to match only the resource identifier and the virtual address
or the task identifier and the virtual address.
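The match condition described in this paragraph can be sketched as follows (an illustrative Python model using dictionaries for entries; the key names are assumptions, not part of the disclosure):

```python
def lookup(entries, r_id, task_id, vaddr):
    """A hit occurs when the resource identifier, task identifier and virtual
    address all match a valid entry; the descriptor then supplies the physical
    address and the page attributes. Returns None on a miss."""
    for e in entries:
        if (e["valid"] and e["r_id"] == r_id
                and e["task_id"] == task_id and e["vaddr"] == vaddr):
            return e["paddr"], e["attributes"]
    return None  # miss: the TLB control signals that no match was found
```

A variant matching only resource identifier and virtual address, or only task identifier and virtual address, would simply drop the corresponding comparison.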
[0030] In the event of a miss, the TLB control 70 provides a signal that there is no match
in the TLB memory 72. In this case, the present invention can use either hardware
or software to determine the physical address, without limiting the address translation
to dedicated translation tables in the external memory 54, as is the case in the prior
art.
[0031] In the system of Figures 2 and 3, the TLB memory 72 is presented with information
that includes a resource identifier field, a task identifier field and a virtual address.
When a miss occurs, any of these fields can be used by the system 20 to generate a
translated address, either using hardware or software.
[0032] Figure 4 illustrates an example of using hardware translation logic 100 in conjunction
with the TLB control circuit 70 to determine a physical address. The virtual address,
resource identifier and task identifier are presented to a logic circuit. Responsive
to a miss in the L2 TLB memory 72, a control signal is received by the logic 100.
Optionally, additional translation control information is presented to the translation
logic 100 as well. The additional translation control information could be used to
identify a desired technique for accessing the external memory 54 to perform the translation.
[0033] For example, a certain task may have an associated translation table in the external
memory 54. Upon identifying a translation request as being related to that task, the
translation logic 100 could access the translation table associated with the task
to determine the physical address. The translation table associated with the task
would be used independently of the processor requesting the translation.
[0034] Similarly, some (or all) of the various processors 23 could have associated translation
tables in the external memory 54. Upon identifying a translation request as being
related to a certain processor, the translation logic 100 could access the translation
table associated with the processor to determine the physical address.
[0035] In addition to determining the physical address, the logic 100 could perform other
functions, such as checking for protection, and so on.
[0036] Figure 5 illustrates another example where software control is used to perform the
translation in the case where there is a miss in the shared TLB subsystem 48. In this
case, one of the processors 23 (the "main" processor), such as processor 26, handles
interrupts related to misses in the shared TLB subsystem 48. When an interrupt is
received from the TLB control circuit 70, the main processor performs a translation
routine. The techniques used by the software translation can be the same as those
that would be used by the hardware translation, but they are more flexible since they
can be modified easily by software, even dynamically during operation of the system
20.
[0037] The resource ID and task ID fields 78 and 80 provide significant functionality to
the shared TLB structure. For example, the task ID field 80 could be used to flush all
TLB entries associated with a certain task, including tasks executed by multiple processors
23, when the task is finished. Similarly, all TLB entries associated
with a given processor could be flushed based on the resource ID field 78.
Other functions, such as locking or invalidating all entries based on task ID, resource
ID or a combination of both fields could be performed as well. The combination of
the shared TLB and the task and resource ID fields provides significant improvements
to the performance of address translation in the device, since the allocation of the
shared TLB resources can be optimized dynamically based on processors and task needs.
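The flush operations described above can be sketched as follows (an illustrative Python model over dictionary entries; the optional-argument interface is an assumption, not part of the disclosure):

```python
def invalidate(entries, task_id=None, r_id=None):
    """Clear the valid bit of every entry matching the given task ID,
    resource ID, or both (a None criterion matches everything). The lock
    bit governs replacement, not explicit invalidation, so it is not
    consulted here."""
    for e in entries:
        if ((task_id is None or e["task_id"] == task_id)
                and (r_id is None or e["r_id"] == r_id)):
            e["valid"] = False
```

Calling `invalidate(entries, task_id=t)` flushes a finished task across all processors; adding `r_id` restricts the operation to one resource, as paragraph [0023] requires when task numbers on different processors are unrelated.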
[0038] The example described above provides significant advantages over the prior art. Specifically,
it is not necessary to have separate translation tables in external memory 54 for
each processor to support translation in the event of a miss in the shared TLB subsystem
48. The translation need not be performed using simple table walking, but can be based
on any number of criteria. Further, the translation can be handled with either software
or hardware, as desired.
[0039] Figure 6 illustrates a further example of a multiprocessor system 102. This example
is similar to Figure 2 in that multiple processors 23 are operating under control of multiple
operating systems (OS1 - OSk). In this further example, however, the multiple processors
23 are not coupled to the shared TLB 48 via the L2 traffic controller 22, but through
a separate bus structure 104 and an arbitrator 106. Arbitrator 106 arbitrates between
multiple accesses to the shared TLB 48.
[0040] This further example may be more appropriate in applications where the L2 data/instruction
bandwidth is heavily utilized or if there are a large number of µTLBs 36 accessing
the shared TLB 48, which may induce latency spikes in the mainstream traffic.
[0041] Figure 7 illustrates an embodiment of a multiprocessor system where the multiprocessor
system 110 includes multiple processors 23 controlled by a single operating system
(OS). The overall architecture is similar to that of Figure 6, using a separate bus
structure to access the shared TLB 48. As before, the processors can include microprocessors,
DSPs, coprocessors and DMA processors. A main processor, shown herein as processor
26, executes the operating system.
[0042] Also, as described above, each processor 23 is shown having a micro-TLB ("µTLB")
36, which stores a small number, typically on the order of two to three, of translations from
logical to physical addresses. The µTLB 36 may be used in some or all of the processors
coupled to the L2 traffic control circuit 22.
[0043] In operation, similar to the system 20 shown in connection with Figure 2, each processor
23 operates using virtual addresses, which must be translated to physical addresses
in order to access the external memory 54. When a processor 23 having a µTLB 36 needs
to access the external memory 54 or the L2 cache 46, it first checks its own µTLB
36 to see if the translation is currently stored in the µTLB 36. If so, the processor
23 retrieves the physical address information from the µTLB 36 and uses the physical
address information to access the external memory. In the preferred embodiment, each
processor 23 may access its µTLB 36 and the shared TLB subsystem 48 without intervention
of the main CPU 26.
[0044] If there is a miss in the µTLB 36, the memory access request is forwarded to the
L2 shared TLB subsystem 48. The L2 shared TLB subsystem 48 maintains entries for multiple
processors 23, as shown in greater detail in Figure 8, discussed below. The L2 traffic
control 22 determines priority from multiple requests to the shared TLB subsystem
48 from the various processors 23.
[0045] If there is a hit in the L2 shared TLB subsystem 48, the physical address information
is retrieved from the L2 shared TLB subsystem 48 and is used to access the external
memory 54. If there is a miss in the L2 shared TLB subsystem 48, the translation tables
in the external memory are searched, as described in greater detail herein below.
[0046] Figure 8 illustrates the L2 shared TLB subsystem 48 for a device using a single operating
system. The main difference between Figure 3, representing a multiple OS system, and
Figure 8, representing a single OS system, is that a resource identifier is normally
not needed in a single OS environment, since the OS is in control of virtual address
allocation, and therefore overlapping virtual addresses by different processors 23
for different tasks can be prevented by the OS. The L2 shared TLB subsystem 48 includes
a TLB control circuit 70, and a TLB memory 72.
[0047] As with the multiple OS system, either a hardware or software mechanism, such as
those shown in Figures 4 and 5 can be used to service misses in the shared TLB subsystem
48. In the single OS case, however, the resource identifier is generally not needed
for resolving the physical address.
[0048] It should be noted that the examples shown in Figures 2 and 6 and the embodiment shown
in Figure 7 provide the DMA processors and coprocessors with a µTLB 36 and access
to the shared TLB subsystem 48. This can provide a significant improvement in efficiency.
[0049] DMA processors and coprocessors act as slave processors to an associated master processor;
i.e., DMA processors and coprocessors receive instructions from another processor,
such as an MPU or a DSP, and process information at the command of the master processor.
In the preferred embodiment, slave processors are programmed to access the shared
TLB, if their function requires translation of virtual to physical addresses. When
a memory access is needed, the slave processor can access the shared TLB (assuming
a miss in its own µTLB 36, if present) without intervention of the associated master
processor.
[0050] This aspect of the invention is particularly important when real-time functions,
such as audio/video processing, are occurring. In this case, the master processor
may not be able to perform logical-to-physical address translation for a slave processor
without compromising the quality of the real-time operation. Accordingly, by providing
an infrastructure where the slave processors can access the shared TLB directly,
the translation occurs faster and without using the computing resources of the master
processor.
[0051] The present invention provides significant advantages over the prior art. The use
of small TLBs at the processor level improves speed without significant power increases.
The shared TLB subsystem can significantly reduce the area attributed to the second
level TLB system by using a single, larger TLB, as opposed to multiple smaller TLBs
corresponding to each processor. Allocation of the entries in the shared TLB can be
balanced depending upon each processor's requirements - generally, a DSP or a DMA
processor will normally require a smaller allocation than a microprocessor executing
a larger OS. The efficiency is also improved by allowing independent access to the
shared TLB by slave processors, such as DMA processors and coprocessors.
[0052] Figure 9 illustrates a block diagram of an embodiment of the invention wherein the
shared TLB subsystem 100 selects the processor that services a TLB miss condition.
Figure 9 illustrates an embodiment for a multiple OS device; however, the same basic
architecture could be used for a single OS system, as described in greater detail
below.
[0053] The architecture of the shared TLB subsystem is similar to that shown above, with
the addition of a TLB control circuit 102 including a router 104 which activates an
interrupt or exception on one of a plurality of processors responsive to a TLB miss
signal (Miss_Sh_TLB) in accordance with a procedure indicated by the output of multiplexer
106. Multiplexer 106 selects between the resource_ID value presented to the TLB memory
72 and a master_ID register 108 under control of a Master Select signal. The selected
processor executes a software handling scheme to determine the physical address,
as described above.
[0054] The processor that services the TLB miss can be selected in a number of ways. A first
approach would be to use the resource_ID presented to the shared TLB memory 72, i.e.,
the resource_ID of the processor that initiated the shared TLB access. Processors
that are under a given OS control (for example, a coprocessor which is a slave processor
to another master processor) would be assigned a common resource_ID value with its
master processor. In this case, if the slave processor initiated a shared TLB access
that resulted in a TLB miss, its master processor would receive an interrupt or exception.
In a single operating system device, the R_ID field 78 is normally not used and all
processors' resource_IDs would be set to that of a master processor. This approach could be
used to redirect the TLB miss handling routine to another processor in a fault-tolerant
system.
[0055] The second approach would direct the interrupt or exception to a processor indicated
by a preprogrammed value stored in the master_ID register 108. This approach could
be used with cooperative operating systems or with a single OS and would permit the
shared TLB to support operations based on resource_ID, such as flushing entries of the
shared TLB with a given resource_ID in power-down mode, while routing service requests
for TLB misses to one given processor.
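The selection performed by multiplexer 106 can be sketched as follows (an illustrative Python model; the boolean Master Select encoding is an assumption, not part of the disclosure):

```python
def route_miss(resource_id, master_id_register, master_select):
    """Model of multiplexer 106: choose which processor receives the
    interrupt or exception on a shared-TLB miss.
    master_select False -> the requester's resource_ID (first approach);
    master_select True  -> the preprogrammed master_ID register (second approach)."""
    return master_id_register if master_select else resource_id
```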
[0056] Although the Detailed Description of the invention has been directed to certain exemplary
embodiments, various modifications of these embodiments, as well as alternative embodiments,
will be suggested to those skilled in the art. The invention encompasses any modifications
or alternative embodiments that fall within the scope of the Claims.
1. A multiprocessor system (20), comprising:
a plurality of processing devices (23);
a shared translation lookaside buffer (48) coupled to said plurality of processing
devices (23) for storing information relating virtual addresses to physical addresses
in a main memory (54, 46), said shared translation lookaside buffer (48) generating
a fault signal if a received virtual address is not related to an entry in the shared
translation lookaside buffer (48); and
translation logic implemented partially in a software routine executed by one of said
processing devices (23) for determining physical address information responsive to
said fault signal, the received virtual address and criteria related to said received
virtual address, said criteria including an identifier (80) of a task transmitting
the received virtual address,
characterized in that said translation logic comprises circuitry (102) for selecting one of said processing
devices (23) for translating the received virtual address to a physical address.
2. The multiprocessor system of claim 1, wherein said criteria related to the received
virtual address further include an identifier (78) of the processing device (23) transmitting
the received virtual address.
3. The multiprocessor system of claim 2, further comprising
a cache memory (46); and
a level two (L2) traffic controller (22), said shared translation lookaside buffer
(48) coupled to said L2 traffic controller (22),
wherein said processing devices (23) receive information from said cache memory (46)
and said shared translation lookaside buffer (48) via said traffic controller (22).
4. The multiprocessor system of any of claims 1 to 3, further comprising:
a cache memory (46);
a level two (L2) traffic controller (22) for coupling said processing devices (23)
to said cache memory (46); and
a shared translation lookaside buffer bus (104) for coupling said processing devices
(23) to said shared translation lookaside buffer (48).
5. The multiprocessor system of claim 3 or claim 4, wherein two or more of said processing
devices (23) execute a respective operating system (OS1 - OSk).
6. The multiprocessor system of claim 4 when dependent on claim 1 or claim 3, wherein said
processing devices (23) are controlled by a single operating system.
7. The multiprocessor system of any preceding claim, wherein said identifying circuitry
(102) includes circuitry (104) for routing an interrupt to one of said processing
devices (23).
8. The multiprocessor system of any preceding claim, wherein said identifying circuitry
(102) includes circuitry (104) for routing an exception to one of said processing
devices (23).
9. The multiprocessor system of claim 8, wherein said identifying circuitry (102) includes
circuitry (106) for selecting between a resource identifier (resource_ID) that identifies
a requesting processing device (23) and a master identifier (108) that identifies
a master processing device.
10. A method of translating virtual addresses to physical addresses in a multiprocessor
system, comprising:
receiving virtual addresses from a plurality of processing devices (23) in a shared
translation lookaside buffer (48) for storing information relating virtual addresses
to physical addresses in a main memory (54);
generating a fault signal if a received virtual address is not related to an entry
in the shared translation lookaside buffer (48); and
executing a software routine by one of said processing devices (23) to determine physical
address information associated to said received virtual address responsive to said
fault signal, the received virtual address and criteria related to said received virtual
address, said criteria including an identifier (80) of a task transmitting the received
virtual address; and characterized by the step of selecting one of said processing devices for translating the received
virtual address to a physical address.
11. The method of claim 10, wherein said criteria related to the received virtual address
include an identifier (78) of the processing device (23) transmitting the received
virtual address.
12. The method of claim 10 or claim 11, wherein two or more of said processing devices
(23) execute a respective operating system.
13. The method of claim 10 or claim 11, wherein said processing devices (23)
are controlled by a single operating system.
14. The method of any of claims 10 to 13, wherein said selecting step includes routing an
interrupt to one of said processing devices (23).
15. The method of any of claims 10 to 13, wherein said selecting step includes routing an exception
to one of said processing devices (23).
16. The method of claim 15, wherein said selecting step includes selecting between a resource
identifier (resource_ID) that identifies a requesting processing device and a master
identifier (108) that identifies a master processing device.