(19)
(11)EP 2 483 783 B1

(12)EUROPEAN PATENT SPECIFICATION

(45)Mention of the grant of the patent:
05.07.2017 Bulletin 2017/27

(21)Application number: 10760977.8

(22)Date of filing:  22.09.2010
(51)Int. Cl.: 
G06F 12/10  (2016.01)
G06F 12/02  (2006.01)
(86)International application number:
PCT/EP2010/063950
(87)International publication number:
WO 2011/039084 (07.04.2011 Gazette  2011/14)

(54)

FACILITATING MEMORY ACCESSES

ERMÖGLICHUNG DES ZUGANGS ZU EINEM SPEICHER

FACILITATION D'ACCÈS À LA MÉMOIRE


(84)Designated Contracting States:
AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK SM TR

(30)Priority: 29.09.2009 US 569004

(43)Date of publication of application:
08.08.2012 Bulletin 2012/32

(73)Proprietor: International Business Machines Corporation
Armonk, NY 10504 (US)

(72)Inventors:
  • SHEIKH, Ali
    Markham, Ontario L6G 1C7 (CA)
  • GYURIS, Viktor
    Poughkeepsie, New York 12601 (US)
  • STEWART, Kirk, Andrew
    Poughkeepsie, New York 12601 (US)

(74)Representative: Williams, Julian David 
IBM United Kingdom Limited Intellectual Property Law Mailpoint 110
Hursley Park Winchester Hampshire SO21 2JN (GB)


(56)References cited:
WO-A1-2006/060198
US-A1- 2007 180 218
US-B1- 6 247 107
US-A1- 2007 005 933
US-A1- 2009 150 645
  
      
    Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).


    Description

    BACKGROUND



    [0001] This invention relates, in general, to processing within a computing environment, and in particular, to facilitating memory accesses within such an environment.

    [0002] Efficient memory access is important to the overall performance of a computing environment. Since physical memory is of a finite size, in many computing environments, virtual addresses are used to reference memory, allowing use of a larger address range than is physically available. Those virtual addresses are then translated from virtual addresses to absolute addresses, which correspond directly to physical locations in memory.

    [0003] To translate a virtual address to an absolute address, various operations may be used. As one example, in computer architectures that implement hierarchical page tables, a page walk of the hierarchical page tables is performed to locate the absolute address corresponding to the virtual address. In another example in which the computer architecture uses inverted page tables, a hashtable lookup of the inverted page tables is performed to locate the absolute address corresponding to the virtual address. These operations are very expensive, and since address translations are very common, these operations often have an impact on system performance.

    [0004] In an effort to reduce this impact on performance, techniques have been employed to improve address translation. One such technique is the use of a translation lookaside buffer (TLB), which caches the results of recent translations. This buffer is consulted before, for instance, a page walk is performed, and if an entry is found for the desired virtual address, it provides a direct mapping to the corresponding absolute address. This avoids repeating expensive page walks for commonly used translations, thereby improving performance. US2007/005933 discusses determining whether two addresses of the same type are to access the same unit of memory. However, it does not discuss how to handle the situation in which one address is an instruction address and the other is an operand address.

    BRIEF SUMMARY



    [0005] Translation lookaside buffers that are implemented in hardware can be efficiently implemented in associative memory, and are, therefore, very fast. However, software TLBs, which are used in certain environments, such as those that employ system virtual machines or emulators (i.e., software to create a virtual execution environment), are typically implemented as a hashtable. The cost of a hashtable lookup is less than that of a full page walk, but is still very expensive as it is performed at every memory access.
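
    As an illustration only, a software TLB of the kind described above might be sketched as a hashtable keyed by virtual page number. The class name `SoftwareTLB`, the `page_walk` callback and the 4 KiB page size are assumptions made for this sketch and are not taken from the specification.

```python
PAGE_SHIFT = 12                      # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class SoftwareTLB:
    """Hypothetical hashtable-backed TLB keyed by virtual page number."""

    def __init__(self, page_walk):
        # page_walk is a hypothetical, expensive fallback that maps a
        # virtual page number to an absolute page number.
        self._page_walk = page_walk
        self._entries = {}
        self.walks = 0               # number of full page walks performed

    def translate(self, virtual_address):
        vpn = virtual_address >> PAGE_SHIFT
        offset = virtual_address & PAGE_MASK
        frame = self._entries.get(vpn)       # the hashtable lookup
        if frame is None:                    # miss: fall back to the page walk
            self.walks += 1
            frame = self._page_walk(vpn)
            self._entries[vpn] = frame       # cache the result for later use
        return (frame << PAGE_SHIFT) | offset
```

    Even on a hit, the lookup in `translate` is still paid on every memory access, which is the cost the following paragraphs seek to avoid.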

    [0006] While TLBs reduce the cost of address translation by making some address translations more efficient, they do not reduce the total number of address translations performed. Thus, a capability is provided to eliminate (either partially or fully) the need for some address translations.

    [0007] Moreover, in an effort to further improve system performance, a capability is provided to eliminate certain access checking used to obtain access permissions.

    [0008] The shortcomings of the prior art are overcome and additional advantages are provided through the provision of a computer program product for facilitating memory accesses in a computing environment. The computer program product includes a storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method. The method includes, for instance, obtaining a first address to be used in a memory access, the first address indirectly usable in the memory access; determining whether address translation is to be omitted for the first address, wherein address translation includes at least one of a traversal of one or more data structures to locate therein a translated address for the first address or having the translated address in a translation cache that includes one or more translated addresses; and in response to determining that address translation is to be omitted, generating absent address translation a second address directly usable in the memory access, the second address corresponding to the first address and being unequal to the first address.

    [0009] As a further example, a computer program product for facilitating memory accesses is provided. The computer program product includes a storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method. The method includes, for instance, determining whether an instruction includes an indication that it is likely to access data on a same unit of memory in which the instruction resides; in response to the determining indicating the instruction includes such an indication, checking whether the instruction is to access data within the same unit of memory in which the instruction resides, the data having a first address associated therewith, the first address indirectly usable in accessing the data; and in response to the checking indicating the instruction and the data are within the same unit of memory, generating a second address for the data, the second address being directly usable in accessing the data, and wherein the generating includes using an address of the instruction to generate the second address of the data.

    [0010] Methods and systems relating to one or more aspects of the present invention are also described and claimed herein. Further, services relating to one or more aspects of the present invention are also described and may be claimed herein.

    [0011] Additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered a part of the claimed invention.

    BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS



    [0012] One or more aspects of the present invention are particularly pointed out and distinctly claimed as examples in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention are apparent from the following detailed description taken in conjunction with the accompanying drawings in which:

    FIG. 1 depicts one embodiment of a computing environment to incorporate and use one or more aspects of the present invention;

    FIG. 2 depicts one embodiment of a system architecture of the computing environment of FIG. 1, in accordance with an aspect of the present invention;

    FIG. 3 depicts further details of one embodiment of an emulator of the system architecture of FIG. 2, in accordance with an aspect of the present invention;

    FIG. 4A depicts further details of one embodiment of a central processing unit (CPU) implementation of the emulator of FIG. 3, in accordance with an aspect of the present invention;

    FIG. 4B depicts further details of one embodiment of interpreter code of the CPU implementation of FIG. 4A, in accordance with an aspect of the present invention;

    FIG. 5 depicts one embodiment of the logic used to omit address translations and/or access checks, in accordance with an aspect of the present invention;

    FIG. 6 depicts one embodiment of the logic used to indicate whether address translations and/or access checks may be omitted, in accordance with an aspect of the present invention; and

    FIG. 7 depicts one embodiment of a computer program product incorporating one or more aspects of the present invention.


    DETAILED DESCRIPTION



    [0013] In accordance with an aspect of the present invention, a facility is provided for omitting address translations in certain circumstances, thereby enhancing system performance. As an example, address translation for data (e.g., an operand) is omitted if the data resides on the same unit of memory (e.g., page) as the instruction accessing that data. In such a situation, address translation is not needed for the data; instead, the absolute address of the data is derived from the absolute address of the instruction. For instance, an offset is applied to the absolute address of the instruction to obtain the absolute address of the data.

    [0014] In addition to omitting address translations, in a further enhancement, access checking may also be omitted in certain circumstances. For instance, if the access checking has already been performed for the data, then it can be omitted. That is, if the access checking for the data has already been performed and has been successful (i.e., permission granted), then as long as the same page (or other unit of memory) is being accessed and the access permissions have not changed, future access checking for this data can be omitted. If, however, the permissions change or another page is to be accessed (as specified by an indicator as described below), access checking is to take place. In one embodiment, the omission of access checking is performed for instructions that access data on the same page as the instruction. However, in other embodiments, the omission of access checking is separate from the omission of address translations.

    [0015] One embodiment of a computing environment to incorporate and use one or more aspects of the present invention is described with reference to FIG. 1. In this example, a computing environment 100 is based on one architecture, which may be referred to as a native architecture, but emulates another architecture, which may be referred to as a guest architecture. As examples, the native architecture is the Power4 or PowerPC® architecture offered by International Business Machines Corporation, Armonk, New York, or an Intel® architecture offered by Intel Corporation; and the guest architecture is the z/Architecture® also offered by International Business Machines Corporation. Aspects of the z/Architecture® are described in "z/Architecture-Principles of Operation," IBM Publication No. SA22-7832-07, February 2009, which is hereby incorporated herein by reference in its entirety.

    [0016] Computing environment 100 includes, for instance, a native processor 102 (e.g., central processing unit (CPU)), a memory 104 (e.g., main memory), and one or more input/output (I/O) devices 106 coupled to one another via, for example, one or more buses 108 or other connections. As one example, processor 102 is part of a pSeries® server offered by International Business Machines Corporation (IBM®), Armonk, New York. IBM®, pSeries®, PowerPC® and z/Architecture® are registered trademarks of International Business Machines Corporation, Armonk, New York, USA. Intel® is a registered trademark of Intel Corporation. Other names used herein may be registered trademarks, trademarks or product names of International Business Machines Corporation or other companies.
    Native central processing unit 102 includes one or more native registers 110, such as one or more general purpose registers and/or one or more special purpose registers, used during processing within the environment. These registers include information that represents the state of the environment at any particular point in time.

    [0017] To provide emulation, the computing environment is architected to include an emulator (a.k.a., system virtual machine), a guest operating system and one or more guest applications. These architected features are further described with reference to FIG. 2.

    [0018] Referring to FIG. 2, one embodiment of a system architecture 200 of computing environment 100 is described. System architecture 200 includes, for instance, a plurality of implementation layers, which define the architected aspects of the environment. In this particular example, the layers include hardware 202, which is coupled to memory 204 and input/output devices and/or networks 206 via one or more interfaces and/or controllers; a host operating system 208; an emulator 210; a guest operating system 212; and one or more guest applications 214; as examples. One layer is coupled to at least one other layer via one or more interfaces. For instance, guest applications 214 are coupled to guest operating system 212 via at least one interface. Other interfaces are used to couple the other layers. Moreover, the architecture can also include other layers and/or interfaces. Various of the layers depicted in FIG. 2 are further described below.

    [0019] Hardware 202 is the native architecture of the computing environment and is based on, for instance, Power4, PowerPC®, Intel®, or other architectures. Running on the hardware is host operating system 208, such as AIX® offered by International Business Machines Corporation, or Linux. AIX® is a registered trademark of International Business Machines Corporation, Armonk, New York.

    [0020] Emulator 210 includes a number of components used to emulate an architecture that may differ from the native architecture. In this embodiment, the architecture being emulated is the z/Architecture® offered by IBM®, but other architectures may be emulated as well. The emulation enables guest operating system 212 (e.g., z/OS®, a registered trademark of International Business Machines Corporation) to execute on the native architecture and enables the support of one or more guest applications 214 (e.g., Z applications). Further details regarding emulator 210 are described with reference to FIG. 3.

    [0021] Referring to FIG. 3, emulator 210 includes a shared memory 300 coupled to one or more service processes 302, an input/output (I/O) implementation 304, and a central processing unit (CPU) implementation 306, each of which is described in further detail below.

    [0022] Shared memory 300 is a representation of a portion of memory in the host that is visible from service processes 302, I/O implementation 304, and CPU implementation 306. It is a storage area in which the independent processes (e.g., service processes, I/O implementation, CPU implementation) communicate by reading and storing data into the shared memory. As one example, the shared memory includes a plurality of regions including, for instance, system global information, CPU contexts and information, emulated main storage, emulated main storage keys, and subchannels (i.e., data structures that represent I/O devices).

    [0023] Service processes 302 include one or more processes used to create the CPUs and one or more other processes, as well as provide architected operator facilities, such as start, stop, reset, initial program load (IPL), etc. It may also provide other functions, such as displays or alteration of emulated system facilities, obtaining/freeing shared resources, other maintenance commands, etc.

    [0024] Input/output implementation 304 includes, for instance, one or more subchannel processes and an I/O controller used to communicate with I/O devices. The I/O controller is responsible for starting the subchannel processes and performing recovery.

    [0025] Central processing unit (CPU) implementation 306 is responsible for executing instructions and managing the processing. It includes a number of components, which are described with reference to FIGs. 4A-4B.

    [0026] Referring to FIG. 4A, CPU implementation 306 includes, for instance, interpreter code 400 used to fetch, translate and execute instructions; an architected co-processor 402 that aids in initial start-up and communication with the chip (e.g., Service Call Logical Processor (SCLP) processes); and timing facilities 404 that are responsible for timing functions of the emulator. Further details regarding interpreter code 400 are described with reference to FIG. 4B.

    [0027] Interpreter code 400 includes, for instance, an interpretation unit 420 coupled to a memory access unit 422, a CPU control 426, an asynchronous interruption handler 428 and a synchronous interruption handler 430.

    [0028] Interpretation unit 420 is responsible for obtaining one or more guest instructions from memory, providing native instructions for the guest instructions, and executing the native instructions. The guest instructions comprise software instructions (e.g., machine instructions) that were developed to be executed in an architecture other than that of native CPU 102. For example, the guest instructions may have been designed to execute on a z/Architecture® processor, but are instead being emulated on native CPU 102, which may be, for instance, a pSeries® server.

    [0029] In one example, the providing of the native instructions includes selecting a code segment in the emulator that is associated with the guest instruction. For instance, each guest instruction has an associated code segment in the emulator, which includes a sequence of one or more native instructions, and that code segment is selected to be executed. In a further example, the providing includes creating during, for instance, a translation process, a native stream of instructions for a given set of guest instructions. This includes identifying the functions and creating the equivalent native instructions.

    [0030] If an instruction includes a memory access, then memory access routines 422 are used to access shared memory 300. The memory access routines may use translation mechanisms, such as dynamic address translation (DAT) 432 or access register translation (ART) 434, to translate a virtual (or logical) address to an absolute address, which is then used to access the memory or may be further translated, if needed.

    [0031] In this embodiment, the processing within interpretation unit 420 is to be streamlined. Thus, if a more complex circumstance arises, such as a wait state, changing from one architecture level to another architecture level (e.g., z/Architecture® to ESA/390, etc.), control is transferred to CPU control 426, which handles the event and then returns control to interpretation unit 420.

    [0032] Further, if an interrupt occurs, then processing transitions from interpretation unit 420 to either asynchronous interruption handler 428, if it is an asynchronous interruption, or synchronous interruption handler 430, if it is a synchronous interruption. After the interrupt is handled, processing returns to interpretation unit 420. In particular, the interpretation unit monitors certain locations in shared memory and if a location has changed, it signifies an interrupt has been set by the CPU or I/O. Thus, the interpretation unit calls the appropriate interruption handler.

    [0033] To improve memory access, some architectures, such as the z/Architecture® offered by International Business Machines Corporation, use a translation lookaside buffer (TLB) to store addresses that have been translated by DAT or ART, as examples. Then, when a request is received for a unit of memory (e.g., a page) addressed by a translated address, the address is obtained from the cache without having to wait for the expensive translation to be performed again. While this improves system performance, performance can be further enhanced by omitting certain address translations altogether (i.e., not performing the translation at all for a particular address, as opposed to performing it earlier and saving the results).

    [0034] In one example, address translations are omitted for data (e.g., operands) when the data resides on the same unit of memory (e.g., page) as the instruction accessing the data. As used herein, address translation includes translating a first address (e.g., a virtual address) to a second address (e.g., an absolute address) using a translation operation, such as one or more of the following, as examples: a traversal of data structures to obtain the second address corresponding to the first address, including, but not limited to, a page walk of hierarchical page tables to obtain the second address or a hashtable lookup of inverted page tables to obtain the second address; having the second address in a translation lookaside buffer (or other cache) after having been previously translated; and/or using dynamic address translation or access register translation of the first address to obtain the second address. In one example, generation of the second address from a third address (e.g., by applying an offset to the third address), in which the third address is for a different entity (e.g., the instruction) than the first or second addresses (e.g., the data) is not considered address translation herein.

    [0035] One embodiment of the logic used to omit address translations is described with reference to FIG. 5. Further, FIG. 5 also includes one embodiment of the logic used to omit access checking. In one example, the logic of FIG. 5 is performed by, for instance, an emulator, and in particular, the interpreter code of the emulator. This logic is performed, for instance, when an instruction is executed in interpretative mode.

    [0036] Referring to FIG. 5, during execution within the emulator, an instruction pointer is obtained that indicates the next instruction to be executed. In this example, that pointer is an address that is indirectly usable to access memory (e.g., a virtual address), and therefore, that address is translated to obtain an address that is directly usable to access memory (e.g., the absolute address of the memory location at which the instruction is loaded), STEP 500. This translation is accomplished by, for instance, performing a page walk of hierarchical page tables, as described in the above-referenced Principles of Operation. Other techniques may also be used.

    [0037] The instruction is then fetched and decoded, STEP 502. During decoding, a determination is made as to whether the instruction accesses memory (e.g., a load, store), INQUIRY 504. If the instruction does not access memory, then the instruction is executed, STEP 506, and processing continues to the next instruction, STEP 500.

    [0038] However, if the instruction does access memory, INQUIRY 504, then a further determination is made as to whether the instruction includes an indicator that signifies that this instruction often accesses data (e.g., operands) that are on the same page as the instruction itself, INQUIRY 506. If the indicator (referred to herein as a translation omission mark) is not present, then the operand address is also translated in the same manner as the instruction (e.g., page walk; locate in TLB; etc.), STEP 508. On the other hand, if there is a translation omission mark in the instruction that provides a hint (not a guarantee) that the instruction often accesses data on the same page as the instruction, then a further determination is made as to whether the data is actually on the same page as the instruction, INQUIRY 510. In one example, this determination is made by comparing the virtual address of the instruction with the virtual address of the operand. For example, the address of the instruction includes a plurality of bits, a first portion of the bits indicating an address of the page including that instruction and a second portion of the bits (e.g., the last 12 bits for a 4k page) including an offset into the page. Similarly, the address of the operand also includes a plurality of bits, a first portion indicating an address of the page including that operand and a second portion including an offset into the page. Therefore, the first portion of the instruction address and the first portion of the operand address are compared.
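
    The comparison of the first portions of the two addresses can be sketched as follows. This is a minimal sketch assuming 4k pages, so that the last 12 bits are the offset; the function name `same_page` and its parameters are hypothetical.

```python
PAGE_SHIFT = 12  # assumption: 4k pages, so the low 12 bits are the page offset


def same_page(instruction_va, operand_va, page_shift=PAGE_SHIFT):
    """Compare the page portions (high-order bits) of two virtual addresses."""
    return (instruction_va >> page_shift) == (operand_va >> page_shift)
```

    For instance, `same_page(0x5000, 0x5FFF)` holds, while `same_page(0x5FFF, 0x6000)` does not, since the two addresses in the latter case fall on adjacent pages.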

    [0039] If the comparison indicates that the two are not on the same page, then the operand address is translated, STEP 508. However, if the operand and instruction are on the same page, then the translation can be skipped and the operand absolute address is generated from the absolute address of the instruction, STEP 512. This is accomplished by, for example, using an offset (e.g., instruction virtual address - operand virtual address is the offset) and adding that to (or in another example, subtracting it from) the instruction absolute address to obtain the absolute address of the operand.
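
    The offset arithmetic can be sketched as shown below, under the assumption that the same-page comparison has already succeeded. The function name `operand_absolute` and the sample addresses are hypothetical and chosen only for illustration.

```python
def operand_absolute(instruction_va, instruction_abs, operand_va):
    """Derive the operand absolute address without translating operand_va.

    Only meaningful when both virtual addresses lie on the same page.
    """
    # Offset as in the text: instruction virtual address minus operand
    # virtual address; it may be negative when the operand lies after
    # the instruction.
    offset = instruction_va - operand_va
    # Subtracting the offset from the instruction absolute address gives
    # the absolute address of the operand.
    return instruction_abs - offset
```

    With an instruction at virtual address 0x5010 whose absolute address is 0x9A010, an operand at virtual address 0x5800 is found at absolute address 0x9A800.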

    [0040] In a further aspect of the present invention, a determination is made as to whether access checking of the data may be omitted. That is, it is determined whether the access permissions for this instruction need to be checked or whether this check can be omitted. To this end, the instruction is checked to determine if there is an indicator (referred to herein as an access check mark) that indicates that access checking does not need to be performed, INQUIRY 514. Unlike the translation omission mark, this mark is not a hint, but a guarantee that the access check of the data does not need to be performed. If the access check mark is present, it indicates that the access check has already been performed for the data, and therefore, does not need to be repeated. The instruction has the permissions it needs to access the data. Thus, the instruction is simply executed, STEP 506. However, if the access check mark is not in the instruction, then a determination is made as to whether the operand access is permitted, INQUIRY 516. This is determined by checking, for instance, an access data structure that indicates the permissions. If access is permitted, then the instruction is executed, STEP 506. If not, then an exception is raised, STEP 518.

    [0041] Returning to STEP 508, after the operand address is translated, a determination is made as to whether the translation is valid, INQUIRY 520. If not, an exception is raised, STEP 518. Otherwise, processing continues with INQUIRY 516, in which it is determined whether operand access is permitted. If not, an exception is raised; otherwise, the instruction is executed, STEP 506. This completes processing.
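
    The operand-handling portion of the flow described above may be sketched as follows. This is a simplified illustration, not the claimed implementation: the dictionary keys used for the two marks, the `translate` and `access_permitted` callbacks, the `AccessException` class and the 4k page size are all assumptions of this sketch.

```python
PAGE_SHIFT = 12  # assumption: 4k pages


class AccessException(Exception):
    """Hypothetical stand-in for the exception raised at STEP 518."""


def resolve_operand(insn, insn_va, insn_abs, operand_va,
                    translate, access_permitted):
    """Return the operand absolute address, or raise AccessException.

    insn is a dict that may carry the two (hypothetically named) marks;
    translate(va) returns an absolute address or None if invalid;
    access_permitted(va) returns True if the operand access is allowed.
    """
    same = (insn_va >> PAGE_SHIFT) == (operand_va >> PAGE_SHIFT)
    if insn.get("translation_omission_mark") and same:
        # STEP 512: derive the address from the instruction's address.
        operand_abs = insn_abs + (operand_va - insn_va)
        if insn.get("access_check_mark"):       # INQUIRY 514
            return operand_abs                  # access check omitted
    else:
        operand_abs = translate(operand_va)     # STEP 508
        if operand_abs is None:                 # INQUIRY 520
            raise AccessException("invalid translation")
    if not access_permitted(operand_va):        # INQUIRY 516
        raise AccessException("operand access not permitted")
    return operand_abs
```

    In this sketch the access check mark is only consulted on the path in which translation was omitted, mirroring the flow in which a translated operand always proceeds to the permission determination.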

    [0042] One embodiment of the details associated with placing one or more of the marks in the instruction is described with reference to FIG. 6. In this example, the emulator identifies traces of code (e.g., sequences of instructions in guest code) and compiles those traces of code to provide native code that corresponds to the guest code. The compiler is, for instance, a Just-in-Time (JIT) compiler, such as the JAVA Just-in-Time (JIT) compiler offered by International Business Machines Corporation, Armonk, New York. JAVA is a trademark of Sun Microsystems, Inc., Santa Clara, California.

    [0043] Referring to FIG. 6, in one example, guest code is executed, STEP 600. The code can be interpreted code or compiled code, as described below. If interpreted code, then in response to encountering an instruction to be executed, the instruction is executed, as described with reference to FIG. 5. If the code is compiled code, then the native code generated based on the logic of FIG. 5 is executed, as described below.

    [0044] Continuing with FIG. 6, at a predefined time, a determination is made as to whether a trace is to be recorded, INQUIRY 602. If not, then guest code continues to be executed. However, if a trace is to be recorded, then processing continues with recording the trace, STEP 604. This includes identifying sequences of instructions in guest code. During the recording of the trace, one or more of the marks may be inserted into one or more instructions, STEP 606. For instance, in response to encountering a new instruction that is to be placed in the trace, a determination is made as to whether the instruction has previously accessed operands on the same page as the instruction. This information is maintained, for instance, in a data structure. In particular, during previous executions of the instruction, the instruction was profiled to determine if it often accessed data on the same page as the instruction. For example, a count may be kept of the number of times the instruction executed and the number of times it accessed data on the same page. If, as an example, it is over a certain user defined percentage (e.g., 50% or any other desired percentage), it would be indicated as often accessing data on the same page as the instruction. Many other examples may also be provided to determine whether the instruction often accesses data on the same page as the instruction. The example provided above is only one example. The determination is very implementation specific and tunable by the user (or heuristically by the system) based on the environment and other factors. The exact number is not important to one or more aspects of the present invention.

    [0045] In response to a determination that the instruction often accesses data on the same page as the instruction, the data structure provides an indication of such, and based on this information, the translation omission mark is placed in the instruction.
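
    One possible form of the profiling data structure is sketched below. The class name `SamePageProfile`, its methods and the default threshold of 50% are assumptions for illustration; as noted above, the exact threshold is tunable and not important to the technique.

```python
from collections import defaultdict


class SamePageProfile:
    """Hypothetical per-instruction counters used to decide on the mark."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold              # user-defined fraction
        self._executions = defaultdict(int)     # times each instruction ran
        self._same_page = defaultdict(int)      # times operand was on its page

    def record(self, insn_va, operand_va, page_shift=12):
        """Update the counters after one execution of the instruction."""
        self._executions[insn_va] += 1
        if (insn_va >> page_shift) == (operand_va >> page_shift):
            self._same_page[insn_va] += 1

    def should_mark(self, insn_va):
        """True if the same-page fraction is over the threshold."""
        runs = self._executions.get(insn_va, 0)
        if runs == 0:
            return False                        # never profiled: no mark
        return self._same_page.get(insn_va, 0) / runs > self.threshold
```

    During trace recording, an emulator following this sketch would call `should_mark` for each instruction placed in the trace and insert the translation omission mark when it returns true.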

    [0046] Similarly, a determination is made as to the access checking. If it is determined that the access checking has already been performed for this instruction, and thus, can be omitted, then the access check mark is also placed in the instruction. It should be noted that for each instruction in the trace, one, both or none of the marks may be inserted depending on the situation.

    [0047] When the trace is finished being recorded, which is user defined, the trace is compiled to produce native code, STEP 608. During this compilation, the marks are checked and translations are omitted when possible. For instance, when an instruction is encountered during compilation, a determination is made as to whether the translation omission mark is in the instruction. If so, the address comparison is performed (as in INQUIRY 510 of FIG. 5) and if the comparison indicates the same page, the operand absolute address is generated absent any address translation of the operand virtual address. Further, a determination is made as to whether the access check mark is in the instruction. If so, the access checks are omitted. By performing these checks during compilation, the compiled native code codifies these omissions.

    [0048] In response to compiling the trace, the trace is inserted into a code cache 610, which is a collection of native code that corresponds to a portion of code in guest memory. When these portions of code are encountered in the future, the compilation step can be skipped and the native code can be directly executed.

    [0049] The native code corresponds directly to guest code and to guest machine state. Therefore, if the machine state changes, then the cached version of native code is no longer valid because it does not reflect the machine state. In such an event, the code in the cache is removed. This is performed by a trace invalidator 612. The trace invalidator is a monitor that tracks changes in the system. If access permissions or other attributes of the guest machine change, it invalidates the code in the cache.
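    The interplay of the code cache and the trace invalidator might be sketched as follows. This is a simplified model under stated assumptions: native code is represented as a string, the cache is keyed by guest address, and any guest state change clears the whole cache. CodeCache, TraceInvalidator and their methods are hypothetical names.

```python
from typing import Dict, Optional


class CodeCache:
    """Maps a guest code address to its compiled native code (simplified)."""

    def __init__(self) -> None:
        self._native_by_guest_addr: Dict[int, str] = {}

    def insert(self, guest_addr: int, native_code: str) -> None:
        self._native_by_guest_addr[guest_addr] = native_code

    def lookup(self, guest_addr: int) -> Optional[str]:
        """Return cached native code, or None if compilation is still needed."""
        return self._native_by_guest_addr.get(guest_addr)

    def invalidate(self) -> None:
        """Remove all cached native code."""
        self._native_by_guest_addr.clear()


class TraceInvalidator:
    """Monitor that drops cached code when guest machine state changes."""

    def __init__(self, cache: CodeCache) -> None:
        self._cache = cache

    def on_guest_state_change(self) -> None:
        # Cached code encodes omitted translations and access checks, so it
        # can no longer be trusted once permissions or mappings change.
        self._cache.invalidate()
```

    A finer-grained design could invalidate only the traces that depend on the changed page; the coarse version is shown for brevity.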

    [0050] Described in detail above is a capability for omitting address translations and/or access checks in certain circumstances. For example, an emulator can compare the virtual address of an instruction operand's memory reference with that of the instruction performing the reference. If the two are on the same page, then the emulator can reuse the result of the instruction fetch address translation when forming the memory reference's absolute address, thereby avoiding a second translation, including a TLB lookup.
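    A minimal sketch of the comparison and address generation follows, assuming 4 KiB pages. The offset is taken as the instruction virtual address minus the operand virtual address, as recited in claim 7, and is subtracted from the instruction absolute address. The translate callback stands in for the full translation path (TLB lookup or page walk); it and the function names are hypothetical.

```python
from typing import Callable

PAGE_SIZE = 4096               # assumption: 4 KiB pages
PAGE_MASK = ~(PAGE_SIZE - 1)   # keeps the page-number bits of an address


def same_page(instr_vaddr: int, operand_vaddr: int) -> bool:
    """Compare the page-number portions of the two virtual addresses."""
    return (instr_vaddr & PAGE_MASK) == (operand_vaddr & PAGE_MASK)


def operand_absolute_address(
    instr_vaddr: int,
    instr_abs: int,
    operand_vaddr: int,
    translate: Callable[[int], int],
) -> int:
    """Derive the operand absolute address, skipping translation if possible."""
    if same_page(instr_vaddr, operand_vaddr):
        # Same page: reuse the result of the instruction fetch translation.
        offset = instr_vaddr - operand_vaddr
        return instr_abs - offset
    # Different page: fall back to the normal translation path.
    return translate(operand_vaddr)
```

    For example, with an instruction at virtual 0x5010 mapped to absolute 0x9A010, an operand at virtual 0x5FF0 resolves to absolute 0x9AFF0 without invoking translate.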

    [0051] Adding an address comparison to all instruction operand memory accesses adds overhead, however. This overhead is worthwhile if the check usually succeeds (i.e., the instruction and operand addresses are on the same page), as savings resulting from omitting the operand address translation offset the added cost of the check. If the check usually fails, however, the added cost is not amortized. It is therefore desirable to only perform the address check on instructions where it will generally succeed. This can be accomplished in an emulator by marking instructions when they have been found to access memory on the same page on one or more previous executions. When a marked instruction is encountered on future executions, the operand address translation can be omitted if the address check succeeds. This approach is particularly suited to emulators that include JIT compilers; the emulator could mark instructions before passing them to the compiler. The compiler could then generate code with or without address checks and translations accordingly.
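    The trade-off can be made concrete with a simple cost model. The symbols are assumptions introduced here for illustration: p is the fraction of executions in which the operand lies on the same page as the instruction, C_cmp is the cost of the address comparison, and C_xlat is the cost of a full operand address translation. The small cost of forming the absolute address from the instruction address is ignored.

```latex
\begin{align*}
\text{cost with check}    &= C_{\mathrm{cmp}} + (1 - p)\, C_{\mathrm{xlat}} \\
\text{cost without check} &= C_{\mathrm{xlat}} \\
C_{\mathrm{cmp}} + (1 - p)\, C_{\mathrm{xlat}} < C_{\mathrm{xlat}}
  &\iff p > \frac{C_{\mathrm{cmp}}}{C_{\mathrm{xlat}}}
\end{align*}
```

    Under this model the check pays off only for instructions whose same-page fraction exceeds the ratio of comparison cost to translation cost, which is why marking is restricted to instructions that have usually accessed their own page.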
    The address translation check allows an emulator to derive an instruction operand's absolute address from the absolute address of the instruction itself; however, it does not remove the need for an access check to be performed on the instruction operand. The translation of the instruction address verifies that a page has execute permission, but it does not check if the page also has write permission (nor does it check for read permission on some architectures). Therefore, the address translation check is unable to remove access checks for many operand accesses.

    [0052] A further refinement to the address translation check can enable omission of the access checks for instructions that access data on the same page, and thereby provide further benefits. When an instruction is marked as accessing data on the same page, a check can be made as to whether the page allows read and/or write permission (note that the read check is not necessary if execute implies read). If it does, a separate mark is placed on the instruction. Both translations and access checks can be omitted for instructions so marked.
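    The refinement can be sketched as follows, assuming a simple per-page permission record and a flag describing whether the architecture treats execute permission as implying read permission. PagePermissions, MarkedInstruction and place_access_check_mark are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PagePermissions:
    """Permissions of the page holding the instruction (hypothetical record)."""
    read: bool
    write: bool
    execute: bool = True  # the instruction was fetched from this page


@dataclass
class MarkedInstruction:
    is_store: bool
    translation_omission_mark: bool = False
    access_check_mark: bool = False


def place_access_check_mark(
    instr: MarkedInstruction,
    page: PagePermissions,
    execute_implies_read: bool = False,
) -> bool:
    """Set the access check mark if the page already permits the access."""
    if not instr.translation_omission_mark:
        # Only instructions marked as same-page accessors are considered.
        return instr.access_check_mark
    if instr.is_store:
        allowed = page.write
    else:
        # The read check is unnecessary where execute implies read.
        allowed = page.read or (execute_implies_read and page.execute)
    instr.access_check_mark = allowed
    return allowed
```

    An instruction that ends up with both marks needs neither a translation nor an access check when its operand is on the instruction's own page.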

    [0053] As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "circuit," "module" or "system". Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.

    [0054] Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

    [0055] Referring now to FIG. 7, in one example, a computer program product 700 includes, for instance, one or more computer readable media 702 to store computer readable program code means or logic 704 thereon to provide and facilitate one or more aspects of the present invention.

    [0056] Program code embodied on a computer readable medium may be transmitted using an appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

    [0057] Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language, such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

    [0058] Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

    [0059] These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

    [0060] The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

    [0061] The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. In addition to the above, one or more aspects of the present invention may be provided, offered, deployed, managed, serviced, etc. by a service provider who offers management of customer environments. For instance, the service provider can create, maintain, support, etc. computer code and/or a computer infrastructure that performs one or more aspects of the present invention for one or more customers. In return, the service provider may receive payment from the customer under a subscription and/or fee agreement, as examples. Additionally or alternatively, the service provider may receive payment from the sale of advertising content to one or more third parties.

    [0062] In one aspect of the present invention, an application may be deployed for performing one or more aspects of the present invention. As one example, the deploying of an application comprises providing computer infrastructure operable to perform one or more aspects of the present invention.

    [0063] As a further aspect of the present invention, a computing infrastructure may be deployed comprising integrating computer readable code into a computing system, in which the code in combination with the computing system is capable of performing one or more aspects of the present invention.

    [0064] As yet a further aspect of the present invention, a process for integrating computing infrastructure comprising integrating computer readable code into a computer system may be provided. The computer system comprises a computer readable medium, in which the computer medium comprises one or more aspects of the present invention. The code in combination with the computer system is capable of performing one or more aspects of the present invention.

    [0065] Although various embodiments are described above, these are only examples. For example, computing environments of other architectures can incorporate and use one or more aspects of the present invention. Further, in other embodiments, the environment may not be an emulated environment. Additionally, the units of memory can be other than pages; and/or traversal can include other mechanisms than page walks or hashtable lookups. Moreover, the addresses can be other than virtual or absolute addresses, and the data is not limited to operands. Many other variations also exist.

    [0066] Many types of computing environments can benefit from one or more aspects of the present invention. As an example, a data processing system suitable for storing and/or executing program code is usable that includes at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements include, for instance, local memory employed during actual execution of the program code, bulk storage, and cache memory which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.

    [0067] Input/Output or I/O devices (including, but not limited to, keyboards, displays, pointing devices, DASD, tape, CDs, DVDs, thumb drives and other memory media, etc.) can be coupled to the system either directly or through intervening I/O controllers. Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the available types of network adapters.

    [0068] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.

    [0069] The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below, if any, are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.


    Claims

    1. A method of facilitating memory accesses in a computing environment by executing (506) an instruction, responsive to fetching (502) the instruction from a location determined by an absolute address of a page of memory, the method comprising the steps of:

    obtaining a first address to be used in a memory access by the instruction, said first address being an address of data and indirectly usable in the memory access, the data comprising an operand;

    determining whether address translation is to be omitted for the first address, wherein address translation includes at least one of a traversal of one or more data structures to locate therein a translated address for the first address or having the translated address in a translation cache that includes one or more translated addresses, by inquiring (506) into whether the instruction is marked (506) with a translation omission mark indicating that it is likely the instruction and the data are within a same unit of memory, and in response to the inquiring indicating that the translation omission mark is present, checking (510) whether the operand to be accessed by the instruction is in the same unit of memory as the instruction, the checking comprising comparing a portion of an instruction address of the instruction with a portion of the first address of the operand, wherein responsive to the two compared portions being equal, the instruction and the operand are determined to be in the same unit of memory and therefore, address translation is to be omitted;

    in response to determining that address translation is to be omitted, generating (512), based on an absolute address of the instruction, a second address for the operand directly usable in the memory access, said second address corresponding to the first address and being unequal to the first address; and

    deciding whether access checking is to be omitted for the data to be accessed using the generated second address, wherein the deciding comprises inquiring (514) into whether the instruction to access the data has present in the instruction an access check mark indicating that the access checking can be omitted, wherein access checking is omitted, in response to the inquiring indicating the access check mark is present in the instruction.


     
    2. The method of claim 1, wherein the first address is a virtual address and the second address is an absolute address corresponding to the virtual address.
     
    3. The method of claim 2, wherein the absolute address is an absolute address for data, and wherein the generating comprises generating the absolute address of the data based on an absolute address of an instruction to access the data.
     
    4. The method of claim 1, wherein the method further comprises deciding whether access checking is to be omitted for data to be accessed using the second address.
     
    5. The method of claim 4, wherein the deciding comprises inquiring (514) into whether an instruction to access the data has an indicator indicating that the access checking can be omitted, wherein access checking is omitted, in response to the inquiring indicating the indicator is present.
     
    6. The method of claim 4, wherein the method further comprises including in an instruction to access the data at least one of an access checking indicator usable in deciding whether access checking is to be omitted for the data and a translation omission indicator usable in determining whether address translation is to be omitted for the data, wherein the instruction is included in a sequence of guest code, and wherein compiling of the sequence of guest code produces native code that omits address translation, in response to the translation omission indicator being present and a determination that the data is within a same unit of memory as the instruction, and omits access checking, in response to the access checking indicator being present.
     
    7. The method of claim 1, wherein the first address is a virtual address and the second address is an absolute address of the operand, and the step of generating the second address comprises using an offset with the absolute address of the instruction, wherein the offset comprises a virtual address of the instruction minus the virtual address of the operand.
     
    8. The method of claim 7 wherein the generating comprises one of adding the offset to the absolute address of the instruction or subtracting the offset from the absolute address of the instruction to obtain the absolute address of the operand.
     
    9. The method of claim 1, wherein the determining is performed by an emulator (210) executing within a processor (102).
     
    10. A system comprising means adapted for carrying out all the steps of the method according to any preceding method claim.
     
    11. A computer program comprising instructions for carrying out all the steps of the method according to any preceding method claim, when said computer program is executed on a computer system.
     


    Ansprüche

    1. Verfahren zum Ermöglichen eines Speicherzugriffs in einer Datenverarbeitungsumgebung durch Ausführen (506) einer Anweisung in Reaktion auf ein Abrufen (502) der Anweisung aus einer Speicherstelle, die durch eine absolute Adresse einer Speicherseite festgelegt ist, wobei das Verfahren die Schritte umfasst:

    Erhalten einer ersten Adresse zum Verwenden bei einem Speicherzugriff durch die Anweisung, wobei es sich bei der ersten Adresse um eine Datenadresse handelt, die indirekt bei dem Speicherzugriff verwendbar ist, wobei die Daten einen Operanden umfassen;

    Ermitteln, ob die Adressübersetzung für die erste Adresse entfällt, wobei die Adressübersetzung ein Durchsuchen einer oder mehrerer Datenstrukturen zum Finden einer übersetzten Adresse darin für die erste Adresse oder das Aufbewahren der übersetzten Adresse in einem Übersetzungs-Zwischenspeicher beinhaltet, der eine oder mehrere übersetzte Adressen beinhaltet, wobei das Festlegen durch Abfragen (506) ausgeführt wird, ob die Anweisung (506) mit einem Übersetzungs-Entfallkennzeichner markiert ist, der anzeigt, dass sich die Anweisung und die Daten wahrscheinlich in derselben Speichereinheit befinden, und in Reaktion darauf, dass das Abfragen den vorhandenen Übersetzungs-Entfallkennzeichner anzeigt, Prüfen (510), ob sich der Operand, auf den durch die Anweisung zugegriffen wird, in derselben Speichereinheit wie die Anweisung befindet, wobei das Prüfen ein Vergleichen eines Teils einer Anweisungsadresse der Anweisung mit einem Teil der ersten Adresse des Operanden umfasst, wobei in Reaktion darauf, dass die zwei verglichenen Teile übereinstimmen, für die Anweisung und den Operanden festgestellt wird, dass sie sich in derselben Speichereinheit befinden und somit die Adressübersetzung entfällt;

    in Reaktion auf das Ermitteln, dass die Adressübersetzung entfällt, Erzeugen (512) einer zweiten, direkt in dem Speicherzugriff verwendbaren Adresse für den Operanden auf der Grundlage einer absoluten Anweisungsadresse, wobei die zweite Adresse zu der ersten Adresse gehört und nicht mit der ersten Adresse übereinstimmt; und

    Entscheiden, ob die Zugriffsprüfung für die Daten, auf die mit der erzeugten zweiten Adresse zugegriffen werden soll, entfällt, wobei das Entscheiden ein Abfragen (514) umfasst, ob die Anweisung zum Zugreifen auf die Daten in der Anweisung einen Zugriff-Prüfkennzeichner aufweist, der anzeigt, dass die Zugriffsprüfung verzichtbar ist, wobei die Zugriffsprüfung in Reaktion darauf entfällt, dass das Abrufen anzeigt, dass die Zugriffs-Prüfkennzeichnung in der Anweisung vorhanden ist.


     
    2. Verfahren nach Anspruch 1, wobei es sich bei der ersten Adresse um eine virtuelle Adresse und bei der zweiten Adresse um eine absolute Adresse handelt, die zu der virtuellen Adresse gehört.
     
    3. Verfahren nach Anspruch 2, wobei es sich bei der absoluten Adresse um eine absolute Datenadresse handelt und wobei der Schritt des Erzeugens ein Erzeugen der absoluten Datenadresse auf der Grundlage einer absoluten Adresse einer Anweisung zum Zugreifen auf die Daten handelt.
     
    4. Verfahren nach Anspruch 1, wobei das Verfahren ferner ein Entscheiden umfasst, ob die Zugriffsprüfung für die Daten, auf die mit der zweiten Adresse zugegriffen werden soll, entfällt.
     
    5. Verfahren nach Anspruch 4, wobei das Entscheiden ein Abfragen (514) umfasst, ob eine Anweisung zum Zugreifen auf die Daten eine Angabe aufweist, dass die Zugriffsprüfung verzichtbar ist, wobei die Zugriffsprüfung in Reaktion darauf entfällt, dass die Angabe vorhanden ist.
     
    6. Verfahren nach Anspruch 4, wobei das Verfahren ferner umfasst, dass eine Anweisung zum Zugreifen auf die Daten einen Zugriffs-Prüfkennzeichner, der zum Entscheiden verwendbar ist, ob die Zugriffsprüfung für die Daten entfällt, oder einen Übersetzungsentfall-Kennzeichner beinhaltet, der zum Ermitteln verwendbar ist, ob die Adressübersetzung für die Daten entfällt, wobei die Anweisung zu einer Gast-Code-Folge gehört, und wobei in Reaktion darauf, dass der Übersetzungsentfall-Kennzeichner vorhanden ist und ein Feststellen, dass sich die Daten in derselben Speichereinheit wie die Anweisung befinden, ein Kompilieren der Gast-Code-Folge Maschinen-Code erzeugt, der die Adressübersetzung ausspart, und in Reaktion darauf, dass der Zugriffs-Prüfkennzeichner vorhanden ist, die Zugriffsprüfung auslässt.
     
    7. Verfahren nach Anspruch 1, wobei es sich bei der ersten Adresse um eine virtuelle Adresse und bei der zweiten Adresse um eine absolute Adresse des Operanden handelt, und wobei der Schritt des Erzeugens der zweiten Adresse ein Verwenden eines Versatzes mit der absoluten Anweisungsadresse umfasst, wobei der Versatz die Differenz aus einer virtuellen Adresse der Anweisung und der virtuellen Adresse des Operanden umfasst.
     
    8. Verfahren nach Anspruch 7, wobei das Erzeugen ein Addieren des Versatzes zu der absoluten Adresse der Anweisung oder ein Subtrahieren des Versatzes von der absoluten Adresse der Anweisung umfasst, um die absolute Adresse des Operanden zu erhalten.
     
    9. Verfahren nach Anspruch 1, wobei das Ermitteln durch einen Emulator (210) durchgeführt wird, der in einem Prozessor (102) ausgeführt wird.
     
    10. System, das geeignete Mittel zum Durchführen aller Schritte des Verfahrens nach einem der vorhergehenden Verfahrensansprüche umfasst.
     
    11. Computer-Programm, das Anweisungen zum Durchführen aller Schritte des Verfahrens nach einem der vorhergehenden Verfahrensansprüche umfasst, wenn das Computer-Programm auf einem Computer-System ausgeführt wird.
     


    Revendications

    1. Procédé pour faciliter des accès à une mémoire dans un environnement informatique en exécutant (506) une instruction, en réponse à l'extraction (502) de l'instruction d'un emplacement déterminé par une adresse absolue d'une page de mémoire, le procédé comprenant les étapes consistant à :

    obtenir une première adresse à utiliser dans un accès à une mémoire par l'instruction, ladite première adresse étant une adresse de données et utilisable indirectement dans l'accès à une mémoire, les données comprenant un opérande ;

    déterminer si une traduction d'adresse doit être omise pour la première adresse, dans lequel une traduction d'adresse inclut une traversée d'une ou plusieurs structures de données pour localiser dans celles-ci une adresse traduite pour la première adresse et/ou ayant l'adresse traduite dans une mémoire cache de traduction qui inclut une ou plusieurs adresses traduites, en interrogeant (506) si l'instruction est marquée (506) avec une marque d'omission de traduction indiquant qu'il est probable que l'instruction et les données soient à l'intérieur d'une même unité de mémoire, et en réponse à l'interrogation indiquant la présence de la marque d'omission de traduction, vérifier (510) si l'opérande devant être accédé par l'instruction est dans la même unité de mémoire que l'instruction, la vérification comprenant de comparer une portion d'une adresse d'instruction de l'instruction avec une portion de la première adresse de l'opérande, dans lequel en réponse aux portions comparées étant égales, l'instruction et l'opérande sont déterminés être dans la même unité de mémoire et donc, une traduction d'adresse soit être omise ;

    en réponse à la détermination que la traduction d'adresse doit être omise, générer (512), sur la base d'une adresse absolue de l'instruction, une seconde adresse pour l'opérande directement utilisable dans l'accès à une mémoire, ladite seconde adresse correspondant à la première adresse et étant inégale à la première adresse ; et

    décider si une vérification d'accès doit être omise pour les données à accéder en utilisant la seconde adresse générée, la décision comprenant d'interroger (514) si l'instruction d'accès aux données présente dans l'instruction une marque de vérification d'accès indiquant que la vérification d'accès peut être omise, une vérification d'accès étant omise, en réponse à l'interrogation indiquant que le marque de vérification d'accès est présente dans l'instruction.


     
    2. Procédé selon la revendication 1, dans lequel la première adresse est une adresse virtuelle et la seconde adresse est une adresse absolue correspondant à l'adresse virtuelle.
     
    3. Procédé selon la revendication 2, dans lequel l'adresse absolue est une adresse absolue pour des données, et dans lequel la génération comprend de générer l'adresse absolue des données sur la base d'une adresse absolue d'une instruction d'accès aux données.
     
    4. Procédé selon la revendication 1, dans lequel le procédé comprend en outre de décider si une vérification d'accès doit être omise pour des données à accéder en utilisant la seconde adresse.
     
    5. Procédé selon la revendication 4, dans lequel la décision comprend d'interroger (514) si une instruction d'accès aux données possède un indicateur indiquant que la vérification d'accès peut être omise, une vérification d'accès étant omise en réponse à l'interrogation indiquant que l'indicateur est présent.
     
    6. Procédé selon la revendication 4, dans lequel le procédé comprend en outre d'inclure dans une instruction d'accès aux données un indicateur de vérification d'accès utilisable pour décider si une vérification d'accès doit être omise pour les données et/ou un indicateur d'omission de traduction utilisable pour déterminer si une traduction d'adresse doit être omise pour les données, l'instruction étant incluse dans une séquence de code invité, et une compilation de la séquence de code invité produisant un code natif qui omet une traduction d'adresse, en réponse à la présence de l'indicateur d'omission de traduction et une détermination que les données sont à l'intérieur d'une même unité de mémoire que l'instruction, et omet une vérification d'adresse, en réponse à la présence de l'indicateur de vérification d'accès.
     
    7. Procédé selon la revendication 1, dans lequel la première adresse est une adresse virtuelle et l'étape de génération de la seconde adresse comprend d'utiliser un décalage avec l'adresse absolue de l'instruction, dans lequel le décalage comprend une adresse virtuelle de l'instruction moins l'adresse virtuelle de l'opérande.
     
    8. Procédé selon la revendication 7, dans lequel la génération comprend de l'addition du décalage à la valeur absolue de l'instruction ou la soustraction du décalage de l'adresse absolue de l'instruction afin d'obtenir l'adresse absolue de l'opérande.
     
    9. Procédé selon la revendication 1, dans lequel la détermination est effectuée par un émulateur (210) s'exécutant à l'intérieur d'un processeur (102).
     
    10. Système comprenant des moyens adaptés pour mettre en oeuvre toutes les étapes du procédé selon l'une quelconque des revendications de procédé précédentes.
     
    11. Programme informatique comprenant des instructions pour mettre en oeuvre toutes les étapes du procédé selon l'une quelconque des revendications de procédé précédentes, lorsque ledit programme informatique est exécuté sur un système informatique.
     




    Drawing

    REFERENCES CITED IN THE DESCRIPTION



    This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

    Patent documents cited in the description