
Long mode

From Wikipedia, the free encyclopedia

Revision as of 16:25, 30 December 2008

In the x86-64 computer architecture, long mode is the mode where a 64-bit application (or operating system) can access the 64-bit instructions and registers, while 32-bit and 16-bit programs are executed in a compatibility sub-mode.

Overview

An x86-64 processor behaves identically to an IA-32 processor when running in real mode or protected mode, which are the supported sub-modes when the processor is not in long mode.

A bit in the CPUID extended attributes field tells programs running in real or protected mode whether the processor can enter long mode, which allows a program to detect an x86-64 processor. This is similar to the CPUID attributes bit that Intel IA-64 processors provide so that programs can detect that they are running under IA-32 emulation.
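The detection step can be sketched as follows. On x86-64 processors the long mode flag is reported in bit 29 of EDX by CPUID extended function 0x80000001, and function 0x80000000 returns the highest supported extended function in EAX. Python cannot execute CPUID itself, so this minimal sketch assumes the caller has already obtained the two register values by some other means; the function and parameter names are illustrative only.

```python
# Sketch only: interprets CPUID register values that were read elsewhere.
EXTENDED_FEATURES_LEAF = 0x80000001  # extended function that reports the LM flag
LONG_MODE_BIT = 29                   # bit position of the LM flag in EDX


def supports_long_mode(max_extended_leaf: int, edx_features: int) -> bool:
    """Return True if the reported CPUID values indicate long mode support.

    max_extended_leaf: EAX returned by CPUID function 0x80000000.
    edx_features:      EDX returned by CPUID function 0x80000001.
    """
    # The extended feature leaf must exist before its bits mean anything.
    if max_extended_leaf < EXTENDED_FEATURES_LEAF:
        return False
    return bool((edx_features >> LONG_MODE_BIT) & 1)
```

A real program would run this check from native code (for example, a compiler intrinsic or inline assembly) before attempting to switch the processor into long mode.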

Memory Limitations

While register sizes have increased to 64 bits from the previous x86 architecture, physical memory addressing has not (yet) been fully widened. This is an intended limitation of some of the current implementations of the x86-64 architecture, because no machine today would benefit from 64-bit physical addressing: filling such an address space would require 16 exbibytes of RAM (millions of tebibytes), and current CPUs are not going to see such amounts of memory within their lifetimes. Therefore, to reduce manufacturing cost, current CPUs implement fewer than 64 address bits. Note that these are current limits, which can (and are very likely to) change with the release of new CPU microarchitectures. Consequently, as long as physical memory is limited, there is little use in having a full 64-bit virtual address space, so the latter is also limited to save cost. Concretely, the places in the CPU where cost can be saved are the load/store unit(s), the size of the cache tags, and the size and complexity of the MMU and TLBs.

Current Limits

The first CPUs implementing the x86-64 architecture, namely the AMD Athlon 64 / Opteron (K8) CPUs, had 48-bit virtual and 40-bit physical addressing. From the K10 (Barcelona etc.) on, physical addressing has been increased to 48 bits as well. This might sound as if these processors are not fully 64-bit capable, a claim that is hard to answer precisely. Note that these limitations are microarchitectural in nature, i.e. they are due to the current implementation. From the point of view of the instruction set architecture, these CPUs are pure 64-bit: all registers, data buses, ALUs etc. are fully 64 bits wide.

For example, when the K8 calculates an address in its AGU (address generation unit) and stores the result back in a register, these operations are fully 64-bit. Only when it comes to using addresses, i.e. loading or storing data from or to memory (or cache), do these (current) limitations come in. Current CPUs can use only part of the huge 64-bit address space. This is true for both the virtual and the physical limit. For example, if software on a K8 or K10 tries to access a virtual address whose upper 16 bits are not all copies of bit 47 (a non-canonical address), the CPU raises a general protection (GP) fault. This prevents the misuse of the upper 16 bits of an address as a place to store data (for addresses smaller than 2^47, the topmost 16 bits are always zero). Otherwise, if a new microarchitecture were released that increased the size of the virtual address space beyond 48 bits, software that runs on current microarchitectures might stop working on these new machines.
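The address check described above can be illustrated with a small sketch. The AMD64 architecture calls an address canonical when all bits above the implemented virtual width are copies of the highest implemented bit (bit 47 for a 48-bit implementation); anything else faults on access. The function below is a software model of that rule, not processor code, and its name and `virtual_bits` parameter are illustrative assumptions.

```python
VIRTUAL_BITS = 48  # implemented virtual address width on K8 and K10


def is_canonical(addr: int, virtual_bits: int = VIRTUAL_BITS) -> bool:
    """Model the canonical-address rule for a 64-bit virtual address."""
    addr &= (1 << 64) - 1  # treat the value as an unsigned 64-bit quantity
    # Keep the top implemented bit together with all unimplemented bits above it.
    upper = addr >> (virtual_bits - 1)
    all_ones = (1 << (64 - virtual_bits + 1)) - 1
    # Canonical means these bits are either all zero or all one.
    return upper == 0 or upper == all_ones
```

For example, `is_canonical(0x00007FFFFFFFFFFF)` and `is_canonical(0xFFFF800000000000)` both hold, while a pointer with data smuggled into its top byte, such as `0xAB00000000001000`, is rejected. Passing a larger `virtual_bits` shows why such tagged pointers would break on a wider implementation.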

The physical memory limit determines how much RAM can be installed in the machine. On a cc-NUMA multiprocessor system (Opteron), this includes the memory installed in remote nodes, because the CPUs can directly address (and cache) all memory regardless of whether it is on the home node or a remote one. The 1 TiB limit (40 bits) for physical memory on the K8 is still huge, but might have been a limit for use in supercomputers. Consequently, the new K10 microarchitecture has a 48-bit physical memory limit (256 TiB).
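The sizes quoted above follow directly from powers of two. As a quick check, the sketch below converts an address width in bits into a byte count and prints it with binary prefixes; the helper names are illustrative, and the formatter is only meant for exact powers of 1024 multiples such as these.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def address_space_bytes(bits: int) -> int:
    """Number of bytes addressable with the given number of address bits."""
    return 1 << bits


def format_binary_size(size: int) -> str:
    """Format a byte count using the largest binary prefix that divides it evenly."""
    index = 0
    while size and size % 1024 == 0 and index < len(UNITS) - 1:
        size //= 1024
        index += 1
    return f"{size} {UNITS[index]}"


for bits in (40, 48, 64):
    print(bits, "bits:", format_binary_size(address_space_bytes(bits)))
```

The ratio `address_space_bytes(64) // address_space_bytes(40)` is 16,777,216, which is the "millions of tebibytes" a full 64-bit physical address space would amount to.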

When there is need, the microarchitecture can be expanded step by step without side effects for software, while the narrower implementation saves cost in the meantime. For future expansion, the architecture supports extending physical addressing to 52 bits (limited by the page table entry format), which would allow the processor to access 2^52 bytes, or 4 pebibytes.

See also