Z900: Bringing the Mainframe into the Server Age
Besides its formidable RISC-based POWER4 design, IBM also described its latest CMOS mainframe MPU. The modern IBM mainframe has its roots in the System/360, a bold technological gamble IBM took in the early 1960s and won in grand fashion. Up to that point, essentially every new computer had its own unique instruction set. With the S/360, IBM revolutionized the industry with the now well-accepted notion of a family of processors sharing the same instruction set but varying in implementation, price, and performance. In terms of commercial success, dominance, and imitation by others, the S/360 architecture could be considered the x86 of its day.
The IBM mainframe ISA has had several major facelifts since its introduction 37 years ago, starting with the addition of virtual memory capabilities with the S/370 in 1972 and 31-bit addressability with the S/370-XA in 1984 (the S/360 and S/370 used 24-bit addressing in a fashion similar to the Motorola 68000). The latest major change, disclosed last October, is even more dramatic. The Z900 (briefly known as S/390) extends the S/370’s architecture to 64 bits while retaining a 31-bit compatibility mode for legacy software. In addition, the Z900 supports the IEEE-754 floating-point format as well as IBM’s native base-16 format.
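The addressing history above comes down to how many low-order bits of a register value are treated as an address. The following is a minimal sketch of that idea, assuming a simple mask model; the names effective_address and addressable_bytes are hypothetical and do not correspond to any IBM interface, and real address generation involves far more than a mask.

```python
# Illustrative model only: each addressing mode honors a fixed number of
# low-order address bits and ignores anything above them.
ADDRESS_BITS = {"24-bit": 24, "31-bit": 31, "64-bit": 64}


def effective_address(register_value: int, mode: str) -> int:
    """Return the part of register_value that the given mode uses as an address."""
    mask = (1 << ADDRESS_BITS[mode]) - 1
    return register_value & mask


def addressable_bytes(mode: str) -> int:
    """Return the size of the address space reachable in the given mode."""
    return 1 << ADDRESS_BITS[mode]


if __name__ == "__main__":
    value = 0xFF123456
    for mode in ADDRESS_BITS:
        print(mode, hex(effective_address(value, mode)), addressable_bytes(mode))
```

Under this model a 24-bit program can reach 16 MB and a 31-bit program 2 GB, which is why each widening of the address was a major architectural event.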
The first realization of the Z900 ISA is a 34 million transistor device implemented on a 177 mm² die in a 0.18 µm bulk CMOS process employing seven levels of copper interconnect. Like the POWER4, the Z900 device basically implements two processors on one die. But while the extra processor in POWER4 is meant to run independently and increase system throughput, the two processors in Z900 run the same code in lock step and share L1 caches and other logic. Special logic on the chip, called the checkpoint unit, continuously compares the state and computational results of both CPUs. If a mismatch is detected, the two processors are halted, the computational state of both is restored from the last saved known-good state, and the processors are restarted. The floorplan of the Z900 is shown in Figure 2.
Figure 2 Floorplan of the IBM Z900
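The compare-and-retry behavior of the lock-stepped processors can be illustrated in software. This is only a sketch under stated assumptions: the state is a small dictionary, comparison happens after every instruction (the article does not give the hardware's actual granularity), and the fault_at parameter is a hypothetical fault injector added purely for demonstration.

```python
import copy
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, int]
Instruction = Callable[[State], None]


def add(n: int) -> Instruction:
    """Build a toy instruction that adds n to the 'acc' register."""
    def op(state: State) -> None:
        state["acc"] += n
    return op


def run_lockstep(program: List[Instruction], initial: State,
                 fault_at: Optional[int] = None) -> Tuple[State, int]:
    """Run program on two duplicate cores; return (final state, recoveries)."""
    core_a = copy.deepcopy(initial)
    core_b = copy.deepcopy(initial)
    checkpoint = copy.deepcopy(initial)  # last saved known-good state
    recoveries = 0
    pc = 0
    while pc < len(program):
        program[pc](core_a)
        program[pc](core_b)
        if fault_at == pc:
            core_b["acc"] ^= 1  # simulated transient bit flip in one core
            fault_at = None     # the fault strikes only once
        if core_a != core_b:
            # Mismatch: discard both results, restore, and retry this instruction.
            core_a = copy.deepcopy(checkpoint)
            core_b = copy.deepcopy(checkpoint)
            recoveries += 1
            continue
        checkpoint = copy.deepcopy(core_a)  # both cores agree, so commit
        pc += 1
    return core_a, recoveries


if __name__ == "__main__":
    program = [add(5), add(7), add(30)]
    print(run_lockstep(program, {"acc": 0}, fault_at=1))
```

The point of the design is that a transient error costs only a retry: the program's final result is the same with or without the injected fault, and only the recovery count differs.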
The entire Z900 device burns a modest 44 W of power at 1.1 GHz because the two processors are in-order, single-issue designs with a basic execution pipeline seven stages in length. Systems based on the Z900 are already shipping, albeit at a clock frequency of 770 MHz. The basic processor design is less aggressive than even many modern high-end embedded control MPUs, let alone desktop and server x86 and RISC devices. However, mainframe-class machines have long ceased to be competitive on a raw CPU power basis and are instead purchased for their software base, reliability, and I/O throughput.
But the mainframe CPU may yet be taken to a new level. Another IBM paper, entitled “A 1.8 GHz Instruction Window Buffer”, appears to lay the groundwork for an out-of-order, superscalar mainframe processor. The paper describes a logic block that includes register renaming logic, a reservation station, and a reorder buffer. It supports four-way superscalar execution with up to 64 instructions in flight. It is ostensibly designed for use in a machine with three integer execution units and one FP unit. The instruction window buffer was implemented in a test chip manufactured in a 0.18 µm bulk CMOS process with seven levels of copper interconnect and reportedly operates at up to 1.8 GHz. A four-way superscalar Z900 device with out-of-order execution running at well over 1.0 GHz would certainly be a competitive MPU design. It is also a clear sign that IBM is very serious about retaining and even growing its relatively small but lucrative mainframe business.
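To make the role of the reorder buffer concrete, here is a minimal software sketch of in-order retirement from a 64-entry window. It is not a model of IBM's circuit: the ReorderBuffer class and its method names are hypothetical, and the retire width of four per cycle is an assumption chosen to match the four-way figure quoted above.

```python
from collections import deque
from typing import Deque, List


class ReorderBuffer:
    """Toy reorder buffer: out-of-order completion, in-order retirement."""

    def __init__(self, capacity: int = 64, retire_width: int = 4) -> None:
        self.capacity = capacity          # maximum instructions in flight
        self.retire_width = retire_width  # assumed retirements per cycle
        self.entries: Deque[list] = deque()  # program order: [tag, done]

    def dispatch(self, tag: str) -> bool:
        """Add an instruction in program order; False means the window is full."""
        if len(self.entries) >= self.capacity:
            return False
        self.entries.append([tag, False])
        return True

    def complete(self, tag: str) -> None:
        """Mark an in-flight instruction as finished executing, in any order."""
        for entry in self.entries:
            if entry[0] == tag:
                entry[1] = True
                return
        raise KeyError(tag)

    def retire(self) -> List[str]:
        """Retire finished instructions from the head, oldest first."""
        retired: List[str] = []
        while (self.entries and self.entries[0][1]
               and len(retired) < self.retire_width):
            retired.append(self.entries.popleft()[0])
        return retired


if __name__ == "__main__":
    rob = ReorderBuffer()
    for tag in ("i0", "i1", "i2", "i3"):
        rob.dispatch(tag)
    rob.complete("i2")
    rob.complete("i1")
    print(rob.retire())  # nothing retires while the oldest entry is unfinished
    rob.complete("i0")
    print(rob.retire())
```

Instructions may finish in any order, but none leaves the buffer until everything older has finished, which is what lets an out-of-order machine present precise, in-order architectural state.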