Incredible Claims From Elbrus
The most controversial aspect of Elbrus’s paper processor design is the performance and device characteristics claimed on behalf of a theoretical implementation in an unnamed 0.18 um, six-metal-layer CMOS process. Elbrus claims that the E2k can be implemented in a 126 mm² device incorporating 28 million transistors and dissipating 35 Watts at 1.2 GHz. The performance at 1.2 GHz is claimed to be a mind-boggling 135 SPECint95 and 350 SPECfp95. To put this into proper perspective, the highest estimated performance for a 0.18 um processor claimed by anyone else is 60 SPECint95 and 110 SPECfp95 at 1.0 GHz, for the Alpha processor disclosed by Compaq at ISSCC 2000. Perhaps a more accurate apples-to-apples comparison could be made between the E2k and another 0.18 um six-issue EPIC processor, Merced/Itanium. Even Intel cheerleader MPR, at last account, estimates that the Merced will ‘only’ achieve 50 SPECint95 and 80 SPECfp95 at a leisurely 800 MHz clock rate. So Elbrus is effectively claiming that the E2k and its compilers will achieve between roughly 2.3 and 4.4 times the performance of the two fastest MPUs disclosed in the Western world, in the same feature size process.
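The size of the gap is easier to judge with the ratios worked out. The following is a minimal sketch that uses only the claimed and estimated figures quoted above (none of them are measured results). The table layout and the `ratio` helper are illustrative names, and the per-clock column is an added normalization that simply divides out the difference in clock rate.

```python
# SPEC95 figures quoted in the text: claims and estimates, not measurements.
CLAIMS = {
    "E2k":    {"mhz": 1200, "int": 135, "fp": 350},
    "Alpha":  {"mhz": 1000, "int": 60,  "fp": 110},
    "Merced": {"mhz": 800,  "int": 50,  "fp": 80},
}


def ratio(metric, rival, per_clock=False):
    """Return the E2k score divided by the rival's score.

    With per_clock=True, both scores are first divided by their clock
    rate, so the result compares claimed work done per cycle.
    """
    e2k, other = CLAIMS["E2k"], CLAIMS[rival]
    r = e2k[metric] / other[metric]
    if per_clock:
        r *= other["mhz"] / e2k["mhz"]
    return r


if __name__ == "__main__":
    for rival in ("Alpha", "Merced"):
        for metric in ("int", "fp"):
            raw = ratio(metric, rival)
            clk = ratio(metric, rival, per_clock=True)
            print(f"{rival:6s} SPEC{metric}95: {raw:.2f}x raw, {clk:.2f}x per clock")
```

The raw ratios span 2.25x (SPECint95 against Alpha) to about 4.4x (SPECfp95 against Merced). Even after dividing out the E2k's higher claimed clock rate, the claim still implies roughly 1.8x to 2.9x more work per cycle than either rival.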
The datapath of the E2k is roughly equivalent to taking the EV6’s integer datapath, increasing it from 2 to 3 integer units per cluster, and then adding two copies of the EV6’s two FP functional units to each cluster. The bus length within the register file increases by (256 regs / 80 regs) * (20 ports / 10 ports), or a factor of about 6.4, while the wire length over the functional units increases by roughly 4.5x. In addition, the fan-in of the bypass multiplexors in front of each functional unit increases by 50%. Clearly, in equivalent terms, the E2k datapath is much longer and more heavily loaded than the EV6’s. In fact, one reason why the Alpha designers clustered the 4 integer units in the EV6 into two physically distinct datapaths in the first place was to avoid having buses run the length of four functional units.
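The register file estimate above can be reproduced in a few lines. This is only a sketch of the first-order scaling rule used in the text, which treats bus length as growing in proportion to both the number of registers and the number of ports. The function name is hypothetical, and the rule is not a layout model.

```python
def regfile_bus_scale(regs_new, regs_old, ports_new, ports_old):
    """First-order scaling of register file bus length.

    Assumes, as the text does, that bus length grows linearly with
    both the register count and the port count.
    """
    return (regs_new / regs_old) * (ports_new / ports_old)


if __name__ == "__main__":
    # E2k figures (256 registers, 20 ports) against the EV6 figures
    # used in the text (80 registers, 10 ports).
    scale = regfile_bus_scale(256, 80, 20, 10)
    print(f"Register file bus length grows by about {scale:.1f}x")
```

The two factors are 3.2x from the larger register count and 2x from the doubled port count, which multiply to the 6.4x quoted above.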
The E2k runs buses over 7 major functional units (3 integer, 2 FP add, and 2 FP multiply) per cluster. The recently disclosed prototype hybrid 0.25/0.18 um EV68 was shown to operate above 1 GHz, with 1.1 GHz in reach. A pure 0.18 um EV6 would likely run faster than 1.2 GHz and might approach 1.5 GHz. I doubt a 0.18 um E2k would ever approach 1 GHz. It is more likely to fall under the 733 to 800 MHz range targeted for the 0.18 um Merced/Itanium. Compared to the E2k, the physical design of the Merced enjoys the benefit of separate integer and FP datapaths, and separate 128-entry integer and FP register files (14 and 10 ports respectively). Reportedly, Intel is having great difficulty achieving even its 733 MHz target for Merced. The chance of the E2k reaching 1.2 GHz is remote, to say the least.