ISSCC 2005: The CELL Microprocessor

Floating Point Capability

As described previously, the prototype CELL processor’s claim to fame is its ability to sustain a high throughput of floating point operations. The peak rating of 256 GFlops for the prototype CELL processor is unmatched by any other device announced to date. However, the SPEs are designed for speed rather than accuracy, and the 8 floating point operations per cycle per SPE are single precision (SP) operations. Moreover, these SP operations are not fully IEEE 754 compliant in terms of rounding modes: the SP FPU in the SPE rounds toward zero (truncation). In this respect, the CELL processor reveals its roots in Sony’s Emotion Engine, whose single precision FPU likewise eschewed rounding mode trivialities for speed. Unlike the Emotion Engine, the SPE contains a double precision (DP) unit. According to IBM, the SPE’s double precision unit is fully IEEE 754 compliant. This is a significant improvement, as it allows the SPE to handle applications that require DP arithmetic, which was not possible on the Emotion Engine.
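To make the rounding difference concrete, the following is a minimal sketch that emulates in software the difference between the IEEE 754 default (round to nearest, ties to even) and round toward zero when a value is stored in single precision. It is not SPE code and makes no claim about how the hardware is implemented; the function names are hypothetical, and the sketch assumes finite inputs within single precision range.

```python
import struct


def to_f32_nearest(x: float) -> float:
    """Round a Python float (a double) to single precision.

    struct's 'f' format performs the conversion with the default
    round-to-nearest, ties-to-even behaviour.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_f32_toward_zero(x: float) -> float:
    """Round a double to single precision by truncation (toward zero).

    Hypothetical helper for illustration: start from the nearest single
    precision value and, if it overshot in magnitude, step back one ULP.
    """
    nearest = to_f32_nearest(x)
    if abs(nearest) <= abs(x):
        return nearest  # nearest value is already at or below |x|
    # Single precision is sign-magnitude, so decrementing the raw bit
    # pattern of a nonzero value moves it one ULP toward zero.
    bits = struct.unpack("<I", struct.pack("<f", nearest))[0]
    return struct.unpack("<f", struct.pack("<I", bits - 1))[0]


if __name__ == "__main__":
    x = 0.1  # not exactly representable in binary floating point
    print("nearest    :", repr(to_f32_nearest(x)))
    print("toward zero:", repr(to_f32_toward_zero(x)))
```

For 0.1, the nearest single precision value lies slightly above the input, while the truncated value lies slightly below it, one unit in the last place away. Truncation never increases the magnitude of a result, so its errors all point the same way instead of tending to cancel. That matters little for graphics and media workloads but can matter for long-running numerical computation.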

Naturally, nothing comes for free, and the cost of computation using the DP FPU is performance. Since each DP computation requires multiple passes through the same FPU resources, peak DP FP throughput is substantially lower than peak SP FP throughput. IBM's estimate at ISSCC 2005 was that DP FP computation in the SPE has an approximately 10:1 throughput disadvantage compared to SP FP computation. Given this estimate, the peak DP FP throughput of an 8 SPE CELL processor is approximately 25 to 30 GFlops when the DP FP capability of the PPE is also taken into consideration. In comparison, Earth Simulator, the machine that previously held the honor of being the world’s fastest supercomputer, uses a variant of NEC’s SX-5 CPU (0.15 µm, 500 MHz) and achieves a rating of 8 GFlops per CPU. Clearly, the CELL processor contains enough compute power to present itself as a serious competitor not only in the multimedia-entertainment industry, but also in the scientific community that covets DP FP performance. That is, if the non-trivial challenges presented by the programming model of the CELL processor can be overcome, the CELL processor may be a serious competitor in applications that its predecessor, the Emotion Engine, could not cover.
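The arithmetic behind these figures can be sketched as follows. The 4 GHz clock is an assumption inferred from the quoted numbers (256 GFlops divided by 8 SPEs and 8 operations per cycle); the PPE's DP contribution is not quantified in the text, so it is left as a hypothetical parameter that defaults to zero.

```python
# Back-of-the-envelope peak throughput using the figures quoted in the text.
SPE_COUNT = 8
SP_FLOPS_PER_CYCLE_PER_SPE = 8
CLOCK_GHZ = 4.0  # assumption: implied by 256 GFlops / (8 SPEs * 8 flops/cycle)
DP_PENALTY = 10.0  # approximate SP:DP throughput ratio quoted by IBM
EARTH_SIMULATOR_GFLOPS_PER_CPU = 8.0


def sp_peak_gflops() -> float:
    """Peak single precision throughput of all SPEs, in GFlops."""
    return SPE_COUNT * SP_FLOPS_PER_CYCLE_PER_SPE * CLOCK_GHZ


def dp_peak_gflops(ppe_dp_gflops: float = 0.0) -> float:
    """Peak double precision throughput, in GFlops.

    ppe_dp_gflops is a hypothetical placeholder for the PPE's share,
    which the text mentions but does not quantify.
    """
    return sp_peak_gflops() / DP_PENALTY + ppe_dp_gflops


if __name__ == "__main__":
    dp = dp_peak_gflops()
    print(f"SP peak (SPEs only): {sp_peak_gflops():.1f} GFlops")
    print(f"DP peak (SPEs only): {dp:.1f} GFlops")
    ratio = dp / EARTH_SIMULATOR_GFLOPS_PER_CPU
    print(f"DP peak vs. one Earth Simulator CPU: {ratio:.1f}x")
```

Under these assumptions the SPEs alone account for roughly 25.6 GFlops of DP throughput, so the quoted range of 25 to 30 GFlops leaves at most a few GFlops for the PPE. These are peak ratings; sustained performance depends on the programming model challenges noted above.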

