SPECulating on the Performance of the POWER6
Ultimately, the real question is how the POWER6 will perform. Unfortunately, this far from commercial release, it is very hard to say. While frequency targets for the POWER6 are known and there has been first silicon, it is impossible to say what speeds the shipping chip will reach. Moreover, the POWER6 microarchitecture is quite different from the POWER4/5 and has yet to be fully disclosed. Worse yet, the system architecture is entirely unknown. This makes commercial benchmarks like TPC-C, SAP SD and SPECjbb2005 nearly impossible to predict, as they are all highly dependent on the system and software used. SPECcpu2000, however, should be far less susceptible to distortion from system architecture or software, although compilers will still make a large difference.
The MPU most similar to the POWER6 is the PA6T-1682M, which is a quad issue, triple execute, fully out-of-order (OOO) core, as described in a previous article. In comparison, the POWER6 is slightly wider and has a much more aggressive cache hierarchy. The PA6T-1682M microarchitecture is optimized for a system on a chip; its L2 cache is accessed through a crossbar mechanism and has a higher latency than the POWER6's L2 is expected to have. Based on these facts, it seems reasonable to guess that the POWER6 will have equal or slightly better IPC than the PA6T-1682M, while running at about twice the clock speed.
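The reasoning above amounts to a first-order scaling rule: take a baseline score and multiply it by the relative IPC and the relative clock speed. The sketch below illustrates only that arithmetic. The function name, the baseline value, and the 1.1 figure standing in for "slightly better" IPC are all hypothetical placeholders, not published results or IBM figures.

```python
def estimate_score(baseline_score: float, ipc_ratio: float, freq_ratio: float) -> float:
    """Scale a baseline benchmark score by relative IPC and relative clock speed.

    First-order model only: it assumes the score scales linearly with both
    factors, which ignores effects such as memory latency not improving
    along with core frequency.
    """
    if baseline_score <= 0 or ipc_ratio <= 0 or freq_ratio <= 0:
        raise ValueError("all inputs must be positive")
    return baseline_score * ipc_ratio * freq_ratio


# Hypothetical placeholder baseline, NOT a measured SPEC result.
baseline = 1000.0

# "Equal IPC" and "slightly better IPC" (1.1 is an assumed stand-in),
# both at roughly twice the clock speed.
low = estimate_score(baseline, ipc_ratio=1.0, freq_ratio=2.0)
high = estimate_score(baseline, ipc_ratio=1.1, freq_ratio=2.0)

print(f"estimated range: {low:.0f} to {high:.0f}")
```

Linear scaling with frequency is optimistic, since main memory latency measured in core cycles grows as the clock rises, so a range produced this way is better read as an upper bound than as a prediction.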
Combining these IPC assumptions with IBM’s stated frequency targets gives the following estimates for SPECint and SPECfp (these are base, not peak numbers):
Table 1 – Performance Estimates for the POWER6
These numbers are fairly consistent with the vaguer performance claims that IBM has made. In several presentations the company has claimed to roughly double performance; however, claims like this leave quite a bit of maneuvering room. The more server-centric performance numbers like TPC-C and SAP SD will have to wait until the POWER6 launch. But the most interesting open question is how the POWER6 will perform in mainframes. zSeries systems shuffled off the microprocessor performance curve a long time ago, primarily because the strength of mainframes lies not in the CPU but in the I/O architecture, backwards compatibility, virtualization and system architecture. If IBM were to allow code to be ported from zArch to PPC and then run on these new mainframes, it could expose a very significant performance benefit to end users.