So How Fast is It?
Although the McKinley has previously been described as running as fast as 1.2 GHz, it appears that Intel and HP are downplaying clock frequency: they uniformly described it as a 1.0 GHz device in all related presentations. Nevertheless, from the information disclosed, it appears there could eventually be commercially useful yield at 1.2 GHz. This would seem to substantiate rumors that McKinley will be offered in two speed grades, 1.0 GHz and something higher, either 1.1 or 1.2 GHz. McKinley consumes 130 W at a nominal 1.5 V supply voltage even at 1.0 GHz, so the hypothetical faster speed grade may in practice be limited to high end server applications where more costly cooling solutions can be used to accommodate proportionally higher power consumption (faster parts may require a supply voltage higher than 1.5 V).
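As a rough illustration of why the faster speed grade raises cooling concerns, the sketch below scales the 130 W figure using the textbook approximation that switching power is proportional to frequency times the square of supply voltage. This is an assumption made for illustration, not something disclosed by Intel or HP: it ignores leakage and treats all 130 W as switching power. The 1.6 V value is purely hypothetical.

```python
# Rough estimate only: assumes power scales as f * V^2 (switching power),
# ignoring leakage. Baseline figures are the 130 W, 1.0 GHz, 1.5 V from the text.
BASE_WATTS = 130.0
BASE_GHZ = 1.0
BASE_VOLTS = 1.5


def scaled_power(ghz, volts, base_watts=BASE_WATTS,
                 base_ghz=BASE_GHZ, base_volts=BASE_VOLTS):
    """Scale baseline power by the frequency ratio and the squared voltage ratio."""
    return base_watts * (ghz / base_ghz) * (volts / base_volts) ** 2


# 1.2 GHz at the same nominal 1.5 V supply.
same_voltage = scaled_power(1.2, 1.5)

# 1.2 GHz at a hypothetical 1.6 V supply (illustrative value, not a disclosed spec).
raised_voltage = scaled_power(1.2, 1.6)

print(f"1.2 GHz @ 1.5 V: about {same_voltage:.0f} W")
print(f"1.2 GHz @ 1.6 V: about {raised_voltage:.0f} W")
```

Under these assumptions a 1.2 GHz part would draw roughly 20% more power than the 1.0 GHz part at the same voltage, and more still if the supply voltage had to rise.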
Intel and HP presenters were quite forthcoming with detailed information about specific aspects of McKinley’s microarchitectural, logical, physical, and circuit design. They even discussed how individual features improved its performance, presumably relative to Merced. However, they were entirely silent on the question of its absolute performance except for one oblique and ambiguous reference to the achievement of 0.75 SPECint2k per MHz in a figure in paper 20.6. The same figure (20.6.2) also described the Merced as achieving 0.50 SPECint2k per MHz. This is consistent with reports that the McKinley achieves 1.5x Merced performance at the same clock frequency when running binaries compiled for Merced. This advantage is said to stretch to about 1.7x when code is recompiled with optimization specific to McKinley. This second figure is also consistent with the reported 40% power drop achieved when the McKinley enters single bundle issue mode during thermal throttling. Scaling these per clock figures from Merced's 800 MHz to McKinley's 1.0 GHz gives about 1.9x and 2.1x Merced performance respectively, values that straddle the prediction of twice Merced performance made publicly more than three years ago.
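The scaling arithmetic can be checked with a short sketch. It assumes the comparison is against an 800 MHz Merced, its top shipping speed grade; the per MHz figures and the 1.7x recompiled ratio are the ones quoted above, and the variable names are only illustrative.

```python
# Per-MHz SPECint2k figures quoted from figure 20.6.2.
MERCED_PER_MHZ = 0.50
MCKINLEY_PER_MHZ = 0.75

# Clock frequencies in MHz. The 800 MHz Merced baseline is an assumption
# (its top speed grade); 1000 MHz is the McKinley figure used in the presentations.
MERCED_MHZ = 800.0
MCKINLEY_MHZ = 1000.0


def scaled_speedup(per_clock_ratio, new_mhz=MCKINLEY_MHZ, old_mhz=MERCED_MHZ):
    """Combine a same-clock performance ratio with the clock frequency ratio."""
    return per_clock_ratio * new_mhz / old_mhz


# Same-clock advantage on existing Merced binaries: 0.75 / 0.50.
per_clock = MCKINLEY_PER_MHZ / MERCED_PER_MHZ

existing_binaries = scaled_speedup(per_clock)  # code compiled for Merced
recompiled = scaled_speedup(1.7)               # reported ratio after recompiling

print(f"per clock ratio:   {per_clock:.2f}x")
print(f"existing binaries: {existing_binaries:.3f}x")
print(f"recompiled code:   {recompiled:.3f}x")
```

The two results, 1.875x and 2.125x, round to the 1.9x and 2.1x values given in the text.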
As a minor footnote, the x86 performance of McKinley was ambiguously described as “respectable”. It is not known if the IA32 decode and control block in the Merced was redesigned for McKinley. No doubt the newer MPU’s x86 performance would improve a lot as the serendipitous side effect of a faster clocked and more efficient IA64 execution core. However, the x86 performance of Merced is so notoriously poor that it seems inescapable that a complete overhaul of its IA32 compatibility architecture would be needed to bring McKinley’s x86 performance up to a level that most PC users would consider respectable. Whether that was actually done or not remains to be seen.