Sandy Bridge’s Missing SPECs
When Sandy Bridge launched earlier this year, Intel and their system partners had a plethora of benchmark numbers to accompany the new hardware. Many of the early benchmarks focused on consumer workloads for notebooks and desktops. This is reasonable, as those are the initial markets for Sandy Bridge, and it also highlights many of the processor’s greatest strengths. Without a doubt, the biggest change in the Sandy Bridge architecture is the graphics and system level integration. So naturally many of the benchmarks highlighted these new improvements.
Strangely enough, at launch there were no SPECcpu_2006 numbers for Sandy Bridge. SPECcpu is the definitive benchmark for CPU performance and has been a staple of the industry for nearly 20 years, so the omission was glaring indeed. While some vendors have avoided submitting benchmark results that do not show their products in the most flattering light, that was not an issue here. Intel’s Sandy Bridge microarchitecture is a significant improvement over the previous generation and should fare quite well against the competition. Intel indicated that their benchmarking teams did not have the time and resources to finalize a submission prior to launch, although this seems somewhat curious. Launching a new CPU design without SPECcpu numbers is a bit like showing up at a wedding in a tuxedo, but missing cuff links and studs.
Of course, after the Sandy Bridge launch the Cougar Point chipset bug was announced, which delayed hardware for a month or two. One of the great advantages of industry standard benchmarks like SPECcpu is that they strictly require that hardware be available within 3 months of a submission, to avoid unfair comparisons and gimmickry. In Sandy Bridge’s case, this resulted in further delays due to the uncertainty of when the bug-free B3 chipsets would be considered ‘widely available’. Ironically, the chipset bug was in the SATA interfaces and had no impact on CPU performance whatsoever.
The long-awaited day when we can judge the improvements in Sandy Bridge is nearly at hand, though. Intel has finally provided SPECcpu estimates for Sandy Bridge. To be clear, these estimates are not final results, but they are likely to be nearly identical to the actual SPECcpu_2006 performance numbers. One of the reasons that SPECcpu is so important is that it is a cross-platform collection of over a dozen workloads, ranging from compiler benchmarks to scientific applications, and represents a breadth that is hard to match. Experienced analysts can often predict the performance of their workloads by examining some of the SPECcpu sub-tests. That being said, it is not intended to be fully comprehensive; other benchmarks are needed to complement it and measure performance for Java, databases, games, etc. But of CPU benchmarks, SPECcpu_2006 is generally the best and most useful for assessing the landscape – and one keenly followed by most architects.
For this article, we compare several key characteristics and estimated performance for 4 models of Sandy Bridge against a variety of existing microprocessors and platforms. For existing hardware, we exclusively rely on submissions to the SPECcpu_2006 database, selected for highest absolute performance. One frustrating difficulty is that AMD and their partners have never submitted any SPECcpu_rate results for a Phenom microprocessor, presumably because the performance would make for an unattractive comparison. Intel has submitted Phenom results, but it is unclear whether those are truly optimal, so they should not be relied upon. AMD’s lack of participation is childish and does partners, customers and the industry a disservice – if a vendor only submits benchmarks where its products win, then by definition those benchmarks have very little value. Even Sun has submitted results for their Niagara line, despite its exclusive focus on server throughput rather than CPU performance. In place of AMD client processors, we have substituted nearly identical 6-core AMD server CPUs, which run slightly slower (2.8GHz vs. 3.3GHz). We also included two versions of Intel’s previous generation Westmere desktop processors, a 3.33GHz, 6-core and a 3.2GHz, 4-core design.
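Since the substituted 6-core server parts run at 2.8GHz rather than the 3.3GHz of their client siblings, readers may want to mentally adjust the numbers. A minimal sketch of that adjustment, using a made-up score (the function name and the input value are our own illustration, not actual submissions):

```python
# Hypothetical illustration: estimating how a 3.3GHz client part might
# score, given a published result for its near-identical 2.8GHz server
# sibling. Naive linear frequency scaling is an upper bound, because
# memory latency and bandwidth do not speed up with the core clock.

def scale_by_frequency(score, base_ghz, target_ghz):
    """Scale a benchmark score linearly with core frequency."""
    return score * (target_ghz / base_ghz)

server_score = 20.0  # placeholder SPECint_2006-style result at 2.8GHz
estimate = scale_by_frequency(server_score, 2.8, 3.3)
print(f"Upper-bound estimate at 3.3GHz: {estimate:.1f}")
```

In practice the real uplift would land somewhat below this linear estimate, since memory-bound sub-tests scale poorly with frequency.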
While intended for utterly different markets and workloads, we also included several server CPUs to give a bit more context and flavor. From the x86 camp, this includes AMD’s 2.5GHz, 12-core Opteron and Intel’s 2.26GHz, 8-core Xeon (Nehalem-EX). The 3.55GHz 8-core POWER7 is used for IBM’s smaller servers, while the 4GHz model is from a 32-socket system and has far more memory bandwidth per chip. The Niagara T3 is from a single socket system, while the 4-core Itanium is from a 2-socket HP server. Note that the performance of all these server CPUs is effectively understated by SPECcpu_2006, because they all devote considerable complexity, area and power to the coherency and RAS needed for larger systems. As a rough proxy, the x86 server CPUs have roughly 20% lower per-core performance than their desktop (or small server) counterparts, and the penalty is likely larger for the POWER7 and Itanium processors, which are even more focused on scalability and reliability.
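The per-core proxy above is simple division: a SPECrate-style throughput score spread over the core count. A short sketch with invented numbers (chosen only to illustrate the arithmetic, not taken from any actual submission):

```python
# Hypothetical sketch of the per-core proxy: divide a SPECrate-style
# throughput score by the chip's core count, then compare desktop and
# server parts. All score values below are invented for illustration.

def per_core(rate_score, cores):
    """Average throughput contribution of a single core."""
    return rate_score / cores

desktop = per_core(120.0, 4)   # placeholder 4-core desktop result
server = per_core(190.0, 8)    # placeholder 8-core server result

deficit = 1 - server / desktop
print(f"Server per-core deficit: {deficit:.0%}")  # ~21% with these inputs
```

With these placeholder inputs the server part lands roughly 20% behind per core, matching the proxy described above; real submissions will of course vary by workload and memory configuration.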