The other significant change for Sandy Bridge-EP is the power control unit (PCU). While previous Intel server processors focused on chip-level power, Sandy Bridge-EP takes a holistic, system-level approach that is particularly interesting for cloud computing and dense data centers.
The PCU can target a specific limit and modulate both socket and memory power to stay within the envelope; incidentally, this is similar to the programmable TDP feature that AMD introduced with Bulldozer. At a rack or data center level, this can substantially reduce the thermal and power guard band, even with highly variable workloads. Intel’s Node Manager works with the PCU to monitor the entire system and can serve as the basis for rack or data center level power management.
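The control loop inside the PCU is not publicly documented, but the general idea of modulating frequency to hold a power envelope can be sketched as a simple feedback controller. Everything below is our own hypothetical illustration (the function name, the proportional gain, the frequency limits), not Intel’s actual algorithm:

```python
# Illustrative sketch of a power-cap control loop (NOT Intel's PCU
# algorithm): a proportional controller that nudges a frequency cap to
# keep measured socket + memory power under a configurable limit.

def step_power_cap(freq_mhz, measured_watts, limit_watts,
                   f_min=1200, f_max=3800, gain_mhz_per_watt=25):
    """Return a new frequency cap given the latest power reading.

    All constants here are made up for illustration only.
    """
    error = limit_watts - measured_watts            # positive = headroom
    new_freq = freq_mhz + gain_mhz_per_watt * error
    return max(f_min, min(f_max, new_freq))         # clamp to valid range
```

Run over budget, the controller pulls the cap down; with headroom, it raises the cap back toward the maximum turbo frequency, which is how a power cap converts unused guard band into performance.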
Within the processor, software can set a power policy for each thread. A 4-bit MSR field, called the Energy Performance Bias (EPB), indicates the desired balance between performance, efficiency and absolute power consumption. The EPB is used by Windows 7, Windows Server 2008 R2 and Linux, although not all OSes use the full 16 levels. The EPB also solves a particularly nasty problem with previous generations of power management. As our Westmere-EP review indicated, Windows actually disables turbo mode for ‘balanced’ power profiles because the PCU was too aggressive and reduced efficiency in some scenarios. The Sandy Bridge-EP PCU is more intelligent and uses the EPB to dynamically switch into a high performance mode that does use turbo, without negatively impacting efficiency. From a practical standpoint, there is no efficiency reason to disable turbo mode any more.
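On Linux, the EPB is visible as the architectural MSR IA32_ENERGY_PERF_BIAS (0x1B0), with the 4-bit field in bits 3:0, where 0 requests maximum performance and 15 maximum energy saving. The sketch below decodes the field and shows how it could be read through the kernel’s msr driver; the coarse policy labels are our own grouping for illustration, not Intel’s:

```python
import struct

IA32_ENERGY_PERF_BIAS = 0x1B0   # architectural MSR; EPB is bits 3:0

def decode_epb(msr_value):
    """Map the 4-bit EPB field onto coarse policy bands.
    0 = max performance, 15 = max energy saving; the band labels
    here are our own illustration, not an Intel definition."""
    epb = msr_value & 0xF
    if epb <= 3:
        return epb, "performance"
    if epb <= 9:
        return epb, "balanced"
    return epb, "power-save"

def read_epb(cpu=0):
    """Read the MSR via /dev/cpu/N/msr (requires root and the 'msr'
    kernel module loaded); illustrative only."""
    with open(f"/dev/cpu/{cpu}/msr", "rb") as f:
        f.seek(IA32_ENERGY_PERF_BIAS)
        (value,) = struct.unpack("<Q", f.read(8))
    return decode_epb(value)
```

Linux’s default value sits in the middle of the range, which is why a mid-range EPB maps naturally onto a ‘balanced’ profile.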
The PCU has much greater system visibility and control, bringing many of the attractive qualities of Sandy Bridge client processors to the server market. For example, Sandy Bridge-EP scales the voltage and frequency for the cache and ring bus in tandem with the cores, whereas previous server designs fixed the frequency and voltage for the cache and ring. Another shared technique is DVFS that incorporates the thermal behavior of the heatsink, as first described for Sandy Bridge client designs. The PCU can monitor and manage the PCI-E power separately from the rest of the chip, to ensure overall quality-of-service. Lastly, the PCU can dynamically adjust the number of phases in the processor and memory voltage regulators. This maintains high power conversion efficiency for light loads, by ensuring that each phase is operating in an optimal range.
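The phase shedding mentioned above amounts to enabling only as many regulator phases as the load current justifies, so that each active phase stays near its efficiency sweet spot. A minimal sketch of that selection logic, with entirely hypothetical constants:

```python
import math

# Hypothetical sketch of voltage-regulator phase shedding: enable the
# smallest number of phases such that per-phase current stays near an
# efficient operating point. The constants are invented for illustration.

def active_phases(load_amps, total_phases=6, amps_per_phase_opt=25):
    """Return how many phases to enable for a given load current."""
    needed = math.ceil(load_amps / amps_per_phase_opt)
    return max(1, min(total_phases, needed))    # at least 1, at most all
```

At light load a single phase carries the current efficiently; as load rises, phases are switched back in so none is driven far past its optimal range.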
The clearest way to understand the extent of the power management improvements in Sandy Bridge-EP is to compare the frequency scaling to previous designs. The savings from power management tend to manifest as frequency headroom, absent restrictions due to segmentation. For a clean comparison, top-bin processors are the most sensible, since they are rarely constrained for marketing reasons. The E5-2690 is a 2.9GHz product, with a peak frequency of 3.8GHz when a single core is active. The closest comparable is the Westmere-EX E7-8870, which has a base clock of 2.4GHz and can reach 2.8GHz when 1-4 cores are active. In the same process technology, Sandy Bridge-EP has roughly twice the frequency gain (31% versus 17%), which is a very substantial improvement.
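The headroom figures quoted above follow directly from the clock specifications:

```python
# Verifying the single-core turbo headroom from the quoted clocks.
def headroom(base_ghz, turbo_ghz):
    """Turbo gain as a percentage over base clock."""
    return (turbo_ghz / base_ghz - 1) * 100

e5_2690 = headroom(2.9, 3.8)   # Sandy Bridge-EP top bin
e7_8870 = headroom(2.4, 2.8)   # Westmere-EX top bin

print(f"E5-2690: +{e5_2690:.0f}%  E7-8870: +{e7_8870:.0f}%")
```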
Overall, Intel has claimed that the performance of Sandy Bridge-EP is between 30% and 70% higher than the fastest bin of Westmere-EP. Technical computing and floating point benchmarks tend to fall at the upper end of this range, benefiting from both AVX and the increased memory bandwidth. For more classic database, virtualization and application server workloads, the gains seem to fall around 40-60%. Power efficiency should be dramatically higher, due to the I/O integration and the more comprehensive power management. Preliminary Intel numbers suggest that power consumption should drop by around 10-20%, although that is highly system dependent.
While we cannot confirm these numbers with our own testing yet, the claims seem reasonable. Sandy Bridge-EP packs a huge number of improvements across all facets of the processor, and is the first major platform change since 2009. The scope of the improvements, and the demonstrated performance and efficiency gains for Sandy Bridge client systems certainly suggest that this will be the most significant server product for Intel since Nehalem.
AMD already trails significantly in the server market, and this is hardly a welcome development. Sandy Bridge-EP will increase Intel’s performance advantage considerably, and the I/O integration will put AMD in an awkward position, requiring more components in the system. This is a dramatic reversal of 2003-2008, when the Opteron’s integrated memory controllers, HyperTransport and 64-bit performance gave AMD a decisive advantage. That being said, AMD’s market share is so low that it has little further to fall, and the recent acquisition of SeaMicro suggests growth opportunities. The net effect of Sandy Bridge-EP will probably be to push AMD’s server successes further down the value segment of the market, rather than eroding market share.
For customers, a new and faster processor can only be a boon. Sandy Bridge-EP is a bit of a hybrid, with many of the performance features and capabilities of Intel’s high-end servers; the I/O bandwidth even eclipses the high-end POWER7. The C600 chipset includes an 8-port 3Gbit/s SAS controller with excellent software RAID, and the PCI-E 3.0 lanes are well suited to volume deployments of 10GbE using DDIO for low latency. But this comes with a commensurate price tag: the best 2-socket parts will be priced at around $2,000, about 25% higher than Westmere-EP. To a large extent, this pricing reflects the fact that Sandy Bridge-EP is more than a conventional 2-socket server processor.
This concludes our discussion of Sandy Bridge-EP for now. In a subsequent piece, we will present our own benchmarks and continue the analysis.