In our Sandy Bridge-EP and Romley platform review, we look at the performance and power efficiency gains for Intel’s latest server microprocessor on industry-standard benchmarks, including SPECcpu2006 and SPECpower_ssj2008. The results are impressive: Sandy Bridge-EP is clearly the best x86 server processor on the market, and Romley will be the platform of choice for the next 2 years.
Impressions of Kepler
Our first look at Kepler focuses on architectural changes to the shader core that emphasize graphics performance and the enhanced power management. Based on our analysis of Nvidia’s 28nm GPU strategy, we project a new shader core for throughput computing products and discuss the expected features.
Sandy Bridge-EP Launches
Sandy Bridge-EP is the first major overhaul for Intel servers since 2009, and nearly every aspect has been enhanced. The processor pairs 8 cores with a large last level cache, DDR3 memory controller, QPI 1.1, integrated PCI-E and power management. This article provides an overview of the major features, including new I/O optimization and power capping techniques, and discusses the expected impact.
Analysis of Haswell’s Transactional Memory
Intel’s upcoming Haswell microprocessors include transactional memory and hardware lock elision that are exposed through the Transactional Synchronization Extensions or TSX. In this article, I discuss TSX and predict the implementation details of Haswell’s transactional memory and expected adoption across the industry, based on my previous experience.
AMD’s Analyst Update
AMD’s new management took to the stage to highlight a new strategy and share the roadmap for 2012-2013. The executives generally came across well and there are only a few changes from the existing focus, with no major shifts. The updated server roadmap seems challenging, given the competition, but client systems should do decently and expand AMD’s footprint in mobile.
IBM z196 Mainframe Architecture
IBM’s mainframes are the oldest line of computers, dating back to 1964, and occupy a special place as the world’s first instruction set architecture. This longevity and extreme backwards compatibility are responsible for perhaps the most lucrative computer franchise. IBM’s z196 is the first mainframe with an out-of-order CMOS microprocessor, and also the first with an integrated L3 cache. These two innovations are largely responsible for a 30-40% improvement in performance over the previous generation z10.
ISSCC 2012 Preview
Highlights of the upcoming 2012 ISSCC include the first 22nm disclosures from Intel and several SoC papers from AMD, Cavium Networks and Oracle. Looking further into the future, the clear focus is power consumption. There are several papers from Intel on low-power logic, one from IBM discussing 3D integration of embedded DRAM, and another from Fujitsu on system level power for the K supercomputer.
Sandy Bridge for Servers
Intel’s Sandy Bridge-EP arrives late this year to take on AMD’s Bulldozer in 2- and 4-socket servers. It offers up to 8 cores with a new system architecture including 20MB L3 cache, 4 DDR3 memory controllers and faster 8GT/s QPI 1.1 links. Sandy Bridge-EP is also the first server CPU to integrate PCI-E 3.0 on-die, with up to 40 lanes – a significant bandwidth and power efficiency advantage. This article compares the system architecture and design to previous approaches and shows that Sandy Bridge-EP will be a compelling upgrade for 2-socket servers and attractive for certain 4-socket systems, particularly those with large I/O needs.
Intel’s Quick Path Evolved
Intel’s Quick Path Interconnect (QPI) was a massive step forward over the front-side bus that was used from 1995-2008. QPI finally caught up with and exceeded AMD’s HyperTransport, helping Intel retake much of the server market. The next generation QPI 1.1 was re-architected based on trends and changes in the computer industry. QPI 1.1 is an incremental improvement at the physical and logical layers, but a substantial change in the coherency protocol. Sandy Bridge-EP will be the first product to implement QPI 1.1, later this year.
What Do Overclockers and Supercomputers Have in Common?
Enthusiasts and engineers know cooling is vital; it raises frequency and dramatically lowers power by reducing CPU or GPU temperatures. The world’s fastest supercomputer shows that thermal management can increase CPU performance/watt by 20% and that cooling is critical for 3D integration and Moore’s Law.



