Intel’s Haswell CPU is the first core optimized for 22nm and includes a huge number of innovations for developers and users. New instructions for transactional memory, bit-manipulation, full 256-bit integer SIMD and floating point multiply-accumulate are combined in a microarchitecture that essentially doubles computational throughput and cache bandwidth. Most importantly, the microarchitecture was designed for efficiency and extends Intel’s offerings down to 10W tablets, while maintaining leadership for notebooks, desktops, servers and workstations.
We previously theorized that Intel’s TSX extensions in Haswell use the caches to provide transactional memory semantics. This article describes an alternative approach based on minimal changes to the CPU core, contrasts the advantages of the two techniques and discusses the expected implementation in Haswell.
The Ivy Bridge GPU takes advantage of Intel’s 22nm FinFET process to nearly double performance and enhance programmability with DX11 and OpenCL 1.1 support. The new scalable architecture features more powerful shader cores, distributed sampling pipelines, a high bandwidth L3 cache, tessellation and support for 4K resolution displays. Overall, Ivy Bridge should be the highest performance integrated GPU at launch and Intel’s first competitive graphics offering.
Our first look at Kepler focuses on the enhanced power management and on architectural changes to the shader core that emphasize graphics performance. Based on our analysis of Nvidia’s 28nm GPU strategy, we project a new shader core for throughput computing products and discuss the expected features.
Intel’s upcoming Haswell microprocessors include transactional memory and hardware lock elision that are exposed through the Transactional Synchronization Extensions or TSX. In this article, I discuss TSX and predict the implementation details of Haswell’s transactional memory and expected adoption across the industry, based on my previous experience.
AMD’s new management took to the stage to highlight a new strategy and share the roadmap for 2012-2013. The executives generally came across well and there are only a few changes from the existing focus, with no major shifts. The updated server roadmap seems challenging, given the competition, but client systems should do decently and expand AMD’s footprint in mobile.
Highlights of the upcoming 2012 ISSCC include the first 22nm disclosures from Intel and several SoC papers from AMD, Cavium Networks and Oracle. Looking further ahead, the clear focus is power consumption. There are several papers from Intel on low-power logic, one from IBM discussing 3D integration of embedded DRAM and one from Fujitsu on system level power for the K supercomputer.
AMD’s Hot Chips presentation delved into Llano, the first mainstream Fusion product, with details and results for power management. Previous disclosures painted a poor picture, which is far from the truth. Given the older CPU and GPU designs and time-to-market pressure, the results are quite good. Llano’s power management focuses on the most important aspects and is a solid foundation for future generations that will be much more power aware and optimized.
Sandy Bridge is the first GPU tightly integrated with an x86 CPU through a shared L3 cache. Graphics performance has doubled, thanks to new shader cores and more powerful fixed functions. Sadly, there is no OpenCL or DirectX 11 support until Ivy Bridge. Multimedia is superb, with full hardware decoding and accelerated encoding exposed through an API. The new design is a huge advance, but much work remains for future generations.
AMD has a grand vision for software and physical integration of CPUs and GPUs. The first Fusion generation focused on time to market, but created a solid foundation. Llano is a surprisingly attractive mid-range and value notebook product, due to the vastly enhanced power management. Future Fusion products will upgrade the CPU, GPU and media hardware and move towards a more tightly integrated computing model.