My favorite paper from the ISSCC processor session describes an adaptive clocking technique implemented in AMD’s 28nm Steamroller core that compensates for power supply noise. Initial results show a 10-20% decrease in power consumption from reducing the voltage, with no loss in performance. This elegant technique is likely to be adopted across AMD’s entire product line including GPUs, x86 CPUs, ARM-based CPUs, and other critical blocks in highly integrated SoCs.
Jaguar is AMD’s first 28nm processor, a compact 3.1mm2 design that targets 2-25W devices. It is a derivative of the earlier 40nm Bobcat, a fully out-of-order two issue design, with significant improvements in instruction set architecture and implementation. Some of the highlights include support for AVX, wider 128-bit datapaths, and a higher performance L2 cache. Jaguar is already shipping in several AMD SoCs targeted at tablets, notebooks, microservers, and desktops. However, it is far more prominent as the CPU powering the Sony Playstation 4 and Microsoft Xbox One.
The server market is at a potential inflection point, with a new breed of ARM-based microserver vendors challenging the status quo, particularly for cloud computing. We survey 20 modern processors to understand the options for alternative architectures. To achieve disruptive performance, microserver vendors must deeply specialize in particular workloads. However, there is a trade-off between differentiation and market breadth. As the handful of microserver startups are culled to 1-2 viable vendors, only the companies which deliver compelling advantages to significant markets will survive.
New compute efficiency data shows GPUs with a clear edge over CPUs, but the gap is narrowing as CPUs adopt wide vectors (e.g. AVX). Surprisingly, a throughput CPU is the most energy efficient processor, offering hope for future architectures. Our data also shows some advantages of AMD’s Bulldozer, and the overhead associated with highly scalable server CPUs.
AMD’s new management took to the stage to highlight a new strategy and share the roadmap for 2012-2013. The executives generally came across well and there are only a few changes from the existing focus, with no major shifts. The updated server roadmap seems challenging, given the competition, but client systems should do decently and expand AMD’s footprint in mobile.
With all the recent changes, AMD seems like a ship adrift at sea with no clear strategy or vision. We look at AMD and where they are likely to head in the coming years for tablets and phones and explain why they will stick with x86, rather than embrace ARM as some have suggested.
Highlights of the upcoming 2012 ISSCC include the first 22nm disclosures from Intel and several SoC papers from AMD, Cavium Networks and Oracle. Looking out further to the future, the clear focus is power consumption. There are several papers from Intel on low-power logic, one from IBM discussing 3D integration of embedded DRAM and a third from Fujitsu on system level power for the K supercomputer.
AMD’s Hot Chips presentation delved into Llano, the first mainstream Fusion product, with details and results for power management. Previous disclosures painted a poor picture, which is far from the truth. Given the older CPU and GPU designs and time-to-market pressure, the results are quite good. Llano’s power management focuses on the most important aspects and is a solid foundation for future generations that will be much more power aware and optimized.
AMD has a grand vision for software and physical integration of CPUs and GPUs. The first Fusion generation focused on time to market, but created a solid foundation. Llano is a surprisingly attractive mid-range and value notebook product, due to the vastly enhanced power management. Future Fusion products will upgrade the CPU, GPU and media hardware and move towards a more tightly integrated computing model.
Modern graphics processors are incredibly complex, but understanding their performance is essential, as they become an increasingly important component of computer systems. In this report, we use a set of benchmark results to build accurate performance models for AMD and Nvidia GPUs. We verify that our model can predict performance within roughly 6-8% for many desktop graphics cards and show how Nvidia’s microarchitecture and drivers achieve roughly 2X higher utilization than AMD’s VLIW5 design.