IBM’s mainframes are the oldest line of computers, dating back to 1964 and occupy a special place as the world’s first instruction set architecture. This longevity and extreme backwards compatibility are responsible for perhaps the most lucrative computer franchise. IBM’s z196 is the first mainframe with an out-of-order CMOS microprocessor, and also the first with an integrated L3 cache. These two innovations are largely responsible for a 30-40% improvement in performance over the previous generation z10.
With all the recent changes, AMD seems like a ship adrift at sea with no clear strategy or vision. We look at AMD and where they are likely to head in the coming years for tablets and phones and explain why they will stick with x86, rather than embrace ARM as some have suggested.
Highlights of the upcoming 2012 ISSCC include the first 22nm disclosures from Intel and several SoC papers from AMD, Cavium Networks and Oracle. Looking out further to the future, the clear focus is power consumption. There are several papers from Intel on low-power logic, one from IBM discussing 3D integration of embedded DRAM and a third from Fujitsu on system level power for the K supercomputer.
Nvidia’s Kal-El sports a novel 5th ‘companion’ core to lower idle power. We look at the trade-offs and benefits to this approach and explain why it will be a strong tablet SoC, but only an incremental gain for smartphones.
AMD’s Hot Chips presentation delved into Llano, the first mainstream Fusion product, with details and results for power management. Previous disclosures painted a poor picture, which is far from the truth. Given the older CPU and GPU designs and time-to-market pressure, the results are quite good. Llano’s power management focuses on the most important aspects and is a solid foundation for future generations that will be much more power aware and optimized.
Sandy Bridge is the first GPU tightly integrated with an x86 through a shared L3 cache. Graphics performance has doubled, thanks to new shader cores and more powerful fixed functions. Sadly, there is no OpenCL or DirectX11 support till Ivy Bridge. Multimedia is superb, with full hardware decoding and accelerated encoding exposed through an API. The new design is a huge advance, but much work remains for future generations.
Intel’s Sandy Bridge-EP arrives late this year to take on AMD’s Bulldozer in 2 and 4-socket servers. It offers up to 8 cores with a new system architecture including 20MB L3 cache, 4 DDR3 memory controllers and faster 8GT/s QPI 1.1 links. Sandy Bridge-EP is also the first server CPU to integrate PCI-E 3.0 on-die, with up to 40 lanes – a significant bandwidth and power efficiency advantage. This article compares the system architecture and design to previous approaches and shows that Sandy Bridge-EP will be a compelling upgrade for 2-socket servers and attractive for certain 4-socket systems, particularly those with large I/O needs.
Intel’s Quick Path Interconnect (QPI) was a massive step forward over the front-side bus that was used from 1995-2008. QPI finally caught up and exceeded AMD’s HyperTransport, helping Intel retake much of the server market. The next generation QPI 1.1 was re-architected based on trends and changes in the computer industry. QPI 1.1 is an incremental improvement at the physical and logical layer, but a substantial change in the coherency protocol. Sandy Bridge-EP will be the first product to implement QPI 1.1, later this year.
AMD has a grand vision for software and physical integration of CPUs and GPUs. The first Fusion generation focused on time to market, but created a solid foundation. Llano is a surprisingly attractive mid-range and value notebook product, due to the vastly enhanced power management. Future Fusion products will upgrade the CPU, GPU and media hardware and move towards a more tightly integrated computing model.
Enthusiasts and engineers know cooling is vital; it raises frequency and dramatically lowers power by reducing CPU or GPU temperatures. The world’s fastest supercomputer shows that thermal management can increase CPU performance/watt by 20% and cooling is critical for 3D integration and Moore’s Law.