Previously, Apple’s iPhones and iPads used PowerVR GPUs from Imagination Technologies for graphics. Based on our analysis, Apple has created a custom GPU that powers the A8, A9, and 10 processors, shipping in the iPhone 6 and later models, and some iPads. Using public documents, we demonstrate that the programmable shader cores inside Apple’s GPU are different from Imagination Technologies’ PowerVR and offer superior 16-bit floating-point performance and data conversion functions. We further believe that Apple has also developed a custom shader compiler and graphics driver. The proprietary design enables Apple to deliver best-in-class performance for graphics, and other tasks that use the GPU, such as image processing and machine learning.
My favorite paper from the ISSCC processor session describes an adaptive clocking technique implemented in AMD’s 28nm Steamroller core that compensates for power supply noise. Initial results show a 10-20% decrease in power consumption from reducing the voltage, with no loss in performance. This elegant technique is likely to be adopted across AMD’s entire product line including GPUs, x86 CPUs, ARM-based CPUs, and other critical blocks in highly integrated SoCs.
Silvermont is Intel’s first CPU core tailored for power efficient applications such as smartphones, tablets, and microservers. The 22nm microarchitecture features updated instruction set extensions, full out-of-order execution with a tightly coupled L2 cache, aggressive power management, and a new high performance SoC fabric. These enhancements deliver tremendous performance and frequency gains over the aging Atom core, putting Intel’s mobile strategy in a more competitive position.
Graphics is a focal point of the upcoming Haswell platform, necessitating a high bandwidth memory solution. To deliver high performance Intel is returning to the DRAM market, which it exited in 1985. The memory that ships with Haswell will be a custom embedded DRAM mounted in the package and manufactured on a variant of Intel’s 22nm process. By avoiding the commodity memory market, Intel will preserve high margins by cannibalizing discrete GPUs and dedicated graphics memory.
The iPad 3 was an influential and successful tablet, but an excellent example of an unbalanced system. In particular, the superb Retina display was not adequately matched by the GPU of the A5X, and represented a step backwards in terms of graphics capabilities. This article explores the challenges of designing innovative products given the underlying technical constraints, through the lens of the iPad 3 and its successors.
Intel’s Haswell CPU is the first core optimized for 22nm and includes a huge number of innovations for developers and users. New instructions for transactional memory, bit-manipulation, full 256-bit integer SIMD and floating point multiply-accumulate are combined in a microarchitecture that essentially doubles computational throughput and cache bandwidth. Most importantly, the microarchitecture was designed for efficiency and extends Intel’s offerings down to 10W tablets, while maintaining leadership for notebooks, desktops, servers and workstations.
Near-threshold voltage computing extends the voltage scaling associated with Moore’s Law and dramatically improves power and energy efficiency. The technology is superb for throughput, at the cost of latency, and best suited to Intel’s products for HPC and mobile graphics.
We previously theorized that Intel’s TSX extensions in Haswell use the caches to provide transactional memory semantics. This article describes an alternative approach based on minimal changes to the CPU core, contrasts the advantages of the two techniques and discusses the expected implementation in Haswell.
The new ARMv8 architecture is classically British; a clean and elegant 64-bit instruction set, with compatibility for 32-bit software. The 64-bit mode eliminates many complicated and awkward features and will foster a larger and more diverse ARM ecosystem with new licensees and applications.
The Ivy Bridge GPU takes advantage of Intel’s 22nm FinFET process to nearly double performance and enhance programmability with DX11 and OpenCL 1.1 support. The new scalable architecture features more powerful shader cores, distributed sampling pipelines, a high bandwidth L3 cache, tesselation and 4K resolution displays. Overall, Ivy Bridge should be the highest performance integrated GPU at launch and Intel’s first competitive graphics offering.