AMD’s Jaguar Microarchitecture
In the last five years, the classic PC and server computing market has begun to inexorably merge with mobile devices, consoles and other alternative form factors that rely on computation. These form factors cover an incredibly wide power range, from under 1W for a smartphone to 150W for high-end servers. AMD’s architects realized that this range was too wide for a single processor core to address, thus spawning a strategy that relied on the Bulldozer architecture at the high end, and the Bobcat and Jaguar cores (collectively the cat cores) for lower power devices.
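The math is fairly straightforward. A foundry CMOS process technology can efficiently operate digital logic over a fairly narrow voltage range, optimistically a factor of 2 (in reality it is closer to 1.6X). Because dynamic power scales roughly with the square of voltage times frequency, and achievable frequency scales roughly with voltage, a 2X voltage range ultimately translates into a power range of roughly 8X. Adjusting the core count up and down buys perhaps another factor of 4X, for a total of about 32X, which is well short of the 150X or more that separates a smartphone from a high-end server. Even in the most optimistic scenario, this means that AMD needs two separate cores to effectively target the low-power and high-performance markets.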
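The estimate above can be checked with a few lines of arithmetic. The following is only a rough sketch, assuming the textbook approximation that dynamic power goes as voltage squared times frequency, with frequency tracking voltage; the function names and the 150X target (150W over 1W) are illustrative choices, not figures from AMD.

```python
import math


def power_range(voltage_ratio, core_count_ratio=1.0):
    """Approximate power span covered by one core design.

    Assumption: dynamic power ~ C * V^2 * f and f scales roughly with V,
    so power scales roughly with V^3. Core count multiplies the span.
    """
    return voltage_ratio ** 3 * core_count_ratio


def designs_needed(target_ratio, per_design_ratio):
    """Smallest number of core designs whose combined spans cover the target."""
    return math.ceil(math.log(target_ratio) / math.log(per_design_ratio))


# Optimistic case from the text: 2X voltage range, 4X from core count.
optimistic = power_range(2.0, 4.0)
# More realistic case from the text: about 1.6X voltage range.
realistic = power_range(1.6, 4.0)

# Hypothetical target: 1W smartphone to 150W server, i.e. at least 150X.
target = 150.0 / 1.0

print("optimistic span per design:", optimistic)
print("realistic span per design:", round(realistic, 2))
print("designs needed (optimistic):", designs_needed(target, optimistic))
print("designs needed (realistic):", designs_needed(target, realistic))
```

Under these assumptions a single design spans about 32X at best and roughly 16X with the more realistic voltage range, so both cases call for two designs to cover a 150X spread.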
But AMD saw yet another axis for differentiation. Fundamentally, AMD is the only company outside of Intel that is capable of designing, validating, and shipping x86 microprocessors. Given the massive installed software base, this is tremendously valuable. Yet Intel, as a semiconductor manufacturer, is largely unwilling to use foundries, which makes integrating external IP blocks challenging. AMD was already being forced to move to foundries by the spinoff of GlobalFoundries, but the Bobcat and Jaguar teams had aspirations for markets where customers, not manufacturers, defined the silicon. In particular, the graphics group at AMD had a strong presence in console gaming, dating back to ArtX’s relationship with Nintendo, and had already won the lion’s share of console graphics, including the Xbox 360 and Wii. To take advantage of this unique opportunity, AMD’s architects designed the cat cores to be easily synthesizable and portable between foundries.
Jaguar is AMD’s first 28nm processor. It is a compact 3.1mm² core that targets 2-25W devices, in particular tablets, microservers, and consoles. It is a clear derivative of the earlier 40nm Bobcat design, with significant improvements in instruction set architecture and implementation. Some of the highlights include support for AVX, wider 128-bit datapaths, and a higher performance L2 cache. The basic pipeline for Jaguar is shown below, in Figure 1, and clocks in at one cycle longer than Bobcat.
Like its predecessor, Jaguar is a 64-bit, out-of-order microprocessor that decodes and issues 2 instructions and dispatches 6 operations per cycle. A cluster of four Jaguar cores shares an L2 cache, and larger configurations can be built using an internal fabric. The actual target market for Jaguar includes tablets and microservers, but does not extend down into smartphones. While AMD certainly could produce a <1W core, there is no motivation because the company lacks many crucial components that are necessary to be competitive. In particular, AMD has no LTE capabilities, nor does it have access to older 2G or 3G modems. However, AMD’s extensive graphics IP is incredibly valuable in nearly all client-facing systems and is a natural complement to the Jaguar processor. It is in this combination that Jaguar has been most successful, as it currently powers Sony’s PlayStation 4 and Microsoft’s Xbox One, which are collectively selling over a million units per month.