Article: AMD's Mobile Strategy
By: Kevin G (kevin.delete@this.cubitdesigns.com), January 6, 2012 9:02 am
Room: Moderated Discussions
Dan Fay (daniel.fay@gmail.com) on 1/4/12 wrote:
---------------------------
>>
>>Let's just recap the costs:
>>
>>1. More die area for L4 controller
>>2. More die area for pins
>>3. More validation for different models
>>4. External SRAM chips (how much would a 32MB SRAM cost?)
>>5. More complex packaging, lower total yields
>>6. Need to design a new snoop filter to deal with larger cache sizes in servers
>>
>>Those costs are pretty significant.
>
>What I could see as possibly more realistic is for AMD to provide a fast, local
>DRAM memory in the neighborhood of 256-512MB (perhaps something like GDDR5) on its
>processors to improve embedded GPU performance. This memory could then be repurposed
>on servers and/or HPC systems as a high-speed, local, general-purpose memory.
I figure AMD is already going in this direction. On the GPU side of things, they have one example on the market and soon two. The Xbox 360's GPU uses eDRAM in a way that fits this description. AMD is also involved in the Wii U's GPU, so there is a good chance that eDRAM will make an appearance there, possibly on die with several POWER-based CPUs from IBM. It would also surprise me if MS and Sony didn't use eDRAM in some fashion for their next generation of consoles.
The next generation of systems are all expected to have 1920 x 1080 as their ideal resolution. A 64-bit HDR frame buffer would be just under 16 MB, and you'd also want to include the z-buffer, which would be under 8 MB in this case. That much eDRAM in an SoC would probably run over 300 mm^2 at 32 nm, but over the course of the console's lifetime this would shrink as it moves to new process nodes. A massive 32 MB or 48 MB eDRAM die would also be feasible in an MCM. Packaging costs would be high for a wide parallel bus between the dies. If certain units are moved onto the eDRAM package, then high-clock serial links can work, as evidenced by the Xbox 360.
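For anyone who wants to check the buffer sizes, here is a quick sketch of the arithmetic. The 32-bit z-buffer depth and the lack of multisampling are my assumptions (they are what make the "under 8 MB" figure work out), and "MB" is taken as 2^20 bytes. The function name is just for illustration.

```python
# Rough buffer-size arithmetic for a 1920 x 1080 render target.
# Assumptions: no MSAA, 64 bits per pixel for the HDR color buffer,
# 32 bits per pixel for the z-buffer, and 1 MB = 2**20 bytes.

MIB = 1024 * 1024


def buffer_bytes(width, height, bits_per_pixel):
    """Return the size in bytes of an uncompressed width x height buffer."""
    return width * height * bits_per_pixel // 8


color_bytes = buffer_bytes(1920, 1080, 64)  # HDR frame buffer
depth_bytes = buffer_bytes(1920, 1080, 32)  # z-buffer (assumed 32-bit)
total_bytes = color_bytes + depth_bytes

if __name__ == "__main__":
    print(f"color: {color_bytes / MIB:.2f} MB")
    print(f"depth: {depth_bytes / MIB:.2f} MB")
    print(f"total: {total_bytes / MIB:.2f} MB")
```

Under those assumptions the two buffers together come to a bit under 24 MB, so they would fit in a 32 MB eDRAM with some room to spare. Any multisampling would multiply these numbers and change that picture.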
I can also see AMD using eDRAM as an L4 cache in its Fusion chips for the same reasons as the console chips. AMD could also use one external eDRAM design for both a Fusion chip and a console-centric chip (PowerPC based?). This would remove one of the big cost factors, as you'd have Nintendo, Sony, or MS helping to pay for its development, testing and validation. Then again, AMD doesn't have an L3 cache on its current Fusion chips, so calling it an L4 would be a misnomer.
An L4 cache for AMD's Opterons would only make sense where other high-end server/mainframe systems have put one: on IO hub chips. For example, IBM used to incorporate an L4 cache in their chipsets for Netburst-based Xeons. I can see an L4 cache appearing in a similar chipset aimed at 8 or more G34 sockets. The costs for such an endeavor would likely be borne by the system vendor (IBM or Cray). The issue now would be getting one of those companies to invest in AMD's platform. If AMD needed more cache for their Opteron lineup, I'd guess they would simply go the route IBM has taken and use eDRAM for the L3.