Article: AMD's Mobile Strategy
By: Jouni Osmala (josmala.delete@this.cc.hut.fi), January 9, 2012 2:24 am
Room: Moderated Discussions
>>There must be some people that need more CPU performance, otherwise all PC sales in the developed world would stop.
>>
>>I'd like to clarify my suggestion of adding 128 MBytes or 256 MBytes of L4 cache
>>to the processor package. The physical size of the processor package is determined
>>by the number of pins. There is plenty of space to put 4 or 8 cache die on the
>>same substrate that the processor die is attached to. The Pentium Pro had something
>>like this years ago but it used wire bonding instead of a flip-chip attach. An
>>alternative would be to stack the cache die on top of the processor die using through
>>silicon vias. I think that is what David Kanter means by "3D integration". I bet
>>with $100 of cost in cache chips, they could sell such a processor for an extra $500.
>
>Very bad idea.
>
>Slower latency to RAM, so my program just would slow down and I will not use the
>CPUs as I can't afford an extra $500 for each node.
There is a real option of doing the L4 cache without slowing down memory latency, and there are far bigger server customers than you whose programs could potentially benefit from it. There are other programs than diep in the world, with totally different cache behaviour; that's why you shouldn't generalize your experiences with it too much. And most importantly, there are people who are spending lots of other people's money and can afford to have higher performing CPUs.
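To put rough numbers on the latency point, here is a minimal sketch. It assumes one possible way to avoid the penalty, namely starting the DRAM access in parallel with the L4 lookup instead of waiting for the L4 miss first. The function name, the parameters, and all latencies and hit rates are made up for illustration, not measurements of any real CPU.

```python
def avg_latency_after_l3_miss(l4_hit_rate, l4_ns, dram_ns, serial_lookup):
    """Average latency (ns) seen by a request that has already missed L3.

    All inputs are hypothetical illustration values.
    """
    if serial_lookup:
        # DRAM access starts only after the L4 miss is known,
        # so a miss pays the L4 lookup plus the full DRAM latency.
        miss_cost = l4_ns + dram_ns
    else:
        # Assumed alternative: DRAM access is started alongside the L4
        # lookup, so a miss costs no more than plain DRAM latency.
        miss_cost = dram_ns
    return l4_hit_rate * l4_ns + (1.0 - l4_hit_rate) * miss_cost


# A program that never hits in L4 (the worst case for the cache):
worst_serial = avg_latency_after_l3_miss(0.0, 20.0, 60.0, serial_lookup=True)
worst_parallel = avg_latency_after_l3_miss(0.0, 20.0, 60.0, serial_lookup=False)

# A program with a 50% L4 hit rate:
half_serial = avg_latency_after_l3_miss(0.5, 20.0, 60.0, serial_lookup=True)
half_parallel = avg_latency_after_l3_miss(0.5, 20.0, 60.0, serial_lookup=False)
```

With these made-up inputs, the program that never hits in L4 sees plain DRAM latency in the parallel case, while a program with a decent hit rate comes out ahead either way. The parallel lookup is not free, since it spends extra memory bandwidth and power on requests that end up hitting in L4.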