By: Doug S (foo.delete@this.bar.bar), May 14, 2013 5:51 pm
Room: Moderated Discussions
Stubabe (Stubabe.delete@this.nospam.com) on May 14, 2013 12:09 pm wrote:
> As for die area when Intel introduced SMT in the P4 they claimed it added less than 5%
> to the die - it is likely to be far less now. Is throwing that (very slight) die area
How much more die area does adding another core add these days? More than 5%, but not a whole lot more. The GPU, shared L3, and uncore are a much larger percentage of die area than they were back then, and that percentage grows with each generation, because the GPU is where most of the transistor growth goes these days to soak up the ~doubling you get from each new process.
There's been talk for well over a decade about how "transistors are cheap", and designers have had trouble figuring out what to do with them. At first they used them to add cache: first an L2, then eventually an L3, but they reached a point of diminishing returns where additional cache has very little impact on performance. As Linus is always quick to remind everyone, once a cache reaches a reasonable size, its latency matters far more than making it bigger.
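A rough way to see why is the textbook average-memory-access-time formula. A quick Python sketch below; every cycle count and hit rate is a made-up assumption for illustration, not a measurement from any real part:

    # Back-of-the-envelope AMAT (average memory access time) sketch.
    # All cycle counts and hit rates are illustrative assumptions,
    # not numbers from any actual CPU.

    def amat(hit_time, hit_rate, miss_penalty):
        # AMAT = hit time + miss rate * miss penalty (in cycles)
        return hit_time + (1.0 - hit_rate) * miss_penalty

    MISS_PENALTY = 200  # assumed cycles to main memory

    small_l3 = amat(hit_time=12, hit_rate=0.95, miss_penalty=MISS_PENALTY)
    big_l3   = amat(hit_time=20, hit_rate=0.97, miss_penalty=MISS_PENALTY)

    print(f"smaller, faster L3: {small_l3:.1f} cycles")  # 12 + 0.05*200 = 22.0
    print(f"bigger, slower L3:  {big_l3:.1f} cycles")    # 20 + 0.03*200 = 26.0

With those (assumed) numbers, the bigger cache's extra 2% hit rate doesn't come close to paying for its extra latency, which is exactly Linus's point.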
After adding cache ran out of steam, they went to adding cores. Dual core quickly led to quad core, but they found that while there was a significant difference between single core and dual core, going to quad core didn't have nearly the same impact. Going beyond quad core made so little difference that such CPUs are only marketed to servers, workstations, and enthusiasts, despite there being no real reason Intel couldn't cost-effectively fab a mass-market 8-core CPU.
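Amdahl's law captures why each doubling of cores buys less than the last. A minimal sketch, where the 80% parallel fraction is an arbitrary assumption about typical desktop workloads:

    # Amdahl's law: speedup(n) = 1 / ((1 - p) + p / n)
    # where p is the fraction of the workload that parallelizes.
    # p = 0.8 is an arbitrary assumption for illustration.

    def speedup(p, n):
        return 1.0 / ((1.0 - p) + p / n)

    p = 0.8
    for cores in (1, 2, 4, 8):
        print(f"{cores} cores: {speedup(p, cores):.2f}x")
    # 1 cores: 1.00x
    # 2 cores: 1.67x   <- big jump
    # 4 cores: 2.50x   <- smaller jump
    # 8 cores: 3.33x   <- hard to sell outside servers/workstations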
Integrating the GPU became the third plan for using up all those free transistors, and it's still well underway. Each generation of integrated graphics is better than the last and makes a noticeable difference (though at this point, mostly for games), so there is still demand to do better in the future.
Throughout all this, Intel's designers have been pretty free to spend transistors on niche cases. That's where the multimedia and encryption acceleration instructions come from, along with SMT, which addresses the "niche" of servers. It's nice that it helps in certain cases on desktops as well, but desktops were not the target for SMT. Even if it never helped on desktops and actually cost a few percentage points of performance, it would still be in the desktop core; it would just be fused off. There are plenty of server-only capabilities that exist fused off in Intel's desktop cores: when the die size increase from adding them is negligible, there's no reason to do the layout of those blocks twice.
At some point the GPU will be "good enough" and the only path left will be to integrate the rest of the chipset into the CPU. After that, if Moore's Law is still going, it'll be interesting to see what they do. If they're forced to admit defeat, they'll have no choice but to employ strategies to reduce pin count so the pad-limited die size gets much smaller.