By: Stubabe (Stubabe.delete@this.nospam.com), May 16, 2013 11:54 am
Room: Moderated Discussions
RichardC (tich.delete@this.pobox.com) on May 14, 2013 1:43 pm wrote:
> Stubabe (Stubabe.delete@this.nospam.com) on May 14, 2013 12:09 pm wrote:
> > The assumption here is that without SMT Intel might have invested more in single threaded performance.
> > I would say the opposite, since without throughput workloads exploiting wide chips via SMT I doubt
> > Intel's architects could have justified wide power efficient designs like Sandy Bridge onwards
> > since the gains would have been too small in too many cases. But with SMT they can continue to
> > throw resources at fast fat cores (as it befits more than just one set of corner cases) rather
> > than just giving us a die full of gutless cores and basically just another GPU.
>
> That's a really weird argument. You're saying that even though SMT doesn't help
> single threaded workloads, if they hadn't implemented SMT then Intel would have
> just shrugged their shoulders and given up on all ways of making cores go fast ??
>
> Look, I think Intel has done a great job the last 5 or 6 years. But they've taken
> a business decision to use a common core design across a lot of very different
> workloads. It's at least plausible that some of the results are not perfect for
> all possible uses.
>
If SMT does not benefit a desktop workload, then quad core will not be much help either. How much fatter can you make an x86 core? Even Haswell, with its dual 5-cycle FMA units, is going to struggle to find enough independent work to keep them fully occupied with only one thread.
Ever since Intel got burned (possibly literally, if they were ever stupid enough to touch one) by the Prescott Pentium 4, they have followed a design rule: they will not add anything that does not gain more performance than it costs in power. It's hard to see how that rule would have given us Haswell's 4-issue ALUs or dual FMA without SMT to fully utilise them. Both do help single-thread performance, but the rate of return is far lower than what SMT gives. So yes, Intel gave up on single-thread performance long ago, because it doesn't give good returns on the power budget.
The faster, wider (or longer) the pipeline, the more of it goes to waste when you get branch mispredicts or cache misses. SMT is more a technique for hiding pipeline and memory latency (memory-level parallelism) than for sharing CPU resources. That is why Atom had it (in-order cores suffer most from these latencies), it is why most GPUs are scheduled like barrel processors (their memory latency sucks and they lack decent result forwarding), and yes, it is why server CPUs have it. But it is clearly not the sole domain of server code.
But how do YOU suggest Intel use that 5% for something better? I made reasoned arguments that removing SMT is unlikely to lead to Intel increasing clocks or IPC. So really you are looking at low single-digit percentage gains (if anything), or at adding more niche functionality (which is what you questionably accuse SMT of being). MOVHELLOWORLD YMM0, anyone?
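To put a rough number on the FMA point above: two fully pipelined units with a 5-cycle latency need about 2 x 5 = 10 independent FMAs in flight to stay busy. The sketch below is a back-of-envelope model only, not a simulation. It assumes each unit can start one FMA per cycle and that each dependency chain can issue one FMA per latency period; the function names and the chain counts in the usage note are made up for illustration.

```python
def independent_fmas_needed(latency_cycles: int, units: int) -> int:
    """Independent FMAs that must be in flight to saturate the units.

    Assumes each unit is fully pipelined (one new FMA per cycle).
    """
    return latency_cycles * units


def utilisation(independent_chains: int,
                latency_cycles: int = 5,
                units: int = 2) -> float:
    """Fraction of peak FMA throughput reachable with the given number
    of independent dependency chains (idealised: no other stalls)."""
    needed = independent_fmas_needed(latency_cycles, units)
    return min(1.0, independent_chains / needed)
```

With these assumptions, a single thread exposing 4 independent chains gets `utilisation(4)` = 0.4 of peak, while two SMT threads of the same kind sharing the core get `utilisation(8)` = 0.8. The 4-chain figure is hypothetical, but it shows why a second thread is a cheap way to fill units that one thread leaves idle.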