By: Ricardo B (ricardo.b.delete@this.xxxxx.xx), May 11, 2013 5:55 pm
Room: Moderated Discussions
RichardC (tich.delete@this.pobox.com) on May 11, 2013 4:39 pm wrote:
> Ricardo B (ricardo.b.delete@this.xxxxx.xx) on May 11, 2013 8:07 am wrote:
> > SMT is not really a compromise between client and server, but between application types.
> >
> > Modern OoO CPU cores have massive execution resources to squeeze
> > out every last inch of single thread performance.
>
> Up to a point. But if you take out the extra logic and registers
> needed to support SMT, you'd be able to clock the core a little faster.
> Maybe not *much* faster, but a little. SMT can't possibly be free.
> And some workloads don't benefit from it.
In theory, yes. There are always trade-offs...
>
> > However, lots of software is, by nature, very low on instruction
> > level parallelism and leaves most of those resources unused.
> > On servers, the HTML generation engines (ie, the PHP/ASP/whatever
> > interpreter) are very dependent on branches and pointer chasing.
>
> Servers have lots of parallelism because they have lots of simultaneous
> clients. Client machines don't have a lot of active threads most of
> the time. Desktops are definitely more responsive with support for
> 2 simultaneous hardware threads rather than 1; but beyond that (which
> of course you can do easily these days with a dual or quad-core without
> SMT) the benefit of more hardware threads is questionable.
... but that has nothing to do with clients vs servers.
Saying "SMT is for servers" is a gross oversimplification of the reality of SMT.
E.g., if you have a "server application" which is highly threaded but also high IPC, SMT hurts.
>
> > But on clients, applications like compilers, game physics and AI, etc, also have similar issues.
>
> 99% of desktop/laptop users are not running compilers at all. As for
> gaming, that's also rather a niche, and in any case I'm skeptical about
> whether current game engines show much benefit from running on 4C/8T
> rather than 4C/4T. Latency matters a lot for gaming, and I'm not at
> all sure that 8 slow threads are better than 4 fast ones.
>
> Most desktops/laptops are mostly running web browsers and office apps
> (word processing, spreadsheet etc). Which don't exploit many
> threads very effectively, if at all.
>
> > For this type of application, adding SMT to an OoO core can deliver a
> > very big performance improvement with very little area/speed overhead.
>
> I don't dispute that there are some workloads which benefit greatly
> from SMT. I just don't think many desktop/laptop systems are running
> such workloads frequently.
That line of reasoning is a slippery slope.
One could also argue that most desktops/laptops aren't running CPU-constrained applications anymore.
And if you're going to argue most of these applications aren't multi-threaded, then you're also arguing against multi-core CPUs.
>
> > Which leads to an interesting situation, on the x86 world.
> > The cores with SMT from Intel are also the ones with most execution
> > resources and, by far, best single thread performance.
>
> Right. When Intel makes a big power-hungry core, then a) they target
> it at servers as well as desktops/laptops, because the high margins
> of servers are attractive, so b) they put in SMT, because it's very
> effective for server workloads. But that doesn't prove that SMT is
> the optimal choice for desktop/laptop cpu's.
Your entire line of thought is based on one wrong premise: "SMT is of use for servers but not for clients".
>
> I'm not saying the resulting chips are bad; I'm just saying that it
> would be really interesting to see what Intel's architects could
> deliver if they made a 4C/4T desktop chip without worrying about
> server workloads.
You could argue that.
But you could also argue that, since many client applications aren't multi-threaded, it would be interesting to see what Intel could do if they designed a CPU with just 1 or 2 cores under the same power budget as the current 4-core ones.
But that's a skewed view; you're missing the global picture.
This is the situation in terms of client CPUs:
1. Client CPUs need to have fat cores. They need these fat cores to provide the maximum performance on single-threaded applications, which are still a very important use case for clients, more so than on servers.
2. At the same time, clients have an increasing number of those fat cores, because single-thread improvements have become very hard. Exploiting those multiple cores obviously requires multi-threaded software.
So, in summary, clients want CPUs with multiple fat cores and multi-threaded software to exploit them.
But lots of client (or not) software is actually low IPC and makes poor use of the fat cores.
Therefore, if you have a CPU with multiple fat cores... adding SMT is a no-brainer that will benefit more use cases than not.
Clients or servers.
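To put rough numbers on the low IPC vs high IPC point, here is a back-of-the-envelope sketch. It is a toy model only: the issue width of 4, the 10% sharing penalty and the per-thread IPC values are numbers I made up for illustration, and the function names are hypothetical. It just assumes two threads on one core share issue slots up to the core's width, and compares that against the same two threads taking turns on the core without SMT.

```python
# Toy model only. The issue width, the SMT sharing penalty and the IPC
# values used below are illustrative assumptions, not measurements.

def smt_throughput(ipc_a, ipc_b, issue_width=4.0, smt_overhead=0.10):
    """Combined IPC of two threads sharing one core under SMT.

    Assumes the threads fill each other's unused issue slots, capped at
    the core's issue width, and that sharing costs a fixed fraction
    (smt_overhead) due to contention for caches and other resources.
    """
    return min(issue_width, ipc_a + ipc_b) * (1.0 - smt_overhead)


def smt_speedup(ipc, issue_width=4.0, smt_overhead=0.10):
    """Throughput of two identical threads with SMT, relative to the
    same two threads time-sharing the core without SMT (which sustains
    just `ipc` overall)."""
    return smt_throughput(ipc, ipc, issue_width, smt_overhead) / ipc


if __name__ == "__main__":
    # Branchy, pointer-chasing code: low IPC, lots of idle issue slots.
    print("low-IPC threads :", round(smt_speedup(0.8), 2))
    # Code that already nearly saturates the core on its own.
    print("high-IPC threads:", round(smt_speedup(3.8), 2))
```

Under these assumptions the low IPC pair gets a large gain while the high IPC pair ends up slightly below 1.0, i.e. SMT hurts. The real numbers depend entirely on the core and the workload; the sketch only shows why the split is by application type and not by client vs server.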