By: Patrick Chase (patrickjchase.delete@this.gmail.com), May 17, 2013 9:43 am
Room: Moderated Discussions
RichardC (tich.delete@this.pobox.com) on May 14, 2013 10:38 am wrote:
> Patrick Chase (patrickjchase.delete@this.gmail.com) on May 13, 2013 8:33 pm wrote:
> > RichardC (tich.delete@this.pobox.com) on May 12, 2013 12:36 am wrote:
> > > So tell me what commonly-used client app (or workload) benefits from SMT ?
> >
> > 3D graphics?
> >
> > Flippancy aside, that *is* a ubiquitous client workload that
> > is best served by a heavily multithreaded throughput machine.
>
> It's best served by a rather specialized throughput machine, i.e. a GPU.
> Not an SMT multicore.
Yes, I'm well aware of that, as I develop for GPUs quite a bit. Your original question, however, was worded loosely: it didn't actually stipulate a client workload that runs on the CPU, so it admitted a trivial answer, which I dutifully provided.
> Game benchmarks indicate that there's no significant
> advantage for 4C/8T over 4C/4T with most current games.
True, but completely irrelevant to your argument that gaming workloads can't benefit from TLP. There is a big difference between these two statements:
1. Current applications don't benefit from SMT
2. Client workloads don't benefit from SMT
Benchmarks of current games only address (1), and basically demonstrate that the current crop of game implementations simply doesn't use all that many threads. That does not, however, tell us whether gaming *workloads* as a class intrinsically lack thread-level parallelism (TLP), as you claim. There are two scenarios that are equally consistent with the benchmarks you cite:
1. The workloads intrinsically lack TLP
2. The people who coded the current crop of game engines didn't bother to expose TLP, probably because current cores are more than fast enough to keep up with gameplay at reasonable frame rates, so there is little incentive to multi-thread.
I tend to believe (2) more than (1). As others have pointed out, there is known to be a decent amount of TLP in physics, AI, etc. You might view Intel's continued investment in SMT for client processors as a bet that SW developers will invest in exposing that parallelism, and their continued investment in things like TBB, IPP, CPU-side OpenCL runtimes, etc. as attempts to make that easier for developers.
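To make "exposing TLP" concrete, here is a minimal sketch of a per-frame update split across worker tasks. It uses plain `std::async` rather than TBB, and everything in it (`Entity`, `step_physics`, `step_ai`, `run_frame`, the toy update rules) is hypothetical and not taken from any real engine. It assumes entities can be updated independently within a frame, which real physics and AI only partly satisfy.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical per-entity state, for illustration only.
struct Entity {
    float position = 0.0f;
    float velocity = 1.0f;
    int   ai_state = 0;
};

// Toy physics step for entities in [begin, end): explicit Euler integration.
void step_physics(std::vector<Entity>& world, std::size_t begin,
                  std::size_t end, float dt) {
    for (std::size_t i = begin; i < end; ++i)
        world[i].position += world[i].velocity * dt;
}

// Toy AI step for entities in [begin, end): advance a per-entity counter.
void step_ai(std::vector<Entity>& world, std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i)
        ++world[i].ai_state;
}

// Run one frame with the entities divided into `workers` disjoint slices.
// Each slice is updated by its own task, so no two tasks touch the same
// entity and no locking is needed (an assumption of this sketch).
void run_frame(std::vector<Entity>& world, unsigned workers, float dt) {
    assert(workers > 0);
    const std::size_t n = world.size();
    std::vector<std::future<void>> tasks;
    tasks.reserve(workers);
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = n * w / workers;
        const std::size_t end   = n * (w + 1) / workers;
        tasks.push_back(std::async(std::launch::async,
            [&world, begin, end, dt] {
                step_physics(world, begin, end, dt);
                step_ai(world, begin, end);
            }));
    }
    // Join all slices before the frame is considered complete.
    for (auto& t : tasks) t.get();
}
```

Calling `run_frame(world, 8, dt)` on a 4C/8T part would give the scheduler eight runnable tasks per frame; whether SMT then buys anything depends on how the real per-entity work uses the core, which this sketch says nothing about.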