By: Heikki Kultala (hkultala.delete@this.sefi.fi), May 12, 2013 1:02 am
Room: Moderated Discussions
> > But on clients, applications like compilers, game physics and AI, etc, also have similar issues.
>
> 99% of desktop/laptop users are not running compilers at all. As for
> gaming, that's also rather a niche, and in any case I'm skeptical about
> whether current game engines show much benefit from running on 4C/8T
> rather than 4C/4T. Latency matters a lot for gaming, and I'm not at
> all sure that 8 slow threads are better than 4 fast ones.
Everybody who plays 3D games is running a compiler. The drivers of the 3D chips compile high-level shader code into the assembly language of the GPU.
For gaming, 4 fast threads are better than 8 slow ones. But with simultaneous multi-threading (SMT), you more or less get both. When only 4 threads are active, they execute very quickly, and when another 4 threads appear, they add extra performance.
Without SMT, running a fifth thread requires a context switch, which causes a major slowdown.
> Most desktops/laptops are most running web browsers and office apps
> (word processing, spreadsheet etc). Which don't exploit many
> threads very effectively, if at all.
Yes. That's why a small number of big cores (with multi-threading) is much better for them than a larger number of small cores.
> > For this type of applications, adding SMT to an OoO core can deliver a
> > very big performance improvement with very little area/speed overhead.
>
> I don't dispute that there are some workloads which benefit greatly
> from SMT. I just don't think many desktop/laptop systems are running
> such workloads frequently.
Often the situation might also be that the user has a 2-core processor and is running a program which uses two threads. Then some background task needs some CPU time. Without multi-threading, a slow context switch is needed and one of the important threads halts completely. With multi-threading, it only slows down slightly.
> > Which leads to an interesting situation, on the x86 world.
> > The cores with SMT from Intel are also the ones with most execution
> > resources and, by far, best single thread performance.
>
> Right. When Intel makes a big power-hungry core, then a) they target
> it at servers as well as desktops/laptops, because the high margins
> of servers are attractive,
Wrong. For servers, a large number of small cores and big caches are much better.
> so b) they put in SMT, because it's very
> effective for server workloads. But that doesn't prove that SMT is
> the optimal choice for desktop/laptop cpu's.
No, they put in SMT because it's an almost free performance improvement, for ALL markets.
> I'm not saying the resulting chips are bad; I'm just saying that it
> would be really interesting to see what Intel's architects could
> deliver if they made a 4C/4T desktop chip without worrying about
> server workloads.
They would get a chip that has roughly 2% better single-thread performance, but 25% worse multi-threaded performance.
A big core with SMT is the best solution for workloads which are mainly single-threaded but sometimes multi-threaded.
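
To put rough numbers on that trade-off, here is a minimal Python sketch. It takes the 2% and 25% estimates above at face value (they are ballpark guesses, not measurements) and assumes, as a simplification, that overall performance can be weighted by the share of time a workload spends in multi-threaded phases. The function names and the `multi_fraction` parameter are made up for illustration.

```python
# Relative performance figures taken from the estimate above (SMT chip = 1.0).
NO_SMT_SINGLE = 1.02   # ~2% faster single-threaded without SMT
NO_SMT_MULTI = 0.75    # ~25% slower multi-threaded without SMT


def no_smt_relative_perf(multi_fraction: float) -> float:
    """Time-weighted performance of the no-SMT chip relative to the SMT chip.

    multi_fraction is the (hypothetical) share of time the workload spends
    in multi-threaded phases; weighting by time is a simplifying assumption.
    """
    if not 0.0 <= multi_fraction <= 1.0:
        raise ValueError("multi_fraction must be between 0 and 1")
    return (1.0 - multi_fraction) * NO_SMT_SINGLE + multi_fraction * NO_SMT_MULTI


def break_even_fraction() -> float:
    """Multi-threaded share at which both chips perform the same."""
    return (NO_SMT_SINGLE - 1.0) / (NO_SMT_SINGLE - NO_SMT_MULTI)


if __name__ == "__main__":
    print(f"break-even multi-threaded share: {break_even_fraction():.3f}")
    for f in (0.0, 0.05, 0.10, 0.25, 0.50):
        print(f"multi share {f:.2f}: no-SMT chip at {no_smt_relative_perf(f):.3f}x")
```

Under these assumptions the break-even point is 0.02 / 0.27, or about 7% of the time spent multi-threaded. Above that share, the chip with SMT comes out ahead.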