By: mpx (mpx.delete@this.nomail.pl), May 12, 2013 6:13 pm
Room: Moderated Discussions
Static partitioning of some resources (buffers, queues, etc., but not execution units) among the threads was done on pre-Ivy Bridge Intel CPUs.
Modes are something Power 7 does: it has non-SMT, SMT2 and SMT4 modes. In an SMT mode it behaves like double the number of processors, but each with lower IPC per thread.
IBM Documentation:
"In SMT2 mode threads 0 and 1 share the pipelines. In SMT4 mode threads 0 and 1 share FX0, LS0, threads 2 and 3 share FX1 and LS1, threads 0 and 2 share VS0, threads 1 and 3 share VS1 and all threads share the BRX and CRL pipes."
SPARC T4 assigns resources to threads dynamically, but with the ability to activate/deactivate them. Then there's this Critical Thread API that
Oracle documentation:
"SPARC T4 is dynamically threaded. While software can activate up to eight strands on each core at a time, hardware dynamically and seamlessly allocates core resources [...] These resources are allocated among the active strands. Software activates strands by sending an interrupt to a HALTed strand. Software deactivates strands by executing a HALT instruction on each strand that is to be deactivated. [...] If software effectively
halts all strands except one on a core via Critical Thread Optimization [...] the core devotes all of its resources to the sole running strand. [...] Similarly, if software declares six out of eight strands as non-critical, the two active
strands share the core execution resources."
It just means that if you have a piece of code that was written for a high-IPC, low-thread-count machine and you run it on SPARC T4, then by default each thread is going to run at something approaching 1/8 of the core's resources, i.e. slow execution. You have to explicitly set the priority of such a process or thread to a high level to activate Critical Thread Optimization. It's not much work, but it has to be done explicitly.
Now, when it comes to SPARC T4, it's the only general-purpose 8-way SMT out there. It was designed for servers, where going high-SMT by default makes sense. Perhaps in smartphones or desktops a high-SMT CPU should start in a 1-way configuration by default, with the other hardware threads only activated at the request of software?
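As a back-of-the-envelope illustration of the 1/8 figure, assume that a core's resources are split evenly among whatever strands are currently active. That even split is my own simplification, not something the Oracle text states (the hardware allocates dynamically), and the function name `naive_share` is hypothetical.

```python
from fractions import Fraction

MAX_STRANDS = 8  # strands per SPARC T4 core, per the Oracle quote


def naive_share(active_strands):
    """Per-strand fraction of core resources, ASSUMING an even split.

    This is a deliberately crude model; the real core allocates
    resources dynamically among the active strands.
    """
    if not 1 <= active_strands <= MAX_STRANDS:
        raise ValueError("active_strands must be between 1 and 8")
    return Fraction(1, active_strands)


if __name__ == "__main__":
    # All strands active, two critical strands left, one critical strand left.
    for n in (8, 2, 1):
        print(n, "active strand(s):", naive_share(n), "of the core each")
```

In this crude model, halting seven of the eight strands moves a thread from 1/8 of the core to the whole core, which is the case the Critical Thread Optimization quote describes.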