By: Paul A. Clayton (paaronclayton.delete@this.gmail.com), June 1, 2013 10:11 am
Room: Moderated Discussions
rwessel (robertwessel.delete@this.yahoo.com) on May 31, 2013 9:02 pm wrote:
> Sebastian Soeiro (sebastian_2896.delete@this.hotmail.com) on May 31, 2013 6:59 pm wrote:
[snip]
>> Just one last little thing; why is this only done for two threads and not a variable amount
>> of threads? Was it done for some sort of balancing; that having more than two threads would
>> create way too much stress on the resources of one core? So they chose two as a middle ground?
>
>
> It depends on the expected workloads, and the rest of the CPU's design.
>
> All of Intel's x86 SMT implementations have been two thread, and I think all of the IPF ones have
> been two thread as well, although I could be misremembering Poulson's specs. OTOH, both Sun/Oracle
> (UltraSPARC T1 with four, T2 with eight) and IBM (POWER7 - four) have implemented larger numbers.
Just to quibble a bit: Itanium uses Switch-on-Event MultiThreading and UltraSPARC T1 used Fine-Grained MultiThreading (IIRC), so neither is SMT in the strict sense. Both T2 and POWER7 (in SMT4 mode) "cheat" a bit by exploiting clustering/partitioning; the full resources of the core are not available to any single thread.
> Supporting higher numbers is not free; each active context must (approximately) duplicate the entire
> architected state of the processor.
This is part of what makes POWER7's SMT4 mode kind of neat. (Boasting: I thought of this technique independently just over 3 years ago, though presumably several years later than IBM did.) The register file is already duplicated to reduce the number of read ports each copy needs. SMT4 mode exploits that duplication by letting the copies hold different threads' state instead of mirroring each other, which doubles the number of threads without doubling the number of register file entries. The cost is that a thread can only use the execution resources attached to its own copy. Halving the potential (but not actual) instruction-level parallelism this way can be an acceptable penalty for low-ILP workloads.
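To make the arithmetic concrete, here is a rough Python sketch of the trade-off. The names (CoreConfig, thread_view) and the numbers (128 entries per copy, 4 execution units) are made up for illustration and are not POWER7's actual parameters. The sketch only counts register entries and reachable execution units per thread; it does not model renaming or issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreConfig:
    # All values are hypothetical, chosen only to keep the arithmetic easy.
    entries_per_copy: int  # entries in each register file copy
    copies: int            # copies kept to cut the read ports needed per copy
    exec_units: int        # execution units, split evenly across the copies


def thread_view(cfg: CoreConfig, threads: int, partitioned: bool) -> tuple[int, int]:
    """Return (register entries per thread, execution units a thread can reach)."""
    if partitioned:
        # Each copy holds a different subset of threads, so a thread can only
        # feed the execution units wired to its own copy.
        if threads % cfg.copies != 0:
            raise ValueError("threads must divide evenly across the copies")
        threads_per_copy = threads // cfg.copies
        return cfg.entries_per_copy // threads_per_copy, cfg.exec_units // cfg.copies
    # Mirrored: every copy holds every thread's state, so any thread can use
    # any execution unit, but all threads share one copy's worth of entries.
    return cfg.entries_per_copy // threads, cfg.exec_units


cfg = CoreConfig(entries_per_copy=128, copies=2, exec_units=4)
print(thread_view(cfg, 2, partitioned=False))  # (64, 4): two threads, mirrored
print(thread_view(cfg, 4, partitioned=True))   # (64, 2): four threads, partitioned
print(thread_view(cfg, 4, partitioned=False))  # (32, 4): four threads, mirrored
```

With these assumed numbers, going from two mirrored threads to four partitioned threads keeps 64 entries per thread but halves the execution units each thread can reach, whereas four mirrored threads would keep all the units and halve the entries instead.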