By: ajensen (.delete@this..), November 22, 2010 3:46 am
Room: Moderated Discussions
anon (anon@anon.com) on 11/22/10 wrote:
---------------------------
>ajensen (@.) on 11/22/10 wrote:
>---------------------------
>>Ok so there has been a lot of discussion about whether Poulson will be/is/should
>>have been an OoOE design. I'm sure everyone here agrees that the current McKinley
>>through Tukwila pipeline has its issues (or was designed on the basis of the epic scrolls
>>of the dark lord himself, if you will).
>>
>>So if IPF would go in the direction of a more dynamic design which is more likely?:
>>
>>1) run-ahead
>>2) advanced run-ahead (like the multipass pipeline)
>
>IPF gets pretty good IPC without OOOE, provided they can keep IPC up with a couple
>more cycles of L1 latency. Its biggest weakness is cache misses. If IBM thought
>it was worth doing with POWER6, surely it must be seriously considered for IPF?
Agreed. IMO run-ahead or, even better, multipass is the correct way to go with IA64. It would give the best performance/thread without going to a full-blown Tomasulo OoOE engine, which will be bad for throughput/watt no matter what ISA you use...
>
>>3) Scoreboarding?? Perhaps viable if less state needs to be tracked
>>4) Tomasulo engine "distributed scoreboarding"
>>5) Hiding cache misses within one thread is so 2010, fine grained MT or very dynamic SoEMT will do the trick.
>
>Massively MT to hide poor single threaded performance is soo Niagara/GPU :)
>
>Single threaded performance actually becomes more important as it gets harder to
>beat Amdahl's law with ever increasing numbers of threads. We're talking about thousands
>of threads in high end servers...
Yes, agreed again! I fear that Intel will play it safe and do something like fine-grained MT, because it is the simplest path to high throughput/watt. It will be bad for many real-world workloads, though. As you said, Amdahl...
Fine-grained MT will not be a disaster for the first few years, but it is not the long-term solution because it will kill perf/thread.
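To put rough numbers on the Amdahl point, here is a quick back-of-the-envelope sketch. The 5% serial fraction and the 0.5x per-thread speed are purely illustrative assumptions, not measurements of any IPF part, and the function names are made up for this post.

```python
def amdahl_speedup(serial_fraction: float, threads: int) -> float:
    """Ideal speedup over one thread under Amdahl's law."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)


def speedup_with_thread_perf(serial_fraction: float, threads: int,
                             per_thread_perf: float) -> float:
    """Same model, but every thread runs at per_thread_perf times the
    speed of the baseline core (1.0 = same speed, 0.5 = half speed)."""
    return per_thread_perf * amdahl_speedup(serial_fraction, threads)


if __name__ == "__main__":
    serial = 0.05  # assumed: 5% of the work cannot be parallelised
    for n in (8, 64, 1000):
        fast = speedup_with_thread_perf(serial, n, 1.0)
        slow = speedup_with_thread_perf(serial, n, 0.5)  # assumed MT-heavy core
        print(f"{n:5d} threads: full-speed threads {fast:6.2f}x, "
              f"half-speed threads {slow:6.2f}x")
```

With any nonzero serial fraction the speedup is capped at 1/serial_fraction however many threads you add, so in this simple model whatever you give up in perf/thread comes straight off that ceiling.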