By: someone (someone.delete@this.somewhere.com), November 21, 2010 5:27 pm
Room: Moderated Discussions
Richard Cownie (tich@pobox.com) on 11/21/10 wrote:
---------------------------
>someone (someone@somewhere.com) on 11/21/10 wrote:
>---------------------------
>
>>Sure deep sub micron CMOS leaks. It leaks whether
>>the processor is active or stalled. Other things on the
>>chip run whether the processor is stalled or not that
>>also consume power (PLL, global clock distribution etc).
>>But there is no meaningful energy associated with a
>>stall as an architectural event unless there is a replay
>>trap etc associated with it.
>
>Of course there is. If the stall stops everything for N cycles, and
>the cpu is burning power during that time, then you definitely
>have energy usage. The stall case takes more energy than the non-stall
>case, right ?
Leakage power is dissipated whether or not the CPU
is stalled. Therefore it cannot be attributed to the
stall.
You could try to argue that an OOOE processor has
fewer/shorter stalls and that reduces the amortized
leakage energy per stall. I would counter that the
extra complexity of OOOE means that there are a lot
more logic transistors around to leak and so leakage
power is always higher, whether stalled or not. :-P
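
To put rough numbers on the accounting being argued here, a toy model follows. It is only a sketch under stated assumptions: leakage power is constant over the whole run, stalled cycles burn no dynamic energy (ideal clock gating), and every figure (frequency, watts, joules per cycle, cycle counts) is made up for illustration and not measured from any real core. The function name `run_energy` is hypothetical.

```python
def run_energy(active_cycles, stall_cycles, freq_hz, leak_w, dyn_j_per_active_cycle):
    """Return (leakage_J, dynamic_J, total_J) for one run.

    Assumptions (illustrative only):
      - leakage power leak_w is constant, stalled or not
      - stalled cycles consume no dynamic energy (ideal clock gating)
    """
    seconds = (active_cycles + stall_cycles) / freq_hz
    leakage = leak_w * seconds                      # scales with wall-clock time
    dynamic = dyn_j_per_active_cycle * active_cycles  # scales with useful work
    return leakage, dynamic, leakage + dynamic


# Hypothetical workload: same useful work on both cores.
FREQ_HZ = 2e9
ACTIVE = 1_000_000

# Hypothetical in-order core: more stall cycles, fewer transistors leaking.
in_order = run_energy(ACTIVE, stall_cycles=500_000, freq_hz=FREQ_HZ,
                      leak_w=1.0, dyn_j_per_active_cycle=1.0e-9)

# Hypothetical OOOE core: fewer stall cycles, more leakage and more
# dynamic energy per cycle from the extra scheduling logic.
ooo = run_energy(ACTIVE, stall_cycles=100_000, freq_hz=FREQ_HZ,
                 leak_w=2.0, dyn_j_per_active_cycle=1.5e-9)

for name, (leak, dyn, total) in (("in-order", in_order), ("OOOE", ooo)):
    print(f"{name:9s} leakage={leak:.3e} J  dynamic={dyn:.3e} J  total={total:.3e} J")
```

In this model the leakage power is identical with and without stalls, but the leakage energy grows with run time, so the result depends entirely on the assumed stall counts and leakage ratio. Different made-up numbers can tip the comparison either way.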
>
>>Yeah it sucks that modern workloads can't execute
>>entirely out of L1. Let us know if you figure out a way
>>around it.
>
>*If* you could execute entirely in L1, then static-scheduled in-order
>architectures would probably be a fine idea. Since we can't, OoO
>architectures prevail for most apps, since they cope better with
>unpredictable load latencies. So it sucks; but it sucks a lot more
>for your argument than for mine.
Huh?
We were talking about energy/power of cache misses
and stalls. Feel free to start a different thread about
performance.