By: Patrick Chase (patrickjchase.delete@this.gmail.com), May 20, 2013 9:26 pm
Room: Moderated Discussions
Exophase (exophase.delete@this.gmail.com) on May 20, 2013 6:06 pm wrote:
> David Kanter (dkanter.delete@this.realworldtech.com) on May 20, 2013 5:30 pm wrote:
> > That's not actually true. People watch movies on tablets, all the time.
> > Some are feature length, some are just 30 minute TV shows, and some are 0.5-5
> > minute youtube clips. Those are all fair workloads.
>
> Which is usually accelerated by dedicated decode hardware, driving CPU requirements
> down to a very small figure - even for 1080p. Except in cases where there's
> a software problem or the codec is unsupported, which is unusual.
True.
> Sure, you could say that if you need 5-10% of the CPU constantly it matters that the
> power consumption is low, but even a relatively power hungry mobile CPU will use a lot
> less power than the decode hardware, display, and (in the case of streaming video)
> wifi. Unless it has totally broken power scaling.
This one is more interesting, in that it may or may not be true depending on the relationship between static (leakage) and dynamic power. On TSMC's 40 nm and initial 28 nm (non-LP/HPM) processes, leakage could easily approach 50% of total dissipation. This meant that "always-on" blocks like caches and OoO state structures would dissipate significant power even while doing nothing, so idling along at 5-10% CPU load wasn't such a great thing. That's why big.LITTLE looked like such a big win: AFAIK it was the only (mostly) SW-transparent technique that let you power down your big, high-performance cores in their entirety.
If you look at 28LP/HPM (the high-K 28 nm flavors), the tradeoffs appear to change rather significantly. Leakage power is now down by an order of magnitude at constant performance, and the ratio of dynamic to static power is about 5X greater (working from memory here, so these may not be exact). A big core with really good DVFS and clock-gating is therefore a vastly more competitive option, since the relative benefit of power-gating all of those transistors is about a fifth of what it used to be.
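To put rough numbers on that, here is a back-of-envelope sketch in Python. Everything in it is illustrative rather than measured: the 1:1 starting ratio is just the "leakage approaches 50%" case, the 5X factor is the from-memory figure above, and the function names are made up for this example. The light-load function additionally assumes that dynamic power scales linearly with utilization at fixed voltage and frequency (no DVFS), which is a simplification.

```python
# Back-of-envelope model of leakage vs. dynamic power.
# All figures are illustrative assumptions, not measurements.

def leakage_share(dynamic_to_static):
    """Fraction of total power that is leakage at full load."""
    return 1.0 / (1.0 + dynamic_to_static)

def leakage_share_at_load(dynamic_to_static, utilization):
    """Leakage fraction at partial load.

    Assumes dynamic power scales linearly with utilization
    (fixed V/f, no DVFS) while leakage stays constant.
    """
    return 1.0 / (1.0 + dynamic_to_static * utilization)

def gating_benefit(dynamic_to_static):
    """Leakage removed by power-gating, per unit of dynamic power."""
    return 1.0 / dynamic_to_static

OLD_RATIO = 1.0              # leakage ~50% of total => dynamic:static ~1:1
NEW_RATIO = 5.0 * OLD_RATIO  # "about 5X greater" on the high-K flavors

old_share = leakage_share(OLD_RATIO)
new_share = leakage_share(NEW_RATIO)
old_light = leakage_share_at_load(OLD_RATIO, 0.10)
new_light = leakage_share_at_load(NEW_RATIO, 0.10)
relative_benefit = gating_benefit(NEW_RATIO) / gating_benefit(OLD_RATIO)

print(f"leakage share, full load: old {old_share:.0%}, new {new_share:.0%}")
print(f"leakage share, 10% load:  old {old_light:.0%}, new {new_light:.0%}")
print(f"power-gating benefit relative to old process: {relative_benefit:.2f}")
```

Under those assumptions leakage still dominates at light load on either process, but the payoff from gating the whole core, measured against the dynamic power you're spending anyway, drops to about a fifth.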
This may explain Qualcomm's focus on dynamic power reduction techniques in Krait, and their choice not to pursue big.LITTLE and n+1 designs. Given the way the process technologies have played out (and now that they're on HKMG), that seems to have been a very wise bet. Intel has been on HKMG all along and has (perhaps unsurprisingly) pursued exactly the same strategy with Saltwell and now Silvermont.