By: Dummond D. Slow (mental.delete@this.protozoa.us), November 18, 2020 8:52 am
Room: Moderated Discussions
Doug S (foo.delete@this.bar.bar) on November 18, 2020 7:44 am wrote:
> rwessel (rwessel.delete@this.yahoo.com) on November 18, 2020 7:03 am wrote:
> > Ignoring whether or not it makes sense for Apple to implement
> > SMT, there appear to be a reasonable number of
> > workloads that do work well with SMT. Commercial workloads,
> > things like database engines, which net out to quite
> > low effective IPCs compared to much of what's in SPEC. Hence IBM's fascination with SMT on POWER and Z.
>
>
> Which are exactly the loads Apple doesn't care about since
> unlike Intel and AMD they don't sell commercial servers.
>
> Which brings you back to the question that matters for Apple: How well SMT would do with
> workloads that their customers will run, especially the sort of stuff they do with phones
> and tablets (since that's what funds their leading edge core designs, not Macs)
>
> While it might not be too hard to make an argument that SMT would benefit the Mac Pro, that's well
> under 1% of Apple's unit sales of phones/tablets/Macs. Given that the area penalty for adding SMT
> to the big cores is minimal I suppose that doesn't matter - just leave it disabled on the mobile
> stuff. But you're still incurring the design/verification cost (and potential additional support
> and reputational cost for security holes that SMT opens) for that relatively small market.
>
> Moreover, do you get more "extra bang for the buck" spending power running a second thread
> on the big cores via SMT, or bringing the little cores into the mix when you want to add
> threads? Given what Andrei's figures show for the power draw of the M1's little cores,
> I'm not convinced SMT is a net win power wise.
>
I'm pretty sure the extra power spent on SMT is going to be proportionally smaller than the performance boost. No data at hand though. You could take an AMD CPU, disable SMT, and compare the performance in heavily MT tasks. Their chips pretty much always try to hit their PPT limit because of the open-loop opportunistic clock control, so the power consumption tends to be the same under load either way.
If performance is better with SMT at that same power draw, it shows that SMT gives better power efficiency. I expect it to end up like that unless you cherry-pick some task that has negative scaling.
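To make the comparison concrete, here is a rough sketch of the arithmetic I have in mind. The function names and the numbers in the example are made up for illustration, not measurements; you would plug in your own benchmark scores and package power readings from runs with SMT off and on.

```python
def perf_per_watt(score: float, package_watts: float) -> float:
    """Throughput score divided by average package power during the run."""
    if package_watts <= 0:
        raise ValueError("package_watts must be positive")
    return score / package_watts


def smt_efficiency_gain(score_off: float, watts_off: float,
                        score_on: float, watts_on: float) -> float:
    """Fractional change in perf/W from enabling SMT (0.10 means +10%)."""
    off = perf_per_watt(score_off, watts_off)
    on = perf_per_watt(score_on, watts_on)
    return on / off - 1.0


if __name__ == "__main__":
    # Placeholder inputs only: same power in both runs (chip pinned at its
    # power limit), hypothetical 25% higher score with SMT enabled.
    gain = smt_efficiency_gain(score_off=100.0, watts_off=100.0,
                               score_on=125.0, watts_on=100.0)
    print(f"perf/W change with SMT on: {gain:+.1%}")
```

If the chip really does sit at the same power limit in both runs, the watts cancel and the efficiency gain is just the score ratio. The function still takes both power readings in case the two runs do not land on exactly the same draw.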
>
> If so, then the only argument is that 4 big cores WITH SMT along with 4 little cores is faster
> than 4 big cores without SMT and 4 little cores. Which is obvious, but then you could add 4
> more little cores, or one more big core, to reach that level of performance without all the
> extra design/verification/security headaches of SMT. Before someone says "but that costs more
> because the die is bigger", yes, but we're talking about increasing the M1's die size maybe
> 3% - and for something that will benefit ALL MT loads, not just the ones that are helped by SMT.