By: --- (---.delete@this.redheron.com), June 2, 2022 1:06 pm
Room: Moderated Discussions
Eric Fink (eric.delete.delete@this.this.anon.com) on June 2, 2022 5:43 am wrote:
> Anon (no.delete@this.spam.com) on June 2, 2022 12:35 am wrote:
>
> > Apple is using TSMC 5nm while Intel is using 10nm which they call 7nm
> > and AMD uses TSMC's 7nm, and both Intel and AMD support SMT.
> >
> > So, yes, Apple achieves almost the same performance at much lower power, but because they have power
> > advantage and Intel and AMD are willing to use A LOT of extra power to get 20% single thread.
> >
> > Zen 4 will be more apples-to-Apple comparison, at least on throughput, where perf per watt is what matters.
>
> That is the reply often given but I don't find it convincing. A14/M1 is not the only product at
> 5nm and yet its peak performance and perf/watt so far are unmatched. Notebookcheck recently did
> a series of benchmarks comparing the efficiency and performance of latest CPUs at locked TDP,
> and a 5nm Firestorm at 4W outperformed a Zen 3+ at 9.5W — that's more than 2x difference in
> efficiency, and this is in a benchmark that maximally favours x86 as it runs a suboptimal code
> path on M1. I have a hard time believing that TSMC's 5nm has some kind of magical properties that
> allow a vendor to reduce the power consumption by 2x at the same performance level.
>
> Your other argument — that x86 vendors trade some of the inherent efficiency to get this extra 20% performance
> has more merit IMO, but still doesn't provide a satisfactory explanation. First of all, AMD Zen3 isn't any faster
> than Firestorm, except maybe in a handful of AVX2 SIMD throughput tests where its higher clock lets it pull
> ahead. Intel is a bit different, since they generally seem fine with making power-hungry cores if they can get
> ahead in performance (in the test linked above Golden Cove is 20% faster than Firestorm — with a whopping 6x
> higher power consumption!). But to offset this, Intel is now adding throughput/efficiency cores, which do exactly
> what you are talking about — trade peak performance for much lower power usage. And yet, when you compare Alder
> Lake E-cores to the P-cores, the former are around 40% slower at roughly 2.5x lower power consumption (SPEC2017,
> Anandtech). In contrast, Apple's Firestorm is around 10-15% slower than Golden Cove at average 5x lower power
> consumption. I mean, there is a gap between Intel's 10nm and TSMC's 5nm, but it just isn't that much of a gap.
> If it were just "Apple trades peak performance for better efficiency", I'd expect them to be 20-30% slower with
> 2-3x lower power consumption, but they somehow do significantly better than that.
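To put the quoted figures side by side, here is a rough sketch of the perf/watt arithmetic. It uses only the numbers stated in the quote above (no new measurements), takes Golden Cove as the baseline, and the helper name `perf_per_watt_gain` is just an illustrative choice.

```python
def perf_per_watt_gain(relative_perf, power_reduction):
    """Perf/W relative to a baseline core.

    relative_perf:   performance as a fraction of the baseline (0.60 = 40% slower)
    power_reduction: how many times less power is drawn than the baseline
    """
    return relative_perf * power_reduction


# Alder Lake E-core vs P-core: ~40% slower at ~2.5x lower power.
e_core = perf_per_watt_gain(0.60, 2.5)

# Firestorm vs Golden Cove: ~10-15% slower at ~5x lower power.
firestorm_low = perf_per_watt_gain(0.85, 5.0)
firestorm_high = perf_per_watt_gain(0.90, 5.0)

# The "just trading peak performance" expectation: 20-30% slower at 2-3x lower power.
expected_low = perf_per_watt_gain(0.70, 2.0)
expected_high = perf_per_watt_gain(0.80, 3.0)

# Notebookcheck case: Firestorm at 4W outperforms Zen 3+ at 9.5W, so the
# efficiency ratio is at least the power ratio (assuming merely equal performance).
notebookcheck_floor = 9.5 / 4.0

print(f"E-core vs P-core:        {e_core:.2f}x perf/W")
print(f"Firestorm vs Golden Cove: {firestorm_low:.2f}x to {firestorm_high:.2f}x perf/W")
print(f"Expected from a tradeoff: {expected_low:.2f}x to {expected_high:.2f}x perf/W")
print(f"Firestorm vs Zen 3+:      at least {notebookcheck_floor:.3f}x perf/W")
```

On those figures the E-cores come out at about 1.5x the perf/W of the P-cores, the expected tradeoff range is about 1.4x to 2.4x, and Firestorm lands at about 4.25x to 4.5x, which is the gap the quoted post is pointing at.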
The problem is ill-posed because there is no abstract x86 ideal to be compared with an abstract ARMv9 ideal; there are only implementations. Implementations created by companies with very different incentives. Of course one dimension of incentives is prioritizing power over GHz, but even more important is the same fight that lies behind every RWT argument -- how much do you privilege the past over the future?
The argument that's actually playing out is not x86 vs ARM, it is "reboil the ocean every two decades" vs "perpetual compatibility".
At the end of the day, once Apple passes Intel, you'll see this in full force. It will no longer matter whether Apple is 1% faster than Intel's best or 30% faster, because "Intel runs the apps I want, and Apple doesn't". Which may even be true -- but it shows how silly the argument, and the "evidence" provided for it right now, is.
Intel could doubtless do somewhat better if they gave up some compatibility, and a lot better if they abandoned all compatibility. But they won't do that. So...
And this means the whole package. It's not just ISA, it's memory model, it's IO model, it's cache protocol and locking primitives, it's socketed DRAMs, etc etc. Of COURSE all that stuff costs; if it didn't, Apple (mostly free to use whatever they want) would be copying it instead of doing something different.
Ultimately the question being asked is "could Intel produce faster chips if they changed everything while also keeping everything the same?" Well, uh, ???
> I will also be very curious to see how Zen4 performs in comparison. Given the less than
> perfect information we have available, it is really difficult to ascertain how much
> influence can be attributed to the process, to the ISA, to the design philosophy or
> maybe just the elusive "magic sauce" that individual vendors bring to the table.
>
>