By: juanrga (nospam.delete@this.juanrga.com), August 10, 2014 5:34 am
Room: Moderated Discussions
Linus Torvalds (torvalds.delete@this.linux-foundation.org) on August 9, 2014 1:36 pm wrote:
> Brett (ggtgp.delete@this.yahoo.com) on August 9, 2014 11:51 am wrote:
> >
> > For those that are designing 40 watt ARM64 chips there are no legacy 32 bit apps to support
> > in the laptop/desktop/server space.
>
> Christ, you people.
>
> You can't have it both ways. You try to argue that "ISA matters a lot for decoding", but
> then at the same time you try to argue that "ISA doesn't matter at all for users".
>
> The fact that there are no legacy apps is a problem, not a feature. It means that the platform
> has no testing, that applications are few and hard to find, and that applications and libraries
> are raw, untested and likely have more bugs. In this case is also means that there are no
> actual performance numbers to back up your ridiculous and unlikely claims.
>
> Who do you think buys the resulting shit?
>
> Seriously?
>
> Do you seriously believe that the alleged performance and power advantage (and yes, it very much is alleged
> - I would even go as far as call it "drug-induced hallucinations" - so far nobody has come even close to
> Intel in the space you claim is so ripe for ARM64 in either performance or price) are so massive that users
> should/would ignore the fact that the break in ISA also causes a lot of real and inevitable problems?
>
> Guys, get a f*cking grip. It's quite clear that ISA matters at the low end, because when you are counting
> transistors, decode complexity really shows up. But only crazy and deluded people think that it all that
> noticeable in the server space. Which is not to say that such crazy and deluded people don't exist - the
> failed companies that were build up around that concept certainly show that such people do exist - but
> you need to spend a few seconds asking yourself whether you want to be counted in that group.
>
> Even if ISA complexity is noticeable in that space (and as mentioned, Intel has so far a pretty damn good record
> of showing that the x86 ISA isn't a problem, since it has successfully killed off every single competing RISC/EPIC/insert-crazy-idea-here
> architecture), why the hell do you then think that nothing else matters?
>
> ARM64 isn't that magical. MIPS and alpha were there before it with pretty similar "simple
> decoding". You're repeating arguments that didn't make sense the last time around, and that
> have been soundly disproven in that thing we call "the real world" (tm) or "the market".
>
> I think ARM has a chance, but the arguments for it in this thread have been absolutely moronic. The
> performance claims about how much better ARM64 is over ARM32 seem to be based on Geekbench, for chrissake.
> Some of the other arguments have been about how relatively quickly ARM has improved, which is largely
> based on the fact that Cortex-A9 was a complete and utter disaster particularly from an uncore standpoint,
> so when people bandy about "40% improvement" and compare that to how Haswell didn't make as much of
> a difference, they are clearly not understanding how the ARM baseline was crap.
>
> Even now, people seem to think that Apple A7 is somehow a high-performance chip. It's not really
> all that impressive, and again, the whole "look at how great it is" seems to be based almost entirely
> on pure crap (geekbench). And absolutely none of that is relevant to the server market.
>
> No, if ARM has a chance, it's not because it will outperform intel server chips
> (I can pretty much guarantee it won't), it is because of other market forces.
>
> For example, there are a lot of customers who want to make sure that they have alternatives,
> and are worried about the fact that Intel is so crushingly dominant. Those customers don't
> necessarily care about ARM at all, just go back a few months to look at the POWER8 motherboard
> news etc, but that "we really worry about a monoculture" is very much a real issue.
>
> Similarly, there are a lot of chip companies that want to get a part of the market, and if you don't have
> the resources of Intel, it's really hard to compete in the x86 space. Because that complexity may not
> be a huge performance issue in the end, but it does mean that there is a fairly high bar of entry.
>
> So there are reasons for ARM to be successful. We saw it in the mobile space: the licensing
> was a big boon there, and resulted in the proliferation of infrastructure around ARM.
> But it really isn't about performance or even power. It's about other market issues.
>
> And if you really think that the decoder is the only part that is complicated, boy have I got a
> bridge to sell you. A high-performance IO subsystem isn't simple either. Yes, it's all PCIe these
> days, but go talk to hardware people about how easy it's to do high-performance PCIe, and I suspect
> they'll laugh at you. Or the memory subsystem. Or a good high-performance SMP fabric.
>
> And no, those aren't exactly "small details" in a server environment. That whole "uncore"
> that makes sure that you can efficiently receive network packets that get DMA'd to memory
> at not-cacheline-aligned boundaries? Not simple either. Or just getting multi-socket interrupt
> controllers that work well, and can both spread things out and steer things properly? Yeah,
> there's a lot of work there, both in hardware and in all that system code.
>
> And that's all just outside the core itself. Even inside the core, the instruction decoding
> is just one detail in the end. And a detail that you can tweak - like spending a lot of effort
> on the branch predictor, so your front end runs ahead better, and so that an extra cycle of
> decoding latency isn't so noticeable. Or perhaps have a separate decoded uop cache etc.
>
> And then when you make such an absolutely ridiculously huge deal about instruction
> decoding, you at the same time entirely ignore why the customer, who is the
> one paying for it all in the end, might care about backwards compatibility.
>
> Really?
>
> So I'm dead serious when I say that anybody who talks about the alleged huge advantages of simple decoding,
> but then blithely ignores the advantages of backwards compatibility, is a f*cking moron.
>
> And there's a lot of those f*cking morons in this thread. Some of them double down on their stupidity
> by claiming that it's a big advantage that you can jettison all that legacy baggage.
>
> Linus "rant over" Torvalds
>
Evidently the x86 tax is more noticeable on small phone-like cores, but the tax doesn't magically vanish for big cores; it only shrinks by a factor of about 2x or 3x. This is the reason why a 90W ARM SoC is able to offer 80--90% of the performance of a 140W Haswell Xeon. It is not because those companies have alien engineers working on the microarchitecture. It is not because they rely on an advanced SOI foundry process beyond the reach of Intel's bulk process. It is the ISA advantage.
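To put that claim in concrete terms, here is a back-of-envelope sketch in Python of what "80--90% of the performance at 90W vs. 140W" works out to in performance per watt. The figures are just the rough numbers quoted above, treated as assumptions rather than measurements:

# Back-of-envelope performance-per-watt comparison.
# All figures are the rough numbers quoted above, not measurements.

xeon_perf = 1.00                        # normalize the Haswell Xeon to 1.0
xeon_tdp = 140.0                        # watts

arm_perf_lo, arm_perf_hi = 0.80, 0.90   # 80--90% of the Xeon's performance
arm_tdp = 90.0                          # watts

xeon_ppw = xeon_perf / xeon_tdp
arm_ppw_lo = arm_perf_lo / arm_tdp
arm_ppw_hi = arm_perf_hi / arm_tdp

print(f"Xeon:    {xeon_ppw:.4f} perf/W")
print(f"ARM SoC: {arm_ppw_lo:.4f} -- {arm_ppw_hi:.4f} perf/W")
print(f"Ratio:   {arm_ppw_lo / xeon_ppw:.2f}x -- {arm_ppw_hi / xeon_ppw:.2f}x")

Under those assumed figures the ARM part ends up roughly 1.2x--1.4x ahead in performance per watt.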
Intel didn't kill the old RISC guys because "ISA doesn't matter". Intel killed them through a mere question of volume and basic market dynamics. Intel attacked them all with cheaper high-volume products, and people started to buy the chip that provided 80% of the performance at 50% of the cost. The small-volume guys found they couldn't sustain R&D spending at the same rate as Intel and were finally killed. More about this below.
You don't have to look at how poor the baseline is, but at the evolution rate and at the time it takes to catch up. There was an excellent paper presented at SC13 that studied exactly these matters and won a Best Paper award (SC13, November 17-21, 2013, Denver, CO, USA; ACM 978-1-4503-2378-9/13/11).
The first figure in the paper plots the evolution of performance of the early vector computers against the old RISC guys (MIPS, Alpha, HP, SPARC, ...) and against x86 (AMD/Intel). Figure 2a compares vector vs. commodity, and Figure 2b compares commodity vs. ARM. The new processors always start from what you call a "crap baseline", but they evolve at a much faster rate (2--4x faster), and in a few years they catch the old processors in performance and replace them.
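As a rough illustration of why the growth rate matters more than the starting point, here is a minimal catch-up model in Python. The gap and yearly improvement rates are hypothetical numbers chosen only to illustrate the argument; they are not taken from the paper:

import math

# Minimal catch-up model: a newcomer starts `gap` times slower than the
# incumbent; the incumbent improves by a factor r_inc per year and the
# newcomer by a factor r_new per year. All numbers below are hypothetical.

def years_to_catch_up(gap, r_inc, r_new):
    # Solve gap * (r_inc ** t) = r_new ** t for t.
    return math.log(gap) / math.log(r_new / r_inc)

# Newcomer starts 10x behind; incumbent gains 20%/year.
for r_new in (1.4, 1.8):
    t = years_to_catch_up(gap=10.0, r_inc=1.2, r_new=r_new)
    print(f"newcomer at +{(r_new - 1) * 100:.0f}%/yr catches up in ~{t:.1f} years")

With those made-up numbers the newcomer closes a 10x gap in roughly 6 to 15 years; the faster its relative improvement, the shorter the window the incumbent has.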
Given the trend discussed above, it is reasonable to consider whether the same market forces that replaced vectors with RISC microprocessors, and RISC processors with x86 processors, will replace x86 processors with mobile phone processors. That makes it relevant to study the implications of this trend before it actually happens.
We are not arguing about the superior energy efficiency of mobile processors, or about fundamental energy efficiency advantages of RISC vs CISC instruction sets. We agree that energy efficiency and performance are two axes in a design space, and that currently HPC and mobile processors have different targets, but this is subject to change at any time.
We do argue that the higher volume of the mobile market makes it easier for its vendors to amortize the costs of a new design, enables multiple vendors to survive and compete, and leads to faster product evolution, lower prices, more features, and higher performance.
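A minimal sketch of the amortization point, with completely made-up NRE and volume figures (the exact numbers don't matter, only the scaling):

# Per-chip share of a fixed design cost (NRE) at different shipment volumes.
# The cost and volume figures below are invented purely for illustration.

design_cost = 300e6                  # hypothetical NRE for one SoC design, in dollars
volumes = [2e6, 20e6, 200e6]         # server-like vs. mobile-like unit volumes

for units in volumes:
    print(f"{units:>12,.0f} units -> ${design_cost / units:,.2f} of NRE per chip")

At mobile volumes the same fixed design cost shrinks to pocket change per unit, which is exactly the cost and evolution advantage being claimed.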
In fact, AMD also knows that ARM will win over x86, and this is why it is now an ARM licensee.
ARM is a winner because it has (i) an ISA advantage (efficiency), (ii) a volume/cost advantage (commodity hardware), and (iii) an ecosystem advantage (no monopoly, customization, and so on).
Apple has shown how a dual-core 1.3GHz processor with a phone-rated TDP is able to compete in raw performance against quad-core ~1.5GHz processors with tablet-rated TDPs from AMD and Intel. And "no", the reviews were not centered around Geekbench.