By: Wilco (Wilco.Dijkstra.delete@this.ntlworld.com), May 17, 2013 12:22 pm
Room: Moderated Discussions
David Kanter (dkanter.delete@this.realworldtech.com) on May 17, 2013 8:00 am wrote:
> Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on May 15, 2013 5:37 pm wrote:
> > Ashraf Eassa (aeassa.delete@this.gmail.com) on May 15, 2013 11:59 am wrote:
> > > Hi everybody,
> > >
> > > I've been lurking for years, but the time has come when I would really love to pick the brains of
> > > the experts we have here. From my understanding, Atom is a much narrower design than Krait, Cortex
> > > A15 and others, and yet, in many benchmarks the older Saltwell core holds its own against even Krait
> > > in both FPU/INT, and against A15 in Linux integer benchmarks (but it gets decimated in FPU).
> >
> > Which Linux benchmarks do you mean? This does not show a single benchmark where dual Atom can keep up
> > with dual A15. Even Tegra 3 wins 9 out of 11 benchmarks despite its slow single-channel memory system.
>
> You realize there are an awful lot more tests out there that Phoronix doesn't run, right?
>
> Also, why do we even care about Linux? We care about Android, which is rather distinct from Linux.
Android is just a layer on top of Linux. So yes, we do care about Linux performance using GCC on mainstream code. How many tricks ICC uses to get great SPEC results is irrelevant in the Linux/Android world.
> > > So, my question is, how do I think about "Silvermont" competitive position against a fairly
> > > beefy modern ARM design such as the Cortex A15? From a high level perspective, it looks
> > > like on a per-clock basis it should be no contest - A15 is wider and more aggressive.
> > > But Intel is claiming that Silvermont is as fast as A15 on a per-clock basis.
> >
> > "Intel is claiming" - there is your hint... When Atom originally was announced, it was supposed
> > to be 5-6 times faster than ARM cores. However when Atom was finally available in phones, it
> > actually lagged in performance. This is where Atom is today. Is that competitive?
>
> When did Intel claim that Atom would be 5-6x faster than ARM cores? And which
> ARM cores? I'd like to see some proof, because that just sounds crazy.
Here is what Intel claimed at the time. Yep, crazy stuff indeed. They compared the fastest Atom against a low-frequency ARM11, despite much faster versions being available (IIRC 750MHz), as well as 600/800MHz Cortex-A8. Remember the "only x86 gives the full web browsing experience" slogans?
> > > A couple of questions then:
> > >
> > > 1. How can a narrower design pull this off?
> >
> > It doesn't. Not without trickery anyway - like comparing a highly clocked CPU against a low
> > clocked one,
>
> That's not trickery, that's life. Intel has better process technology and is able to
> hit higher clock speeds.
It's trickery when you use a slow CPU on purpose when much faster CPUs are available. And in terms of frequency, ARM has caught up dramatically in recent years. I expect ARM to pull ahead in frequency with Tegra 4i, 20nm A15s and the first 64-bit ARMs.
> Moreover, there are many A15 implementations that are incredibly
> power hungry. This shouldn't surprise anyone, since the A15 started out as a server
> core...but then something happened and ARM tried to shove it into mobiles.
I call BS on that - ARM has never said that A15 is a server-only CPU; it has always been designed for mobiles/tablets, with added server extensions. Here is Anand's first article on A15; even the title is clear.
> Clock-normalized comparisons are useful as thinking points, but you really need to consider physical design
> and process technology. Power and frequency are intrinsically tied to physical design and process, as is
> area. Certainly there are architectural techniques that can have a big impact (I think the A7 omitting
> a branch predictor is particularly brilliant in that regard), but process has a bigger influence.
If process were the only thing that mattered, then how could Calxeda server nodes possibly beat Atom on both performance and power using an old 40nm process?
> > comparing an unreleased CPU against a much older CPU, using different compiler
> > versions or optimizing for specific benchmarks (SunSpider). It's called "benchmarketing"...
>
> > About the only area where Silvermont appears to have an
> > advantage over A15 is a lower L2 latency. Everything
> > else is like you said, smaller buffers, narrower, simpler
> > and less aggressive. Given the memory system advantage
> > I'd expect it to beat A9 by a good margin (although A9R4 might well be competitive). However based on what
> > we know you'd have to be extremely optimistic to believe it can get even close to A15 performance.
>
> I claim BS already. If A15 is so good, why do partial register stalls cause a massive drop in performance for
> Neon? Oh right, maybe it's because someone made a stupid architectural decision they fixed in the A57.
Do you have any evidence for that? Partial register stalls are rare on ARM; I don't believe they happen in common cases, unlike on x86.
Wilco