By: bakaneko (nyan.delete@this.hyan.wan), July 18, 2013 2:05 pm
Room: Moderated Discussions
Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on July 17, 2013 3:46 pm wrote:
> bakaneko (nyan.delete@this.hyan.wan) on July 16, 2013 4:55 am wrote:
> > ⚛ (0xe2.0x9a.0x9b.delete@this.gmail.com) on July 16, 2013 3:16 am wrote:
> > > Linus Torvalds (torvalds.delete@this.linux-foundation.org) on July 15, 2013 8:45 pm wrote:
> > > > bakaneko (nyan.delete@this.hyan.wan) on July 15, 2013 7:47 pm wrote:
> > > > >
> > > > > As someone who works for ARM as a compiler writer,
> > > > > why don't you tell us in more detail how Intel
> > > > > cheated?
> > > >
> > > > Exophase's post to anandtech was quoted here earlier, I think. It has the relevant details:
> > > >
> > > > http://forums.anandtech.com/showthread.php?t=2330027
> > > >
> > > > and quite frankly, while optimizing multiple bit operations into a word is a very
> > > > valid optimization, the code icc generates there seems a fair bit past that.
> > > >
> > > > Sure, it could in theory happen with a really smart compiler and lots of generic optimizations.
> > > > In practice? It really smells like the compiler actively targeting a very particular code-sequence.
> > > > IOW, compiler cheating. The timing that Exophase points out makes it look worse.
> > > >
> > > > And Wilco is right that it smells pretty bad when AnTuTu seems to be so close to
> > > > Intel, and seems to have bent over backwards using recent versions of icc etc.
> > > >
> > > > It's all "explainable". But it doesn't pass the smell test.
> > > >
> > > > Linus
> > >
> > > The optimization is clearly doable by a machine. In my opinion this means there is no reason to criticize
> > > the compiler or the compiler team for adding the optimization to the compiler. The blame should go towards
> > > people who published the benchmark results without making it clear that the results are a mixture of
> > > raw CPU performance and compiler optimizations in the context of a particular benchmark.
> > >
> > > It isn't compiler cheating. It is misattribution of benchmark results. The improved benchmark numbers
> > > should have been attributed to both the CPU and the compiler, rather than just to the CPU alone.
> > >
> > > I think it would be best for benchmarks to take into account
> > > the number of executed instructions. Seeing the
> > > numbers of executed instructions when comparing benchmarks
> > > would make it easier to distinguish CPU performance
> > > from compiler performance (and from other stuff). It would
> > > be nice for this to become the standard way of reporting
> > > benchmarks by benchmarking sites. It would be interesting to see similar numbers in GPU benchmarks.
> >
> > The question is how this benchmark ever got
> > published like this. Some people assume it
> > was done out of malice.
> >
> > If it were only about the benchmark, then the
> > benchmark is clearly at fault because it uses
> >
> > typedef unsigned long farulong;
> >
> > but never qualifies it with volatile (I think
> > that would have been enough in this case) in
> > ToggleBitRun, so the compiler can run wild
> > merging memory accesses.
>
> Remember this is 20-year-old code - compilers were pretty dumb, volatile was new and practically
> unused/unknown. ByteMark never gained much popularity; it disappeared as quickly as it
> emerged - I never imagined that someone would use it as a mobile benchmark. By far the
> best solution is to never use this code, especially not as a RAM benchmark...
True, but IMHO the work is done, so it's a good benchmark to use while working on a compiler (no pun intended).
And there are real RAM benchmark/diagnostic tools available with source code, so whatever.
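
For the curious, the pattern in question looks roughly like this. This is a minimal sketch of a ToggleBitRun-style loop, not the actual ByteMark source; the function name and signature are my approximations:

typedef unsigned long farulong;   /* as in the original source */

/* Sketch of the ToggleBitRun pattern; not the original ByteMark code.
 * Set or clear a run of consecutive bits, one bit per iteration.
 * With a plain farulong pointer the compiler may legally collapse
 * the per-bit read-modify-write sequences into whole-word stores;
 * the volatile qualifier forbids that and forces one access per
 * bit, which is presumably what a memory benchmark means to
 * measure. */
static void toggle_bit_run(volatile farulong *bitmap,
                           unsigned long bit_addr,
                           unsigned long nbits,
                           int set)
{
    unsigned long i;
    for (i = bit_addr; i < bit_addr + nbits; i++) {
        unsigned long word = i / (8 * sizeof(farulong));
        farulong mask = (farulong)1 << (i % (8 * sizeof(farulong)));
        if (set)
            bitmap[word] |= mask;
        else
            bitmap[word] &= ~mask;
    }
}

And on the point about reporting executed instructions: on Linux you can already get that number, either with perf stat or programmatically via the perf_event_open(2) syscall. A minimal sketch along the lines of the man page example (Linux-specific, most error handling omitted):

#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <string.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    struct perf_event_attr pe;
    uint64_t count;
    int fd;

    memset(&pe, 0, sizeof(pe));
    pe.type = PERF_TYPE_HARDWARE;
    pe.size = sizeof(pe);
    pe.config = PERF_COUNT_HW_INSTRUCTIONS;   /* retired instructions */
    pe.disabled = 1;
    pe.exclude_kernel = 1;

    /* glibc has no wrapper, so invoke the syscall directly */
    fd = syscall(__NR_perf_event_open, &pe, 0, -1, -1, 0);
    if (fd == -1) {
        perror("perf_event_open");
        return 1;
    }

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

    /* ... run the benchmark kernel here ... */

    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    read(fd, &count, sizeof(count));
    printf("instructions retired: %llu\n", (unsigned long long)count);
    close(fd);
    return 0;
}

An instruction count alone doesn't separate CPU from compiler either, but publishing it next to the scores would have made the icc result stand out immediately.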