By: Klimax (danklima.delete@this.gmail.com), July 18, 2013 11:30 am
Room: Moderated Discussions
Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on July 18, 2013 4:07 am wrote:
> anon (anon.delete@this.anon.com) on July 18, 2013 2:10 am wrote:
> > Klimax (danklima.delete@this.gmail.com) on July 18, 2013 1:41 am wrote:
> > > none (none.delete@this.none.com) on July 17, 2013 11:35 pm wrote:
> > > > Klimax (danklima.delete@this.gmail.com) on July 17, 2013 11:29 pm wrote:
> > > > > Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on July 17, 2013 3:33 pm wrote:
> > > > > > bakaneko (nyan.delete@this.hyan.wan) on July 16, 2013 5:50 am wrote:
> > > > > > > Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on July 16, 2013 3:47 am wrote:
> > > > > > > > bakaneko (nyan.delete@this.hyan.wan) on July 16, 2013 2:42 am wrote:
> > > > > > > > > Linus Torvalds (torvalds.delete@this.linux-foundation.org) on July 15, 2013 8:45 pm wrote:
> > > > > > > > > > bakaneko (nyan.delete@this.hyan.wan) on July 15, 2013 7:47 pm wrote:
> > > > > > > > > > >
> > > > > > > > > > > As someone who works for ARM as compiler writer,
> > > > > > > > > > > why don't you tell us in more detail how Intel
> > > > > > > > > > > cheated?
> > > > > > > > > >
> > > > > > > > > > Exophase's post to anandtech was quoted here earlier, I think. It has the relevant details:
> > > > > > > > > >
> > > > > > > > > > http://forums.anandtech.com/showthread.php?t=2330027
> > > > > > > > > >
> > > > > > > > > > and quite frankly, while optimizing multiple bit operations into a word is a very
> > > > > > > > > > valid optimization, the code icc generates there seems a fair bit past that.
> > > > > > > > > >
> > > > > > > > > > Sure, it could in theory happen with a really smart compiler and lots of generic optimizations.
> > > > > > > > > > In practice? It really smells like the compiler actively targeting a very particular code-sequence.
> > > > > > > > > > IOW, compiler cheating. The timing that Exophase points out makes it look worse.
> > > > > > > > > >
> > > > > > > > > > And Wilco is right that it smells pretty bad when AnTuTu seems to be so close to
> > > > > > > > > > intel, and seem to have bent over backwards using recent versions of icc etc.
> > > > > > > > > >
> > > > > > > > > > It's all "explainable". But it doesn't pass the smell test.
> > > > > > > > >
> > > > > > > > > It's also all besides the point. I would expect a better
> > > > > > > > > explanation from someone who claims to know their shit about
> > > > > > > > > compilers than a logical fallacy ("Intel improved their
> > > > > > > > > compiler recently", "Intel is cheating because it hits one
> > > > > > > > > function in a certain benchmark"). Instead he nags like his
> > > > > > > > > marriage has gone bad.
> > > > > > > >
> > > > > > > > If you had actually read the whole thread including the links to various articles I posted
> > > > > > > > then you would have found the detailed explanation that makes it obvious that Intel has been
> > > > > > > > cheating AnTuTu. Why should I explain all the details again in every post I make?
> > > > > > >
> > > > > > > I read the whole thread, but for some reason don't remember
> > > > > > > anything noteworthy.
> > > > > > >
> > > > > > > > > And no, the ICC results aren't that far fetched. Intel
> > > > > > > > > actually recommends -O3 -xSSSE3_ATOM* with the NDK. Which
> > > > > > > > > could also explain why Exophase saw optimizations which
> > > > > > > > > would be counterproductive for bigger programs. (If the
> > > > > > > > > flags have similar meaning to gcc, where inlining and loop
> > > > > > > > > unrolling have similar problems.)
> > > > > > > >
> > > > > > > > Given the ICC results already dropped 20% after minor changes in AnTuTu it seems this
> > > > > > > > optimization is no longer effective. And that proves it was very specific to the actual
> > > > > > > > source code rather than a generic optimization that any compiler does.
> > > > > > >
> > > > > > > How is setting/clearing a range of bits not a useful
> > > > > > > optimization?
> > > > > >
> > > > > > Nobody sets multiple adjacent bits in memory one at a time. Even if one did so it would
> > > > > > typically be a few bits, certainly not hundreds or thousands. Having seen and written
> > > > > > various implementations of bit-sets I know how the typical ones look like.
> > > > > >
> > > > > > > And how does this - just because this is not an optimization
> > > > > > > any compiler does (Or at least gcc with -Os) - suddenly make
> > > > > > > this an benchmark busting trick? Oh, yes because the time
> > > > > > > frame fits.
> > > > > >
> > > > > > Nobody writes code like this, so no compiler implements this optimization. So yes, ICC gaining
> > > > > > this optimization just before AnTuTu switched to ICC is proof that the optimization was added
> > > > > > to break AnTuTu. If ICC had implemented this particular optimization for many years then it
> > > > > > would be a different matter of course (although that still doesn't explain exactly why AnTuTu
> > > > > > switched to a non-standard compiler with options tuned for AnTuTu just for x86).
> > > > > So far no proof yet. Do you have any? Any evidence at all?
> > > >
> > > > Show us code that's from real code that benefits from this optimization.
> > >
> > > No one at hand, but Flash files have a lot of variable bitfields. Also same kind of code
> > > prompting std::vector... (So why such unrealistic bit manipulation in a benchmark?)
>
> Show some actual code then that does a similar thing as the benchmark and is also optimized by ICC.
First, learn to reply properly: your reply is addressed to anon, but the quoted content is mine (until the last part).
Second, I had one in progress, but it's unfinished. (A decoder for Flash; see Adobe's specs for SWF files, it's an insane format.)
The general-case bit-processing code wasn't too complex, so it would make a nice test case.
(It is a bit-copy from a byte array into a preallocated variable, with some cases handled by optimized code.)
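To make that concrete, here is a minimal sketch of what such a general-case bit-copy could look like. It is not the unfinished decoder itself: the name read_bits, its parameters, and the most-significant-bit-first ordering are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy `nbits` (0..32) bits out of `buf`, starting at
 * bit offset `*bitpos`, most significant bit first, into a 32-bit variable.
 * Advances `*bitpos`. The caller must ensure the bits lie inside `buf`. */
static uint32_t read_bits(const uint8_t *buf, size_t *bitpos, unsigned nbits)
{
    uint32_t value = 0;
    size_t pos = *bitpos;

    assert(nbits <= 32);

    while (nbits > 0) {
        unsigned used  = (unsigned)(pos & 7);   /* bits already consumed in this byte */
        unsigned avail = 8 - used;              /* bits still unread in this byte */
        unsigned take  = nbits < avail ? nbits : avail;
        unsigned shift = avail - take;          /* unread bits left below our chunk */
        uint32_t chunk = ((uint32_t)buf[pos >> 3] >> shift) & ((1u << take) - 1u);

        value = (value << take) | chunk;        /* append chunk to the result */
        pos   += take;
        nbits -= take;
    }

    *bitpos = pos;
    return value;
}
```

This version already moves up to eight bits per step instead of one; the "optimized cases" would be things like byte-aligned reads, which can skip the masking entirely.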
> > > Anyway, claimants that Intel cheats MUST prove it, no shifting
> > > of burden of proof. So any evidence of cheating yet?
> You just don't want to accept the hard facts. The AnTuTu scores for Atom have already dropped significantly
> and will drop even further in the next version. So even AnTuTu accepts it was cheating.
The only fact is a broken benchmark. The rest is speculation and, so far, baseless accusations.
Where is the bloody evidence? Also, you are not a neutral party, as a former ARM employee. (That should be a prefix on all your posts.)
So far all you have shown is unproven, baseless speculation and evidence-free posts.