By: Wilco (Wilco.Dijkstra.delete@this.ntlworld.com), February 18, 2013 5:14 pm
Room: Moderated Discussions
Linus Torvalds (torvalds.delete@this.linux-foundation.org) on February 17, 2013 4:08 pm wrote:
> Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on February 17, 2013 1:47 pm wrote:
> >
> > One condition bit for arithmetic and logical operations is enough,
> > more than 90% of the time you only need a zero/non-zero
> > result.
>
> Not true.
>
> I just checked. At least for Linux, compare-against zero is just
> under 80% of the conditional branches. Not "more than 90%".
That's not at all what I measured. In GCC, there are just 1609 non-compare ALU instructions followed by a branch are for equality. 94% of those are for equality. Just 3 instructions, adds, subs and movs followed by beq/bne account for 80%.
There are 36786 flag setting instructions immediately followed by a branch on any condition. 96% of such instructions are compares. 28280 are compare with zero instructions, ie. 77% of all flag setting instructions are cmp+beq/bne.
Note that compare includes cmp, cmn, tst, and teq on ARM, so this includes single bit and mask tests which can be encoded in a single instruction. Of course I also counted cbz/cbnz as cmp+beq/bne. These results should be comparable with x86 which has similar test instructions.
These statistics are crystal clear: it is just not useful to support ALU instructions that set flags. You can deal with almost all cases using a simple combined compare and branch on equality. But even without, flag setting ALU instructions are in such a tiny minority (4%) that you'd hardly notice it if you never set flags on ALU instructions.
Since you typically see a compare followed by a branch (96% of all flag setting instructions), combining them is really a no-brainer.
> Yes, compare-against zero is the most common by far (and it's often the result
> of a logical 'and' op, not necessarily really "comparing" a value against zero
> - "test" in x86 speak), but the others aren't hugely uncommon either.
>
> So a single bit gets you far, but there are advantages to multiple bits.
Well these numbers support my view that even a single condition bit is already overkill. There are only 100 non-compare ALU instructions which are not for equality - just 0.27% overall, even less if you consider the total number of instructions. You can't argue that that matters...
> > Again, in compiled code you wouldn't notice the difference.
>
> Umm. I was talking about compiled code.
>
> Seriously. Look at what compilers do for smaller case statements.
> They very much generate a single compare and multiple branches.
I didn't look for branches generated for 3-way binary searches but I suspect there aren't many of those either.
Wilco
> Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on February 17, 2013 1:47 pm wrote:
> >
> > One condition bit for arithmetic and logical operations is enough,
> > more than 90% of the time you only need a zero/non-zero
> > result.
>
> Not true.
>
> I just checked. At least for Linux, compare-against zero is just
> under 80% of the conditional branches. Not "more than 90%".
That's not at all what I measured. In GCC, there are just 1609 non-compare ALU instructions followed by a branch are for equality. 94% of those are for equality. Just 3 instructions, adds, subs and movs followed by beq/bne account for 80%.
There are 36786 flag setting instructions immediately followed by a branch on any condition. 96% of such instructions are compares. 28280 are compare with zero instructions, ie. 77% of all flag setting instructions are cmp+beq/bne.
Note that compare includes cmp, cmn, tst, and teq on ARM, so this includes single bit and mask tests which can be encoded in a single instruction. Of course I also counted cbz/cbnz as cmp+beq/bne. These results should be comparable with x86 which has similar test instructions.
These statistics are crystal clear: it is just not useful to support ALU instructions that set flags. You can deal with almost all cases using a simple combined compare and branch on equality. But even without, flag setting ALU instructions are in such a tiny minority (4%) that you'd hardly notice it if you never set flags on ALU instructions.
Since you typically see a compare followed by a branch (96% of all flag setting instructions), combining them is really a no-brainer.
> Yes, compare-against zero is the most common by far (and it's often the result
> of a logical 'and' op, not necessarily really "comparing" a value against zero
> - "test" in x86 speak), but the others aren't hugely uncommon either.
>
> So a single bit gets you far, but there are advantages to multiple bits.
Well these numbers support my view that even a single condition bit is already overkill. There are only 100 non-compare ALU instructions which are not for equality - just 0.27% overall, even less if you consider the total number of instructions. You can't argue that that matters...
> > Again, in compiled code you wouldn't notice the difference.
>
> Umm. I was talking about compiled code.
>
> Seriously. Look at what compilers do for smaller case statements.
> They very much generate a single compare and multiple branches.
I didn't look for branches generated for 3-way binary searches but I suspect there aren't many of those either.
Wilco