By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), March 3, 2013 3:06 pm
Room: Moderated Discussions
Andi Kleen (x.delete@this.y.z) on March 3, 2013 1:47 pm wrote:
>
> Hardware parity is actually useful for other things too, like efficient n-state checks:
> http://halobates.de/blog/p/214
>
> Unfortunately x86 only does it for 8 bits, but even that helps.
That's just silly. You're much better off doing something like this:
cmp $-2,reg
and then you can check for negative, zero, carry and positive, all without parity.
(negative: -3, zero: -2, carry: -1, positive: 0)
Ok, I made that one up, so maybe it's broken, but people used tricks like that when they knew their set of values was limited to a few particular ones. Overflow can be useful too (eg -12[7-8]).
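One way to sanity-check a trick like this is to model the compare in software. The sketch below is a minimal Python model, assuming a 32-bit register and the usual x86 semantics where `cmp $imm,reg` (AT&T syntax) sets flags from `reg - imm`. The names `cmp_flags` and `classify` are made up for illustration. Under that model the four values are still distinguishable with a single compare, but -1 and 0 land the other way around from the list above: 0 sets carry, and -1 sets none of the three flags.

```python
def cmp_flags(reg, imm=-2, bits=32):
    """Model the flags of `cmp $imm,reg` (AT&T syntax), i.e. reg - imm.

    Hypothetical helper for illustration; assumes two's complement
    arithmetic at the given operand width.
    """
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a = reg & mask
    b = imm & mask
    result = (a - b) & mask
    return {
        "ZF": result == 0,                         # result is zero
        "SF": bool(result & sign),                 # top bit of the result
        "CF": a < b,                               # unsigned borrow
        "OF": bool((a ^ b) & (a ^ result) & sign), # signed overflow
    }


def classify(reg):
    """Mimic a je / js / jc branch chain after one `cmp $-2,reg`."""
    f = cmp_flags(reg)
    if f["ZF"]:
        return "zero"
    if f["SF"]:
        return "negative"  # must be tested before carry: -3 sets CF too
    if f["CF"]:
        return "carry"
    return "positive"


for value in (-3, -2, -1, 0):
    print(value, classify(value))
```

Note that -3 sets both the sign and carry flags in this model, so the order of the branches matters: the sign test has to come before the carry test.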
Of course, with modern microarchitectures that combine compare and branch into one uop, the upside is more questionable (possibly smaller I$ footprint?).
Also, one thing to look out for with multiple conditional branches is that sometimes the branch prediction information is indexed while ignoring the low three or four bits of the instruction pointer, so branches next to each other can actually be a problem. That's especially true if you hide them from the compiler and don't let it schedule them. I haven't checked what most x86 microarchitectures do, but the "branch prediction ignores low bits" thing was an issue for some Alpha cores iirc.
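The aliasing concern can be illustrated with a toy index function. This is only a sketch under assumed parameters: the 4 ignored low bits, the 1024-entry table, the addresses, and the name `predictor_index` are all hypothetical and do not describe any particular core.

```python
def predictor_index(ip, ignored_low_bits=4, table_bits=10):
    """Toy branch-predictor index: drop the low bits of the instruction
    pointer, then keep `table_bits` bits. All parameters are assumptions."""
    return (ip >> ignored_low_bits) & ((1 << table_bits) - 1)


# Two conditional branches 6 bytes apart, inside the same 16-byte block,
# map to the same entry and would share prediction state in this model.
first = predictor_index(0x401000)
second = predictor_index(0x401006)
print(first == second)

# A branch in the next 16-byte block gets a different entry.
print(predictor_index(0x401010) == first)
```

In this model, a chain of back-to-back conditional jumps following one compare is the kind of code where several branches would land in the same entry.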
Linus