By: Maynard Handley (name99.delete@this.name99.org), July 7, 2015 5:25 pm
Room: Moderated Discussions
Exophase (exophase.delete@this.gmail.com) on July 7, 2015 1:26 pm wrote:
> Maynard Handley (name99.delete@this.name99.org) on July 7, 2015 12:00 pm wrote:
> > Maybe the problem is when I say cmov/csel I am thinking of the (IMHO) obvious use cases.
> > max/min, abs, sgn, and the sorts of very similar functions I constantly dealt with when
> > writing codecs (eg parse one bit then, if (bit){motionVector=-motionVector})
> > All of these strike me as PRECISELY the point of cmov/csel.
> > Perhaps it's my experience in this field where one CONSTANTLY
> > has these sorts of one instruction branch-overs
> > --- for clamping values, for non-linear edge smoothing, etc --- that makes me appreciate their value;
> > and perhaps most people just don't encounter this sort of code in the code they write?
>
> A lot of those operations are already commonly supported directly in modern SIMD architectures. Or can be
> synthesized in a similar or smaller number of instructions compared to a solution with cmov or csel. For example,
> on ARM NEON if (bit){motionVector=-motionVector} can be computed as (on a vector of 32-bit ints):
>
> vtst.u32 mask, bit, bit
> veor.u32 motionVector, motionVector, mask
> vsub.u32 motionVector, motionVector, mask
>
> Where the equivalent with conditional select would be something like:
>
> vtst.u32 mask, bit, bit
> vneg.s32 motionVectorNeg, motionVector
> vbit.u32 motionVector, motionVectorNeg, mask
>
Which doesn't help if you are not working in the vector registers...
I thought it was obvious that this discussion was in the context of the "traditional" registers (int and, at a push, FP).
Of COURSE many of them can be synthesized in other ways. I had a header full of weird bit-trick macros to achieve the same ends in three or four ops. But if I can drop that to one op...
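To make the comparison concrete, here is a rough sketch in C of the kind of scalar bit trick being described, next to the plain select form. The function names are hypothetical and this is not the actual header mentioned above. It assumes two's complement 32-bit ints and that `bit` is exactly 0 or 1. The mask versions cost roughly three or four ALU ops; the select versions are the ones a compiler may (but is not required to) lower to a single cmov/csel-style instruction after the compare.

```c
#include <stdint.h>

/* Hypothetical helper names, for illustration only. */

/* Conditional negate, mask form: the scalar analogue of the
   veor/vsub sequence quoted above. bit must be 0 or 1. */
static inline int32_t cond_negate_mask(int32_t x, uint32_t bit)
{
    uint32_t mask = 0u - bit;                 /* 0x00000000 or 0xFFFFFFFF */
    /* (x ^ ~0) - (~0) == ~x + 1 == -x ; (x ^ 0) - 0 == x */
    return (int32_t)(((uint32_t)x ^ mask) - mask);
}

/* Conditional negate, select form. The negation is done in unsigned
   arithmetic so INT32_MIN does not trigger signed overflow. */
static inline int32_t cond_negate_select(int32_t x, uint32_t bit)
{
    int32_t neg = (int32_t)(0u - (uint32_t)x);
    return bit ? neg : x;
}

/* max, mask form: pick b when a < b, otherwise a, without a branch. */
static inline int32_t max_mask(int32_t a, int32_t b)
{
    uint32_t mask = 0u - (uint32_t)(a < b);   /* all ones when a < b */
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    return (int32_t)(ua ^ ((ua ^ ub) & mask));
}

/* max, select form. */
static inline int32_t max_select(int32_t a, int32_t b)
{
    return (a < b) ? b : a;
}
```

Whether the select forms actually become one conditional instruction depends on the compiler, the target, and the optimization level, so it is worth checking the generated assembly rather than assuming.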