By: -.- (blarg.delete@this.mailinator.com), May 20, 2022 4:04 am
Room: Moderated Discussions
Jan Wassenberg (jan.wassenberg.delete@this.gmail.com) on May 19, 2022 6:45 am wrote:
> We have the same problem in reverse for emulating compressstore on SVE - that's
> 4 instructions (cnt, whilelt, compact, st1), but seems to be quite fast.
Well, compressstore is really two operations (a compress, then a store) disguised as one instruction. If you don't care about writing extra zeroes at the end, the CNT+WHILELT isn't necessary either, which leaves just COMPACT+ST1.
I mean, you can always find some instruction that's harder to emulate on the other ISA. But the point here is that WHILELT is supposed to be fairly important to a lot of SVE code, and it helps reduce unforeseen costs in auto-vectorized code.
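To make the four-step sequence from the quote concrete, here is a rough scalar model in Python. It is only a sketch of the semantics, not real SVE code: the function names (`compact`, `whilelt`, `store`, `compress_store_exact`, `compress_store_sloppy`) are made up for illustration, lanes are plain list elements, and performance is not modelled at all.

```python
from typing import List, Sequence


def compact(values: Sequence[int], mask: Sequence[bool]) -> List[int]:
    """Model of COMPACT: pack active elements into the low lanes, zero the rest."""
    packed = [v for v, m in zip(values, mask) if m]
    return packed + [0] * (len(values) - len(packed))


def whilelt(start: int, end: int, lanes: int) -> List[bool]:
    """Model of WHILELT: lane i is active while start + i < end."""
    return [start + i < end for i in range(lanes)]


def store(dst: List[int], offset: int, vec: Sequence[int],
          pred: Sequence[bool]) -> None:
    """Model of a predicated ST1: only active lanes are written to memory."""
    for i, (v, p) in enumerate(zip(vec, pred)):
        if p:
            dst[offset + i] = v


def compress_store_exact(dst: List[int], offset: int,
                         values: Sequence[int], mask: Sequence[bool]) -> int:
    """All four steps: writes exactly as many elements as the mask selects."""
    count = sum(mask)                      # cnt: number of active lanes
    pred = whilelt(0, count, len(values))  # whilelt: first `count` lanes active
    store(dst, offset, compact(values, mask), pred)  # compact + st1
    return count


def compress_store_sloppy(dst: List[int], offset: int,
                          values: Sequence[int], mask: Sequence[bool]) -> int:
    """Only compact + st1: a full-vector store, so trailing zeroes get written."""
    all_lanes = [True] * len(values)
    store(dst, offset, compact(values, mask), all_lanes)
    return sum(mask)


if __name__ == "__main__":
    values = [10, 20, 30, 40]
    mask = [False, True, False, True]

    exact = [-1] * 6
    compress_store_exact(exact, 1, values, mask)
    print(exact)

    sloppy = [-1] * 6
    compress_store_sloppy(sloppy, 1, values, mask)
    print(sloppy)
```

The sloppy variant is only safe when the destination has a full vector of slack after the write position and the next write overwrites the zeroes, which is typically the case when compacting into an output buffer in a loop.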