By: Jörn Engel (joern.delete@this.purestorage.com), May 16, 2022 12:52 pm
Room: Moderated Discussions
Simon Farnsworth (simon.delete@this.farnz.org.uk) on May 16, 2022 4:13 am wrote:
>
> In particular, compilers are basically forced to vectorize code as "prologue to get to vector
> alignment", "wide body", "epilogue to handle tail shorter than a vector", because auto-vectorization
> can't make assumptions about input data. Humans can rewrite the code so that the prologue
> and epilogue aren't needed, or are handled by the caller when needed.
Sometimes, yes. But there are many examples where 80% of the human-written code deals with head/tail. Glibc memcpy has a main loop of 22 lines (in the version I'm looking at; there are dozens more versions). The entire file has 3161 lines and basically implements memcpy and memmove. A lot of the other 3139 lines deal precisely with the head/tail surrounding the main copy loop.
A vector instruction set that reduces the pain of head/tail handling is a pretty big deal for human-written code as well.
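To make the head/body/tail split concrete, here is a minimal sketch in plain C. It is not glibc's code: the function name copy_head_body_tail is made up for illustration, and it uses 8-byte words as a stand-in for real vector registers. Even in this toy version, the main loop is only one of three parts.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration, not glibc's implementation: a forward copy
 * split into head, body, and tail. An 8-byte word stands in for a vector. */
void *copy_head_body_tail(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: copy single bytes until dst is 8-byte aligned (or n runs out). */
    while (n > 0 && ((uintptr_t)d % sizeof(uint64_t)) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Body: the "main loop", one 8-byte word per iteration. */
    while (n >= sizeof(uint64_t)) {
        uint64_t w;
        memcpy(&w, s, sizeof w);   /* src may still be unaligned */
        memcpy(d, &w, sizeof w);   /* dst is aligned here */
        d += sizeof w;
        s += sizeof w;
        n -= sizeof w;
    }

    /* Tail: whatever is left, shorter than one word. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

A tuned implementation would replace the byte loops with size-specific paths and overlapping loads/stores, which is where the line count tends to grow. With per-lane masking or predication, the head and tail could in principle be expressed as partially masked iterations of the body instead of separate code paths.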
> Basically, I think that most people who manually vectorize code
> are cleverer than the auto-vectorization pass in compilers :-)
A lot of us are cleverer than the average banana as well. :)