By: --- (---.delete@this.redheron.com), May 16, 2022 10:22 pm
Room: Moderated Discussions
Doug S (foo.delete@this.bar.bar) on May 15, 2022 10:50 am wrote:
> Michael S (already5chosen.delete@this.yahoo.com) on May 14, 2022 12:27 pm wrote:
> > Simon Farnsworth (simon.delete@this.farnz.org.uk) on May 14, 2022 5:20 am wrote:
> > > SVE2's big advantage over NEON is not wider vectors, but all the compiler-convenience features
> > > it has that allow a compiler to be more aggressive about auto-vectorization. For people who
> > > are hand-tuning codes for peak performance, SVE2 at 128 bit and NEON are about the same, but
> > > SVE2 pulls ahead handily (due to the FFR register and associated instructions) when you're
> > > writing "serial" code and relying on the compiler doing something sensible to it.
> > >
> > > You won't get the same performance this way as you would tuning
> > > your code for 128 bit vectors, but it's still a win.
> > >
> >
> > After looking at code generated by the LLVM autovectorizer last year, I am more than somewhat
> > doubtful. To say that a year ago they were bad would be an undeserved compliment.
>
>
> From what I understand from someone who writes this type of code (he's x86 focused so AVX not SVE or NEON) he
> has to format his code just so to allow it to be properly autovectorized. He learned by trial and error, checking
> assembly output to figure out what the compiler expects and write his code to match. When the compiler is updated,
> he has to recheck to verify his carefully crafted code sequences still produce the desired effect.
>
> Sounds like it is better than writing directly in assembly, but not by much. And I doubt
> most programmers go to such lengths. Most probably write code that could be auto vectorized
> but is not, and they don't even know there is a lot of performance left on the table.
This may be true, but the argument:
- existing vector ISAs are a poor match for compilers
THEREFORE
- a new vector ISA, explicitly designed with all this accumulated experience in mind, and by people who are well aware of why the compilers have difficulty, will be just as bad
seems rather strange...
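
To make the kind of "serial" code in Simon's point concrete, here is a minimal sketch in C. The function names are mine and purely illustrative; nothing here comes from anyone's actual codebase. Both loops exit on a condition that depends on the data just loaded, which is the case SVE's first-faulting loads and the FFR were designed for: the FFR records which lanes loaded successfully, so a vectorized loop can read ahead speculatively without faulting on memory the scalar loop would never have touched. NEON has no equivalent. Whether a particular compiler version actually vectorizes either loop is not something I am claiming; you would have to check the assembly output, as Doug's acquaintance does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: index of the first element equal to `key`,
   or `n` if there is none. The trip count has an upper bound (n),
   but the loop may exit early depending on the data it just read. */
size_t find_first(const unsigned char *buf, size_t n, unsigned char key)
{
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == key)
            return i;
    }
    return n;
}

/* Hypothetical example: length of a NUL-terminated string.
   No trip count is known up front, so a plain vector load could run
   past the terminator into an unmapped page. */
size_t my_strlen(const char *s)
{
    size_t len = 0;
    while (s[len] != '\0')
        len++;
    return len;
}
```

Compiling something like this for both a NEON-only and an SVE2 target and comparing the generated loops is the simplest way to see whether the compiler-convenience features are being used at all.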