By: Jan Wassenberg (jan.wassenberg.delete@this.gmail.com), May 24, 2022 4:50 am
Room: Moderated Discussions
Michael S (already5chosen.delete@this.yahoo.com) on May 24, 2022 3:48 am wrote:
> Did you try to measure efficiency gain yourself?
> For example, AVX-512 vs AVX2 on your own JPEG-XL decode on widely available Tiger Lake or Rocket lake CPU?
That would be very interesting. I don't have the equipment, but I would be happy to work with someone who does.
One related result: slide 11 of https://pdfs.semanticscholar.org/f464/74f6ae2dde68d6ccaf9f537b5277b99a466c.pdf.
> > Unfortunately not that many developers understand yet that SIMD/vectors are widely useful, not just in ML/cryptography/HPC/image processing niches.
> I am one of those who don't understand, at least as long as HPC is considered widely.
> And I consider myself SIMD fan, rather than denier.
:) How about C++ STL functions: can those be considered widely useful? Admittedly, autovectorization only manages about half of the algorithms in this list (and only for certain types): http://0x80.pl/notesen/2021-01-18-autovectorization-gcc-clang.html
A couple more are already implemented in https://github.com/google/highway/tree/master/hwy/contrib/algo, plus sort() in contrib/sort.
Sereja's book has a couple more nontrivial ones: https://en.algorithmica.org/hpc/algorithms/argmin/
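To give a rough idea of why argmin counts as nontrivial, here is a minimal sketch in plain C++. It is not taken from the book or from Highway; the names ArgMinScalar and ArgMinTwoPass are made up for illustration. The first version carries the index of the current best element through a data-dependent branch, which compilers tend to have trouble vectorizing. The second splits the work into a plain min reduction (a pattern compilers can often vectorize at -O3, though that depends on compiler and flags) followed by a search for the first matching index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Baseline: one pass, tracking the index of the smallest element seen so far.
// The branch depends on the data, so this loop is hard to autovectorize.
std::size_t ArgMinScalar(const std::vector<int32_t>& v) {
  assert(!v.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < v.size(); ++i) {
    if (v[i] < v[best]) best = i;
  }
  return best;
}

// Two passes: first a simple min reduction, then find the first index
// holding that minimum. Both versions return the first minimum on ties.
std::size_t ArgMinTwoPass(const std::vector<int32_t>& v) {
  assert(!v.empty());
  int32_t min_value = std::numeric_limits<int32_t>::max();
  for (int32_t x : v) {
    min_value = x < min_value ? x : min_value;
  }
  for (std::size_t i = 0; i < v.size(); ++i) {
    if (v[i] == min_value) return i;
  }
  return 0;  // Unreachable for non-empty input.
}
```

Both functions return the same index for any non-empty input, so the second can be swapped in for the first and timed on the target machine. Whether it is actually faster needs to be measured.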