By: Adrian (a.delete@this.acm.org), May 29, 2022 4:48 am
Room: Moderated Discussions
Adrian (a.delete@this.acm.org) on May 29, 2022 4:33 am wrote:
> The starting point is having a BLAS library that includes optimized variants for the various ISA
> options, e.g. scalar, 128-bit SSE2, 128-bit AVX, 256-bit AVX, 256-bit AVX-512, 512-bit AVX-512.
>
> There are many such BLAS libraries, both open-source and proprietary.
To be clearer, most BLAS libraries do not provide that many ISA variants out of the box.
Nevertheless, all of them target every Intel CPU generation, so they provide at least 4 ISA variants: 128-bit SSE2, 256-bit AVX without FMA, 256-bit AVX with FMA, and 512-bit AVX-512.
For example, with OpenBLAS you compile the library 4 times, with:
make TARGET=NEHALEM
make TARGET=SANDYBRIDGE
make TARGET=HASWELL
make TARGET=SKYLAKEX
With a little effort, any BLAS library can be modified to also provide a scalar variant and variants with reduced width of AVX or AVX-512.
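
As a rough sketch, the four OpenBLAS builds listed above could be scripted so that each variant ends up in its own directory. The paths OPENBLAS_SRC and OUT_ROOT and the function name build_variants are my own placeholders, not anything OpenBLAS defines. I am also assuming a GNU make and that the same TARGET should be passed again at install time.

```shell
#!/bin/sh
# Sketch: build one OpenBLAS copy per TARGET, each installed under its own prefix.
# OPENBLAS_SRC and OUT_ROOT are hypothetical paths; adjust them to your setup.
# Set MAKE="echo make" to print the commands instead of running them.
MAKE=${MAKE:-make}
OPENBLAS_SRC=${OPENBLAS_SRC:-./OpenBLAS}
OUT_ROOT=${OUT_ROOT:-./openblas-variants}

build_variants() {
    for target in NEHALEM SANDYBRIDGE HASWELL SKYLAKEX; do
        # Remove objects left over from the previous target before rebuilding.
        $MAKE -C "$OPENBLAS_SRC" clean || return 1
        # Build the kernels for this CPU generation only.
        $MAKE -C "$OPENBLAS_SRC" TARGET="$target" || return 1
        # Install into a per-target prefix so the variants do not overwrite each other.
        $MAKE -C "$OPENBLAS_SRC" TARGET="$target" PREFIX="$OUT_ROOT/$target" install || return 1
    done
}
```

Calling build_variants after sourcing this runs the four builds one after another. Running it first with MAKE="echo make" only prints the commands, which is a cheap way to check the paths before a long build.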