By: Charlie Burnes (firstname.lastname@example.org), May 18, 2022 2:55 pm
Room: Moderated Discussions
Intel’s AVX-512 has 32 512-bit registers, compared with the 16 256-bit registers in AVX2. Suppose I need code to run on x86 CPUs both with and without AVX-512. Should I try to write the code so that it needs only eight 512-bit registers, so that Highway can use two 256-bit registers and two AVX2 instructions to emulate each AVX-512 instruction? This would minimize register spills (extra loads and stores) on AVX2, which has only 16 256-bit registers.
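
To make the arithmetic behind "eight" explicit, here is a minimal sketch of the register budget. It is plain C++ and does not use Highway; the names `RegisterFile`, `kAvx2`, `kAvx512`, and `LogicalVectorBudget` are hypothetical and only for illustration. It assumes each 512-bit value would be held in a pair of 256-bit registers, which is exactly the premise being asked about, and it ignores any temporaries the compiler might need.

```cpp
#include <cstddef>

// Architectural vector register files, using the figures quoted above.
struct RegisterFile {
  std::size_t count;  // number of architectural vector registers
  std::size_t bits;   // width of each register in bits
};

constexpr RegisterFile kAvx2{16, 256};    // 16 x 256-bit (YMM)
constexpr RegisterFile kAvx512{32, 512};  // 32 x 512-bit (ZMM)

// Number of logical vectors of `logical_bits` that fit in the register file
// when each logical vector occupies ceil(logical_bits / rf.bits) registers.
// Assumption: no registers are reserved for temporaries or constants.
constexpr std::size_t LogicalVectorBudget(RegisterFile rf,
                                          std::size_t logical_bits) {
  return rf.count / ((logical_bits + rf.bits - 1) / rf.bits);
}

// On AVX2, a 512-bit value needs two registers, so only eight fit at once.
static_assert(LogicalVectorBudget(kAvx2, 512) == 8,
              "16 registers / 2 per logical vector");
// On AVX-512, each 512-bit value needs one register, so all 32 are usable.
static_assert(LogicalVectorBudget(kAvx512, 512) == 32,
              "32 registers / 1 per logical vector");
```

Under those assumptions, a kernel that keeps more than eight 512-bit values live at once would not fit in the AVX2 register file and would have to spill to memory.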