By: Jörn Engel (joern.delete@this.purestorage.com), May 24, 2022 9:41 pm
Room: Moderated Discussions
Adrian (a.delete@this.acm.org) on May 24, 2022 2:39 pm wrote:
>
> I have not tried this on more recent Intel CPUs, but in a measurement on Skylake Server CPUs
> (with 2 512-bit FMA units) done a few years ago, the ratio between the energies needed to
> compute some LINPACK benchmark in AVX-512 and in AVX2 (i.e. with 256-bit FMA/LD/ST) modes
> was around 5/6, so a little more than your maximum estimation, but not much more.
Compute time ratio should be 3:4, assuming the FMA is the only relevant bottleneck.
Not sure if I trust my math, but the implication seems to be that the FMA consumes 47% of total energy with AVX-512 and 40% of total energy with AVX2. If "everything else" includes DRAM, disk, power supplies, etc., that result doesn't seem outrageous.
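A rough way to sanity-check those percentages, as a sketch only: take the 5/6 energy ratio and the two FMA shares as given and see what they imply for the non-FMA energy. The assumption that "everything else" draws constant power, so that its energy scales with compute time, is mine and is not stated in the thread; the function and variable names are made up for illustration.

```python
# Sanity check of the figures above. The constant-power model for
# "everything else" is an assumption, not something stated in the thread.

E_RATIO = 5 / 6        # total energy, AVX-512 : AVX2 (quoted measurement)
T_RATIO = 3 / 4        # compute time, AVX-512 : AVX2 (from the reply)
FMA_SHARE_512 = 0.47   # FMA share of total energy with AVX-512
FMA_SHARE_AVX2 = 0.40  # FMA share of total energy with AVX2


def rest_energy_ratio(e_ratio, share_512, share_avx2):
    """Non-FMA energy, AVX-512 : AVX2, with AVX2 total energy normalised to 1."""
    rest_512 = e_ratio * (1.0 - share_512)
    rest_avx2 = 1.0 - share_avx2
    return rest_512 / rest_avx2


def fma_energy_ratio(e_ratio, share_512, share_avx2):
    """FMA energy, AVX-512 : AVX2, with AVX2 total energy normalised to 1."""
    return (e_ratio * share_512) / share_avx2


rest = rest_energy_ratio(E_RATIO, FMA_SHARE_512, FMA_SHARE_AVX2)
fma = fma_energy_ratio(E_RATIO, FMA_SHARE_512, FMA_SHARE_AVX2)

# If the rest of the system runs at constant power, `rest` should land
# close to the compute time ratio.
print(f"non-FMA energy ratio: {rest:.3f}  (time ratio: {T_RATIO:.3f})")
print(f"FMA energy ratio:     {fma:.3f}")
```

Under that assumption the non-FMA energy ratio comes out near 0.74, which is close to the 3:4 time ratio, so the 47% and 40% figures at least hang together with the 5/6 measurement.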