By: Nicolas Capens (nicolas.capens.delete@this.gmail.com), August 14, 2011 12:04 pm
Room: Moderated Discussions
Hi Robert,
Robert Davide Graham (bigrobg@gmail.com) on 8/12/11 wrote:
---------------------------
>From your article, it seems that I can execute 6 instructions per clock (because
>there's 6 units), each of which operates on 4 data, times 0.85 GHz. In contrast,
>the Sandy Bridge CPU core can execute 2 instructions per clock, also 4 data, but
>at three times the clock speed, thus achieving the same overall compute throughput.
>
>Are these numbers correct?
A single AVX instruction can process 8 single-precision floating-point numbers. Sandy Bridge has separate multiplication and addition units, so it can issue one 8-wide multiply and one 8-wide add every cycle, for 16 operations per cycle per core. For a quad-core at 3.4 GHz, that's 217.6 GFLOPS.
The HD Graphics 3000 has 12 cores, each capable of 4 multiply-add operations per clock. Counting each multiply-add as two floating-point operations, at 1.35 GHz this results in 129.6 GFLOPS.
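To make the arithmetic behind both figures explicit, here is a minimal sketch of the peak-throughput calculation. The helper name `peak_gflops` and its parameters are just for illustration, and the numbers are theoretical peaks, not measured performance.

```python
def peak_gflops(cores, flops_per_cycle_per_core, clock_ghz):
    """Theoretical peak: cores x FLOPs per cycle per core x clock in GHz."""
    return cores * flops_per_cycle_per_core * clock_ghz


# Sandy Bridge quad-core: 8-wide AVX multiply + 8-wide AVX add per cycle.
cpu = peak_gflops(cores=4, flops_per_cycle_per_core=8 + 8, clock_ghz=3.4)

# HD Graphics 3000: 4 multiply-adds per clock per core, 2 FLOPs each.
gpu = peak_gflops(cores=12, flops_per_cycle_per_core=4 * 2, clock_ghz=1.35)

print(f"CPU: {cpu:.1f} GFLOPS, GPU: {gpu:.1f} GFLOPS")
```

Note that the CPU figure assumes every cycle has both a multiply and an add to issue, which real code rarely sustains.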
In the future the CPU cores will become even more powerful. Haswell will support fused multiply-add (FMA) instructions alongside AVX2, which, best of all, includes gather operations.
Cheers,
Nicolas