By: Temp (not.delete@this.this.time), February 1, 2011 2:17 pm
Room: Moderated Discussions
Eric Bron (eric.bron@zvisuelREMOVE.com) on 1/20/11 wrote:
---------------------------
>>Biggest advantage of AVX is that it supplies 3operand instruction formats for *all*
>>previous SSE1,2,3,4.1/.2/AES instructions.
>
>from hands on experiments the speedup after recompiling legacy source for AVX-128
>give you less than 5% speedup (vs SSE on the same machine), the code is 10% - 15%
>more compact but you mostly remove register to register moves which were fast, the
>bottlenecks (loads/stores, branch misses, etc.) are exactly the same, so I'm quite
>sure nobody will endure the burden to have an AVX path just for this 5% speedup
Measurements done by C't under Windows 7 SP1 seem to confirm this. Using Intel Composer XE 2011 (icc/ifort 12.0.127), they compiled the SPEC CPU 2006 suite once targeting SSE4.2 and once targeting AVX, and ran both builds on a Core i7 2600K.
The base fp score was 43 with AVX and 41 with SSE4.2, a speedup of 4.9%.
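
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function names are mine, purely for illustration. The second part assumes the two published scores were rounded to the nearest integer (my assumption, not something the C't numbers state), in which case the real speedup could sit anywhere in a fairly wide band around 4.9%.

```python
def speedup_percent(new_score, old_score):
    """Relative speedup of new_score over old_score, in percent."""
    return (new_score / old_score - 1.0) * 100.0


def speedup_bounds_percent(new_score, old_score, half_step=0.5):
    """Bounds on the speedup if both scores were rounded to the nearest integer.

    half_step=0.5 is an ASSUMPTION about how the scores were rounded.
    """
    lowest = speedup_percent(new_score - half_step, old_score + half_step)
    highest = speedup_percent(new_score + half_step, old_score - half_step)
    return lowest, highest


avx_score = 43     # base fp score, AVX build (from the post)
sse42_score = 41   # base fp score, SSE4.2 build (from the post)

nominal = speedup_percent(avx_score, sse42_score)
low, high = speedup_bounds_percent(avx_score, sse42_score)

print(f"nominal speedup: {nominal:.1f}%")
print(f"range if scores are rounded integers: {low:.1f}% to {high:.1f}%")
```

So the 4.9% figure is in line with the "less than 5%" estimate quoted above, but with two-digit scores it should not be read as more precise than "a few percent".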