By: Jukka Larja (roskakori2006.delete@this.gmail.com), October 5, 2015 9:15 am
Room: Moderated Discussions
Contrarian (Whatever.delete@this.hotmail.com) on October 4, 2015 11:53 am wrote:
> Alright, we know AVX2 is easy to implement, but is AVX2 a waste of die space?
>
> Do games use AVX2 and for those games that do use AVX2 do they get a significant speed up?
No, and no, though that depends on how you define significant.
I'm sure some engines and middleware implement AVX2 paths, but the bottom line is that you can't count on AVX2 instructions being available on a player's machine (you can count on SSE2, since it's part of x86-64, but that's about it). Most games still ship only 32-bit binaries, just to give some idea of where we stand today.
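To illustrate what "implementing an AVX2 path" usually involves, here is a minimal sketch of runtime dispatch: check the CPU once, then call either a baseline routine or an AVX2-compiled one. It assumes GCC or Clang on x86 (`__builtin_cpu_supports` and the `target` attribute are compiler-specific; MSVC would need `__cpuidex` instead). The names `scale`, `scale_scalar`, `scale_avx2`, `pick_scale` and `HAVE_X86_DISPATCH` are made up for this example and don't come from any particular engine.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical guard macro: only build the AVX2 path with GCC/Clang on x86.
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1
#else
#define HAVE_X86_DISPATCH 0
#endif

// Baseline path: plain scalar loop, runs on any CPU.
static void scale_scalar(float* data, std::size_t n, float k) {
    for (std::size_t i = 0; i < n; ++i) data[i] *= k;
}

#if HAVE_X86_DISPATCH
// Optional path: this one function is compiled with AVX2 enabled even if the
// rest of the file is not. (Strictly, 8-wide float math only needs AVX; the
// check is for AVX2 here to match the discussion.)
__attribute__((target("avx2")))
static void scale_avx2(float* data, std::size_t n, float k) {
    const __m256 vk = _mm256_set1_ps(k);
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        const __m256 v = _mm256_loadu_ps(data + i);   // unaligned load of 8 floats
        _mm256_storeu_ps(data + i, _mm256_mul_ps(v, vk));
    }
    for (; i < n; ++i) data[i] *= k;                  // scalar tail
}
#endif

using ScaleFn = void (*)(float*, std::size_t, float);

// Decide which implementation this machine can run.
static ScaleFn pick_scale() {
#if HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("avx2")) return scale_avx2;
#endif
    return scale_scalar;
}

// Public entry point: the CPU check happens once, on first call.
void scale(float* data, std::size_t n, float k) {
    assert(data != nullptr || n == 0);
    static const ScaleFn fn = pick_scale();
    fn(data, n, k);
}
```

The point is that the AVX2 version is always extra code sitting next to a baseline you have to write, test and ship anyway.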
You want your game to run on lowish-GHz, non-AVX2 dual cores, yet enthusiasts will have quad-cores at 3+ GHz. AVX2 just isn't all that important, since it doesn't bring minimum requirements down and you already get very near the top with a quite affordable processor. Using only 40% of the CPU instead of 50% isn't very useful. And while 4-float-wide SSE and AVX can be used to speed up normal vector3 calculations (somewhat), 8-wide AVX2 often requires you to rethink your code. That's not something you want to do to speed up some marginal cases.
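As a rough illustration of what "rethink your code" means, compare the usual vector3 struct with a layout an 8-wide path would want. This is a plain C++ sketch with no intrinsics; `Vec3`, `Vec3Array` and `dot_many` are names invented for the example, and whether the compiler actually vectorizes the loop depends on flags and target.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array-of-structures: the usual game-code layout. One Vec3 fits in a
// 4-float SSE register with one lane left unused.
struct Vec3 { float x, y, z; };

inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Structure-of-arrays: each component is stored contiguously, so 8
// consecutive x values (then y, then z) can fill a 256-bit register.
struct Vec3Array {
    std::vector<float> x, y, z;
};

// Dot products of many vector pairs at once. Each iteration is independent
// of the others, which is the shape an 8-wide implementation needs.
void dot_many(const Vec3Array& a, const Vec3Array& b, std::vector<float>& out) {
    const std::size_t n = a.x.size();
    assert(a.y.size() == n && a.z.size() == n);
    assert(b.x.size() == n && b.y.size() == n && b.z.size() == n);
    out.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a.x[i] * b.x[i] + a.y[i] * b.y[i] + a.z[i] * b.z[i];
    }
}
```

The arithmetic is identical; what changes is how the data is stored and that you process batches instead of single vectors, which tends to ripple through all the code that touches those vectors.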
-JLarja