Article: PhysX87: Software Deficiency
By: David Kanter, July 9, 2010 9:25 am
sJ on 7/9/10 wrote:
>David Kanter on 7/8/10 wrote:
>>Ralf on 7/8/10 wrote:
>>>Null Pointer Exception on 7/8/10 wrote:
>>>>Probably not relevant. If PhysX is written in C as suggested, proper compiler options
>>>>alone can handle *most* of the conversion from x87 to SSE. Naturally, this excludes
>>>>any hand-tuned assembly modules or other special sauce.
>>>>While it takes a focused development effort to get the maximum performance improvement
>>>>from vectorized FP, the "free" jump is several keystrokes and some basic QA---and
>>>>history has precious little to do with it.
>>>NVIDIA also delivers PhysX source code to premium license partners, so isn't it
>>>up to the game developers to enable SSE flags for their games?!
>>It's also nvidia's responsibility to ensure that those who get binaries get ones
>>that are sensibly compiled, since not all partners get source. NV's decisions
>>are responsible for making their own products look good (or bad).
>Our game, quite FP intensive as they normally are, gains only about 10% in performance
>when compiled with SSE2 instructions (does not use intrinsics), and loses about
>5% of current user base that has old (mostly AMD) CPUs without SSE2.

Thanks for the data point! Which compiler were you using? ICC?

>As this is likely the normal use scenario, the correct choice for distribution is x87,
>not SSE2, with source provided for those who need the last mile of performance.
>It is also what we have chosen for our products.

It sounds like this is a pretty casual game and not a major high-end title, is that right?

>Another method of delivery would naturally be an Intel-compiled multi-architecture
>binary, but at least earlier that used to bias towards Intel CPUs quite heavily.
>But then again, that would not necessarily be altogether unwanted at NV these days.

I think Nvidia's more worried about Intel than AMD... ATI may be a direct competitor, but Intel is the one who is going to integrate GPUs most aggressively (and try to prevent or co-opt HPC use of GPUs).
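
On the multi-architecture binary idea quoted above: it does not strictly need an Intel-compiled dispatcher. Below is a minimal sketch in C of hand-rolled runtime dispatch, assuming GCC or Clang on x86 (it relies on the `__builtin_cpu_supports` builtin). The names `scale_baseline`, `scale_sse2`, `cpu_has_sse2` and `pick_scale` are hypothetical and only for illustration; nothing here is taken from PhysX or from sJ's game. In a real build the two variants would sit in separate source files compiled with different flags.

```c
#include <stddef.h>

/* Hypothetical kernel signature: scale an array in place. */
typedef void (*scale_fn)(float *v, size_t n, float k);

/* Baseline variant. In a real project this file would be built for the
   oldest supported CPU (for example, GCC's -mfpmath=387 on 32-bit x86). */
static void scale_baseline(float *v, size_t n, float k) {
    for (size_t i = 0; i < n; ++i) {
        v[i] *= k;
    }
}

/* Stand-in for the SSE2 variant. The source is identical; the assumption
   is that its translation unit would be built with something like
   -msse2 -mfpmath=sse so the compiler emits SSE code for it. */
static void scale_sse2(float *v, size_t n, float k) {
    for (size_t i = 0; i < n; ++i) {
        v[i] *= k;
    }
}

/* Report whether the running CPU supports SSE2. Falls back to "no" on
   compilers or targets where the builtin is not available. */
static int cpu_has_sse2(void) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") != 0;
#else
    return 0;
#endif
}

/* Pick the kernel once at startup based on the CPU feature, not the vendor. */
static scale_fn pick_scale(void) {
    return cpu_has_sse2() ? scale_sse2 : scale_baseline;
}
```

A caller would store the result of `pick_scale()` in a function pointer at startup and route every call through it. Because the check is on the feature bit rather than the vendor string, old CPUs without SSE2 still run the x87 path, and any CPU that reports SSE2 gets the faster one.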

