CPU Benchmarks (FP)
Benchmarks that stress the floating-point unit (FPU) tend to favor the K7 core, and are generally considered the P4 core’s weak point. Unfortunately, I don’t have a Fortran compiler that will generate the SPEC binaries, so the FP results here are limited to the PCMark 2002 and Sandra benchmarks. I had considered using ScienceMark, but version 1 did not have any P4 ‘optimized’ binaries (though it had both PIII and K7 binaries), and version 2 was still in beta at the time I was performing these tests. I will be generating some results from ScienceMark, as well as some other benchmarks, in the future.
The Audio Conversion test from PCMark 2002 performs some MP3-type decoding, and as you can see, the K7 dominates this test, as most would expect. Willamette also performs much as expected. HW Data Prefetch seems to have some impact, as does FSB speed (similar results can be seen in the “Data Prefetch Logic – What Is It Worth?” article).
The 3D Vector test (PCMark 2002) also seems to favor the K7, though the PIII T does fairly well. This would indicate that the HW Data Prefetch feature is important here, which the article on Data Prefetch also confirms. Willamette again comes in dead last.
Whetstone (Sandra) is not considered a ‘realistic’ FP benchmark anymore, though it does show here that the P4 does not have a ‘strong’ FPU compared with either the P6 or K7 cores. Since it is a relatively small benchmark, it will tend to exaggerate differences that come from the cache implementations of the various processors. Note the ‘P4 – SSE2’ entries, which show the benefit of SSE2-specific code over x87 FP code. If applications were to actually take advantage of SSE2, Intel’s choice to de-emphasize the x87 FPU would not appear to be the mistake that many people consider it today. With AMD’s Hammer implementing this instruction set, that may not be such an unlikely possibility.
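To give a feel for what ‘relatively small’ means, here is a minimal sketch of a Whetstone-style kernel. It is not Sandra’s code: the function names are hypothetical, and the update formulas and the constant 0.499975 are assumptions loosely modeled on the first module of the classic Whetstone source. The point is only that the whole working set is four scalars and one constant, so a compiled version of a loop like this never leaves the fastest cache levels and mostly measures the FP core itself. In Python the interpreter overhead dominates, so the timings say nothing about the FPU.

```python
import time

# Damping constant; assumed from the classic Whetstone source, not from Sandra.
T = 0.499975


def whetstone_like_kernel(iterations):
    """Run a tiny FP loop over four scalars (hypothetical illustration)."""
    x1, x2, x3, x4 = 1.0, -1.0, -1.0, -1.0
    for _ in range(iterations):
        # Each value depends on the ones just updated, so the loop is a
        # chain of dependent adds and multiplies on a handful of values.
        x1 = (x1 + x2 + x3 - x4) * T
        x2 = (x1 + x2 - x3 + x4) * T
        x3 = (x1 - x2 + x3 + x4) * T
        x4 = (-x1 + x2 + x3 + x4) * T
    return x1, x2, x3, x4


def time_kernel(iterations):
    """Return (elapsed seconds, final values) for one run of the kernel."""
    start = time.perf_counter()
    result = whetstone_like_kernel(iterations)
    elapsed = time.perf_counter() - start
    return elapsed, result
```

Calling `time_kernel(1_000_000)` returns the elapsed time and the four final values. A real comparison of x87 against SSE2 code would need the same kernel written in a compiled language and built once for each instruction set.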
The Multimedia FP test from Sandra is intended to show what impact MMX/SSE/SSE2 will have on FP performance. Notice that Willamette does much better here than in Whetstone, but it still doesn’t hold a candle to the other processor cores.