Data Prefetch Logic – What is it Worth?

CPU Benchmarks

The first set of CPU results I’ll show are from PCMark2002. Again, these tests are interesting in that they each appear to rely on different features of the processor, and once those are quantified (if possible) it may be a very useful benchmark to ‘get inside’ the architecture a bit. I am not interested in the aggregate scores for either memory or CPU, as I am not trying to determine which part is ‘better’.

Processor            PIII Cu   PIII T    PIII Cu   Celeron T
FSB                  133MHz    133MHz    100MHz    100MHz
JPEG                 12.1      12.1      12.1      12.1
Zlib Compression     5.6       5.6       5.5       5.4
Zlib Decompression   50.6      54.5      47.4      47
Text Search          82.5      160.3     74.9      74.3
Audio Conversion     58.1      62        55.9      58
3D Vector            37.6      48        35.6      36.2

As noted in the article ‘PCMark2002 – A First Look’, some of these tests seem to have little or no reliance on memory accesses, while others do. The JPEG and Zlib Compression tests seem pretty unfazed by either an FSB change or by the Data Prefetch logic. Zlib Decompression, Audio Conversion and 3D Vector Calculation get a bit of a benefit from the faster FSB (roughly 4 to 7% going from 100MHz to 133MHz) and a bit from the Data Prefetch logic (about 7 to 8% for the first two, and closer to 28% for 3D Vector), so there seems to be some memory access going on in those tests, but not much. Text Search really benefits a great deal from the prefetch feature: the score goes from 82.5 to 160.3, an increase of about 94%, so bandwidth and throughput are extremely important here.
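The percentages behind these observations can be worked out directly from the table. The following is only a rough sketch in Python, assuming the scores are transcribed in the column order shown above; the names `SCORES`, `pct_gain` and `gains` are my own for illustration and are not part of PCMark2002.

```python
# PCMark2002 CPU scores transcribed from the table in this article.
# Column order: PIII Cu @ 133MHz, PIII T @ 133MHz, PIII Cu @ 100MHz, Celeron T @ 100MHz
SCORES = {
    "JPEG":               (12.1, 12.1, 12.1, 12.1),
    "Zlib Compression":   (5.6, 5.6, 5.5, 5.4),
    "Zlib Decompression": (50.6, 54.5, 47.4, 47.0),
    "Text Search":        (82.5, 160.3, 74.9, 74.3),
    "Audio Conversion":   (58.1, 62.0, 55.9, 58.0),
    "3D Vector":          (37.6, 48.0, 35.6, 36.2),
}


def pct_gain(new, old):
    """Percentage change of `new` relative to `old`."""
    return (new - old) / old * 100.0


def gains(scores):
    """For each test, return the gain attributed to prefetch and to the FSB.

    prefetch: PIII T vs. PIII Cu, both at 133MHz FSB
    fsb:      PIII Cu at 133MHz vs. PIII Cu at 100MHz
    """
    result = {}
    for test, (cu133, t133, cu100, _celeron100) in scores.items():
        result[test] = {
            "prefetch": pct_gain(t133, cu133),
            "fsb": pct_gain(cu133, cu100),
        }
    return result


if __name__ == "__main__":
    for test, g in gains(SCORES).items():
        print(f"{test:<20} prefetch {g['prefetch']:+6.1f}%   FSB {g['fsb']:+6.1f}%")
```

Attributing the whole PIII T vs. PIII Cu difference to the prefetch logic is the assumption this article makes; the sketch simply follows it.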

When you look at the two 100MHz FSB parts, you can see that the core of the processor did not change at all outside of the one feature present on the PIII Tualatin part. Since the PIII Coppermine and Celeron T are virtually identical, I did not continue to run tests on the Celeron part, so only three parts will be shown in the remaining benchmarks.
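As a quick sanity check on the ‘virtually identical’ claim, the two 100MHz columns can be compared test by test. This is a small sketch under the same assumption as before, namely that the scores are copied correctly from the table; `CU_100`, `CELERON_T_100` and `relative_differences` are illustrative names only.

```python
# Scores for the two 100MHz FSB parts, transcribed from the table in this article.
CU_100 = {
    "JPEG": 12.1,
    "Zlib Compression": 5.5,
    "Zlib Decompression": 47.4,
    "Text Search": 74.9,
    "Audio Conversion": 55.9,
    "3D Vector": 35.6,
}
CELERON_T_100 = {
    "JPEG": 12.1,
    "Zlib Compression": 5.4,
    "Zlib Decompression": 47.0,
    "Text Search": 74.3,
    "Audio Conversion": 58.0,
    "3D Vector": 36.2,
}


def relative_differences(base, other):
    """Percentage difference of `other` relative to `base`, per test."""
    return {test: (other[test] - base[test]) / base[test] * 100.0 for test in base}


if __name__ == "__main__":
    diffs = relative_differences(CU_100, CELERON_T_100)
    for test, d in diffs.items():
        print(f"{test:<20} {d:+5.1f}%")
    # The test with the largest gap, in either direction.
    worst = max(diffs, key=lambda t: abs(diffs[t]))
    print(f"Largest gap: {worst} ({diffs[worst]:+.1f}%)")
```

Every test lands within a few percent, which is in line with dropping the Celeron from the remaining benchmarks.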

I would love to have shown some SPEC CPU2000 results here; however, the Coppermine parts would not complete a full SPEC run no matter how many times I tried with various options and timings. Apparently, some very small errors occur at those speeds that none of the other benchmarks detected, but SPEC did. Overclockers beware! If you want to see how stable a processor really is when overclocked, try SPEC CPU2000…

To try to ‘compensate’ for this limitation, I ran Tim Wilkin’s ScienceMark benchmarks. These are not all CPU limited, but then neither are the SPEC tests, and ScienceMark is the closest benchmark to SPEC that I can think of. These results are linked here as separate pages (official result pages generated by the benchmark itself):

Unfortunately, I am not very familiar with ScienceMark, having never used it before, so readers will have to interpret the results themselves and perhaps discuss them on the RWT Forum. I obviously need to spend some time analyzing the results, and possibly communicate with the author to find out more. The only thing that jumps out at me is that the Primordia test gains a great deal from the Data Prefetch logic, while the other two tests do not. This leads me to believe that the other two tests are heavily CPU bound (running mostly out of cache), while the Primordia test benefits from higher bandwidth. One other interesting result is that the Cache Latency numbers appear to bear out the PCMark2002 results shown on the previous page: the L2 cache latency is slightly worse in the Tualatin part than in the Coppermine part.

