Though my original intent was to isolate the Data Prefetch Logic feature and determine its actual performance benefit, I think these results provide some insight into a few other issues and raise some interesting questions. For example, what is the reason for the increased L2 cache latency in the Tualatin part? Also, is SPEC CPU2000 a good test of processor reliability, given that it seemed to crash on the ‘overclocked’ Coppermine part? If so, the statements of ‘reliable overclocking’ in many publications may be false simply because the proper tools were not used. I would like to see this looked into, because many people point to such overclocked results as ‘proof’ of how much headroom a part has. If not for the SPEC problems, I might have concluded that the Coppermine PIII could easily have been pushed to 1.2GHz if Intel had wanted to, but now I am not convinced of this at all.

I believe that these results conclusively prove that the Tualatin-based Celerons do not include the Data Prefetch feature. This should not have been in question, since the data sheets for both products are fairly clear on the subject, yet many people still seem to think that the feature exists on all Tualatin-based processors. Another issue I saw raised while researching data prefetch is that the benefit of this feature would be limited on the Pentium III by the bandwidth limitations of the GTL+ bus. These results suggest that one would very rarely, if ever, saturate the bus during normal usage. This might be more of a concern for a server, but probably not for a desktop system.
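
To make the bus saturation argument concrete, here is a rough sketch of the arithmetic in Python. It assumes a 64-bit (8-byte) data bus that moves one transfer per bus clock, and uses a 133MHz bus clock purely as an example. The function names and the 'measured' figure are placeholders of my own and are not taken from the results above.

```python
def peak_bus_bandwidth_mb_s(fsb_mhz: float, bus_width_bytes: int = 8) -> float:
    """Theoretical peak in MB/s: one transfer per bus clock times the bus width.

    Assumes a single-pumped bus with an 8-byte (64-bit) data path.
    """
    return fsb_mhz * bus_width_bytes


def bus_utilization(measured_mb_s: float, fsb_mhz: float) -> float:
    """Fraction of the theoretical peak consumed by a measured transfer rate."""
    return measured_mb_s / peak_bus_bandwidth_mb_s(fsb_mhz)


if __name__ == "__main__":
    fsb_mhz = 133.0              # example bus clock (assumption)
    measured_mb_s = 500.0        # hypothetical sustained rate, NOT a measured result

    peak = peak_bus_bandwidth_mb_s(fsb_mhz)
    share = bus_utilization(measured_mb_s, fsb_mhz)
    print(f"peak: {peak:.0f} MB/s, utilization: {share:.0%}")
```

Substituting a sustained figure from a bandwidth test such as STREAM gives a quick estimate of how close a given workload comes to the theoretical limit of the bus.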

Of course, the main purpose of this test was to look at Data Prefetch, and what we see is that it provides a performance improvement of between 5% and 10% in most ‘real world’ situations. Some applications may benefit even more, as the STREAM and Primordia subtests of ScienceMark indicate (almost 30% improvement in both). Using this information, it may be possible to perform other processor comparisons, determine the effects of specific features, and perhaps make more accurate predictions of the performance of future processors. Obviously much more research is necessary for this, and it will likely not be as easy to isolate an individual feature as it was in this case.

