This is just a first cut at evaluating these benchmarks, and much more data is needed before any accurate and verifiable conclusions can be drawn. In particular, it would be interesting to see how PIII, Celeron and P4 processors fare under the same scrutiny I have given to the Athlon/Duron processors. If the profiles at various CPU and FSB speeds, as well as memory and cache sizes, remain similar, this would be useful information when evaluating different components. If, however, there are differences, it would be necessary to perform additional analysis to determine what features of those processors are causing the difference and then evaluate whether that relates to real world experiences.
However, even with the limited data provided here, there are some conclusions that can be made. First and foremost is that, as I suggested earlier, these tests are most suitable for identifying the performance differences of complete systems, rather than individual components. For example, comparing a P4 system with either SDRAM or DRDRAM against an Athlon system using DDR introduces so many variables that it becomes impossible to determine which one is responsible for any performance differences. The use of component level benchmarks might provide some insight, but even that requires much work and analysis.
On the other hand, we can see that both Business and Content Creation Winstone like faster processors, and get close to a linear improvement as speeds increase. Therefore, if two processors with different feature sets are tested (such as P4 and Athlon), and one scales better or worse than the other, we can make some reasonable conclusions about the relative value of those features in real world situations.
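One rough way to put a number on "scales better or worse" is to compare the percentage gain in the benchmark score with the percentage gain in clock speed. The sketch below is only an illustration of that arithmetic: the function name `scaling_efficiency` and the clock speeds and scores in the usage example are hypothetical and are not measurements from this article.

```python
def scaling_efficiency(clock_lo, score_lo, clock_hi, score_hi):
    """Return the fraction of a clock-speed increase that appears in the score.

    A value near 1.0 means the score rises almost linearly with clock speed;
    a lower value suggests something else (memory, cache, disk) is limiting.
    """
    clock_gain = clock_hi / clock_lo - 1.0   # relative clock increase
    score_gain = score_hi / score_lo - 1.0   # relative score increase
    if clock_gain <= 0:
        raise ValueError("clock_hi must be greater than clock_lo")
    return score_gain / clock_gain


if __name__ == "__main__":
    # Hypothetical numbers for two imaginary processors, for illustration only.
    cpu_a = scaling_efficiency(1000, 40.0, 1200, 47.0)
    cpu_b = scaling_efficiency(1000, 38.0, 1200, 42.0)
    print(f"CPU A scaling efficiency: {cpu_a:.2f}")
    print(f"CPU B scaling efficiency: {cpu_b:.2f}")
```

Comparing this ratio between two processor families, rather than the raw scores, keeps the focus on how each design responds to extra clock speed. It still says nothing about which specific feature is responsible for a difference.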
Content Creation Winstone is obviously more dependent upon memory bandwidth and latency than Business Winstone, since it shows more ‘sensitivity’ to changes in FSB speed, cache size and memory size. While one might claim that this makes Content Creation Winstone a better test of differences in the memory subsystem, that would only be true if Content Creation Winstone reflects real world usage. It is probably fair to assume that even in a typical office environment, applications other than the MS Office apps are used. For example, office workers using MS FrontPage may maintain internal web sites, and these same people may convert internal documents and graphics to Adobe formats.
One result that might be a little surprising is that, with the exception of the effect of going from 128MB to 256MB of memory, the profiles of both benchmarks are very similar. With additional data points and a greater variety of components, it would be very interesting to see if this trend continues. If it does, it might make reviewers' jobs a little easier, since it would not be necessary to run both benchmarks every time.
Hopefully, the information provided here will assist readers in independently interpreting the results of various publications using these benchmarks, as well as identifying the applicability of the tests to their own usage. Obviously, much more investigation is in order, such as identifying how much impact features such as MMX, SSE and 3DNow! have on the results, as well as how the data from additional component testing fits in with that presented here.