PCMark2002 is a component-level benchmark developed and distributed by MadOnion. Many of the algorithms it uses are based upon publicly available source code; however, they appear to have been chosen for specific attributes (such as working set size) and may not be the best representative examples of how a real-world application will perform. According to the product description, it is designed to be a ‘unified’ benchmark to test PCs on any platform, specifically geared towards the home and office user (laptops, desktops and workstations). Last month, ExtremeTech published an article on PCMark2002 and posed the question of whether it really is a good CPU benchmark, given the complexity of today’s microprocessors. Based upon my initial review, the results are interesting, and I do think there is some promise here if the results are interpreted correctly.
This is not a full-scale investigation into PCMark2002, but a quick look at some results from in-depth testing I’ve been performing on Pentium III and Pentium 4 processors for an upcoming article. The testing methodology I have used restricts the comparison to CPUs and memory, so the results shown here do not include hard drive and video card performance numbers. It is interesting to note that these hard drive and video card numbers remained fairly consistent across platforms (generally within a few percent), so the isolation of components seems to have been done reasonably well.
The first thing to understand is that the results from this benchmark cannot be directly compared to application-level benchmarks, such as the Winstone and SysMark suites covered here over the past few months. These application benchmarks are intended to test the performance of a complete system, and it is difficult, at best, to identify the effect of a specific component when comparing it requires other changes as well (such as chipset and memory changes when testing competing CPUs). PCMark2002, on the other hand, isolates each component as much as possible, so the effects of cache size and speed, CPU speed and FSB speed are readily apparent, as will be seen later.
While PCMark2002 is intended to isolate components for exactly this purpose, the question that arises is whether these results can be related to ‘real world’ performance in any meaningful way. Many features implemented in modern CPUs are based upon how real applications reference memory, so isolating CPU performance from memory accesses may not provide an accurate picture of how real applications will run. It may be possible to use both the CPU and memory results to come up with an estimate of how the overall system will perform, but I have not spent enough time with the results of this and other benchmarks on the same systems to determine that yet.
PCMark2002 focuses on the four major components or subsystems in the PC that affect performance: CPU, memory, hard disk I/O and graphics. The graphics subsystem can be tested for performance, quality or both. One additional set of tests (called the ‘Crunch’ tests) is intended to show how the system performs under a ‘full load’. It is also possible to change the drive location for temporary files, and to select the number of times the tests are performed. If more than one iteration is chosen, all results are averaged, which is a more desirable method than the way Winstone reports scores (which is to report the highest score of 5 iterations). One complaint I do have is that none of these settings are saved, so each time you start the benchmark you have to respecify your preferences. I did not realize this during the first few tests, so only one iteration was being performed. Rather than set up my test systems again, I decided that for a ‘first look’, numbers from a single iteration of the tests would be good enough. Were I using it for a real comparison of processors, chipsets, hard drives, etc., I would have set the loop counter to at least 10 so as to generate numbers that would presumably average out to the ‘typical’ case.
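To illustrate why averaging is the more conservative way to report a score, here is a minimal Python sketch of the two methods described above. The scores are made-up numbers, not measured results, and the function names are purely illustrative; neither benchmark exposes anything like this.

```python
def report_average(scores):
    """Average every iteration, as PCMark2002 is described as doing."""
    return sum(scores) / len(scores)


def report_best(scores):
    """Report only the highest iteration, as Winstone is described as doing."""
    return max(scores)


# Hypothetical scores from five iterations (assumed values for illustration).
# Four runs cluster tightly; one run is an unusually good outlier.
runs = [1000, 1010, 990, 1005, 1060]

print("average of all runs:", report_average(runs))
print("best single run:    ", report_best(runs))
```

With these assumed numbers, the best-of-five figure is set entirely by the one outlier, while the average stays close to the cluster of typical runs. The more iterations are averaged, the less any single unusual run can move the reported score.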
As mentioned, these results came from a project that should result in an article about PIII and P4 processors, hopefully in a few weeks, using Winstone and SysMark benchmarks. I decided to run PCMark2002 tests primarily due to the ExtremeTech article mentioned above, as I like to see things for myself. While comparing results from P4 tests that included both Willamette and Northwood cores, I noticed some interesting trends. In order to get a little more information, I then tested several 300MHz processors (PII, Celeron 300A and Celeron 300) to see what fell out. See the System Disclosure page. Let’s take a look at the results…