System Line-up and Methodology
For this preview, we used two PDKs: a newer Woodcrest system and a Dempsey system from our prior review. PDKs are pre-production systems that key partners use to validate hardware, software, drivers, operating systems, compilers, etc. The system we received and tested is fairly modest, and should provide good insight into realistic performance. This system will be released in June, so the results likely underestimate actual performance; a month of tuning should improve the situation somewhat.
The point of this preview is to examine how the Woodcrest and Blackford platform will improve over the existing Dempsey and Blackford platform. We do not have newer socket AM2 Opterons for comparison, although some estimates can be made. The two systems we used for testing are summarized below:
Since we are doing a server comparison, Hyper-Threading was enabled for the Dempsey system. The snoop filter for Blackford was disabled in the BIOS.
Our methodology for testing was rather simple. Each system had a pair of hard drives: we installed Microsoft Windows Server 2003 EE (64 bit) on one drive and used the other drive for applications and data. Then we installed all the relevant software, in particular the JVM and .Net Runtime 2.0 (64 bit). Most benchmarks were run 4 times: the first run warmed up the caches, and the following 3 runs were measured. All tests were run in 64 bit mode, except for the Java benchmarks, which used 32 bit JVMs due to inter-operability concerns. Certain benchmarks, such as SPECjbb2005 and XML Test 1.1, do not require warm up runs. Our results are the averages of the three valid runs, and the system was shut down after each benchmark. Since few benchmarks are perfectly repeatable (i.e. results differ for each run), the standard deviation of the three runs is also reported. Factors that can cause variation include the operating system’s scheduling decisions, multi-threading, inter-processor communication, I/O, etc. In general, the standard deviations were all very small, and even the smallest measured differences were statistically significant.
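As an illustration of the reporting step described above, the following is a minimal sketch of how one warm up run might be discarded and the remaining three runs reduced to an average and a standard deviation. The function name `summarize_runs`, the `warmup_runs` parameter, and the scores are hypothetical and are not measurements from this preview. The sketch also assumes the sample standard deviation (n - 1 in the denominator); the text does not say which form was used.

```python
import statistics


def summarize_runs(scores, warmup_runs=1):
    """Return (mean, sample standard deviation) of the measured runs.

    scores      -- results in the order the runs were executed
    warmup_runs -- leading runs to discard (0 for benchmarks that
                   need no warm up run)
    """
    measured = scores[warmup_runs:]
    if len(measured) < 2:
        raise ValueError("need at least two measured runs for a standard deviation")
    # statistics.stdev computes the sample standard deviation (n - 1),
    # which is an assumption here, not something the article specifies.
    return statistics.mean(measured), statistics.stdev(measured)


# Hypothetical scores: one warm up run followed by three measured runs.
scores = [90.0, 100.0, 102.0, 98.0]
mean, stdev = summarize_runs(scores)
print(f"mean = {mean:.1f}, standard deviation = {stdev:.1f}")
```

For a benchmark that needs no warm up run, the same function could be called with `warmup_runs=0` on three scores.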