In our last article, we outlined the architecture of the Bensley platform with the Blackford chipset and compared it with the prior-generation platform built on the Lindenhurst chipset and Nocona CPU. We also looked at Intel’s estimates for the performance of fully loaded and highly tuned systems. However, few fully loaded systems are purchased, and many users simply do not have the expertise to tune their entire application stack. Consequently, many of the more prominent industry-standard benchmarks represent maximums rather than reasonably attainable performance.
Fortunately, Intel was kind enough to provide us with a Bensley platform development kit (PDK) for benchmarking purposes. PDKs are not actual products; they are pre-production systems that let key partners validate hardware, software, drivers, operating systems, compilers and the like. The system we received and tested is fairly modest, and should provide good insight into realistic performance. However, since this system will not be available until March of 2006, our results probably underestimate the performance of shipping systems; five months of tuning should improve performance by a non-negligible amount.
The point of this preview is to examine how the Bensley/Blackford platform will improve over the existing Nocona/Lindenhurst platform. We do not have an AMD K8 system for comparison, nor would such a comparison be particularly insightful; a cutting-edge K8 system will likely be slightly dated by the time Bensley arrives in the first quarter of 2006. Rather than estimating where AMD will be in the future, we will instead compare Bensley with a known quantity. The two systems we use for testing are described below:
Table 1 – System Configuration
Since we are doing a server comparison, Intel’s Hyper-Threading was enabled on both systems. The Bensley platform appears to have far more resources than the Nocona system, but that is to be expected; the Nocona system is roughly a year old. Furthermore, the Bensley system has to support twice as many cores, so we would expect it to have roughly twice the bandwidth, memory capacity, etc. More important than the aggregate statistics is how the two systems compare on a per-core basis.
Table 2 – Resources per Core
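Bensley is certainly a step forward on a per-core basis, with 34% more bandwidth and twice the cache. By the common rule of thumb that miss rate scales with the inverse square root of cache size, doubling the cache should reduce the CPU's bandwidth needs by a factor of around √2.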
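To make the √2 estimate concrete, here is a minimal sketch of the arithmetic. It assumes the square-root rule of thumb (miss rate proportional to the inverse square root of cache size), which is an approximation rather than something we measured. The function name and the combined "headroom" figure are ours, for illustration only.

```python
import math

def relative_miss_traffic(cache_scale, exponent=0.5):
    """Memory traffic relative to the baseline when the cache grows by
    `cache_scale`, ASSUMING miss rate ~ cache_size ** -exponent.
    exponent=0.5 is the square-root rule of thumb, not a measured value."""
    if cache_scale <= 0:
        raise ValueError("cache_scale must be positive")
    return cache_scale ** -exponent

# Doubling the cache per core, as in the per-core comparison above.
traffic = relative_miss_traffic(2.0)

# 34% more bandwidth per core, taken from the text.
bandwidth_gain = 1.34

# Illustrative only: bandwidth available relative to bandwidth demanded.
effective_headroom = bandwidth_gain / traffic

print(f"relative memory traffic: {traffic:.3f}")
print(f"effective bandwidth headroom: {effective_headroom:.2f}x")
```

Under that assumption, the extra cache and the extra bandwidth compound, so each core should see noticeably more than a 34% improvement in effective memory bandwidth. Real workloads will deviate from the rule of thumb, in some cases substantially.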
Our testing methodology was rather simple. Each system had its own hard drive, on which we installed Microsoft Windows Server 2003 and upgraded it to SP1. Then we installed all the relevant software, in particular the JVM and .NET Runtime Beta 2.0. Each benchmark was run four times: the first run warms up the caches, and the following three runs are measured. Our results are the averages of these three runs, and the system was rebooted after each benchmark. Since few benchmarks are perfectly repeatable (i.e. results differ from run to run), the standard deviation of the three runs is also reported. Factors that can cause variation include the operating system’s scheduling decisions, multi-threading, inter-processor communication, I/O, etc. In general, the standard deviations were all very small, so the results can be considered consistent. We will report the standard deviations for all benchmarks except SPECjbb2005, due to licensing concerns.
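The reduction step described above can be sketched as follows. This is an illustration rather than the script we actually used: the scores are hypothetical placeholders, the function name is ours, and we assume the sample standard deviation (dividing by n - 1), since the text does not say which form is reported.

```python
from statistics import mean, stdev

def summarize_runs(scores):
    """Summarize one benchmark's runs.

    `scores` lists every run in order; the first entry is the warm-up
    run and is discarded. Returns (mean, sample standard deviation)
    of the remaining measured runs.
    """
    if len(scores) < 3:
        raise ValueError("need one warm-up run plus at least two measured runs")
    measured = scores[1:]  # drop the cache warm-up run
    return mean(measured), stdev(measured)

# Hypothetical scores for a single benchmark: warm-up first, then 3 measured runs.
runs = [95.0, 100.0, 102.0, 98.0]
avg, sd = summarize_runs(runs)
print(f"mean = {avg:.2f}, std dev = {sd:.2f}")
```

With only three measured runs, the standard deviation is a rough indicator of repeatability rather than a tight statistical bound.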