Performance Per Watt: Dempsey and Woodcrest

Introduction

Earlier this year, we had an opportunity to examine the performance of Intel’s Dempsey and Woodcrest microprocessors when coupled with the Blackford chipset. Dempsey, which is essentially two Pentium 4 cores packaged together and talking to a front side bus, substantially improved the performance of Intel’s dual processor servers, although the power consumption was still rather high. Woodcrest, which as previously discussed uses the new Core microarchitecture, not only improved upon the performance of Dempsey, but also halved the power consumption. With the launch of Woodcrest, Intel has revitalized its server line-up and is looking to regain its prime position in the x86 server world.

Based on our previous analysis, Woodcrest improves performance by 11-40% over Dempsey, which is quite impressive considering the changes were all microarchitectural. However, we did not address energy and power efficiency for servers. The focus of this review is to compare the energy and power efficiency of two generations of Intel’s dual processor server platform.

Methodology

In order to obtain the most precise and useful data for comparisons, our power measurements were performed with a single system, changing only the microprocessors. This eliminates any differences in memory systems, power supplies, fans, voltage regulation modules and a dozen other minor factors which complicate comparisons between different systems. The one variable that cannot be controlled for is the variation in power supply efficiency across load levels. An 800W power supply (such as the one used here) will run more efficiently at 700W than at lower levels. A “Starlake” reference design from Intel was used, as detailed below.


System Specifications

An Extech 380803 power meter was used to measure the current drawn by the whole system at the wall, at 1 second intervals. This means that any measurements will incorporate the inefficiency of the power supply (according to Intel, the PSU is about 80-85% efficient) and the activity level of the fans (cooler systems use less power for fans), as well as the draw of the MPU, chipset, memory and other components. While it is possible to measure the current drawn by the MPU alone (modulo the VRM efficiency), servers are rarely cobbled together piecemeal and instead tend to come as complete systems. Consequently, ‘at the wall’ measurements are more relevant and meaningful.
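As a rough illustration of how a log of 1 second wall readings can be reduced to usable figures, the sketch below averages the samples, sums them into energy, and applies the 80-85% PSU efficiency range quoted above to bound the power delivered downstream of the supply. The function names and the sample values are hypothetical and are not taken from the measurements in this review.

```python
def average_power_w(samples_w):
    """Mean wall power in watts over a list of equally spaced samples."""
    return sum(samples_w) / len(samples_w)


def energy_joules(samples_w, interval_s=1.0):
    """Energy in joules, treating each sample as constant over its interval."""
    return sum(samples_w) * interval_s


def dc_power_range_w(wall_w, eff_low=0.80, eff_high=0.85):
    """Bounds on power delivered by the PSU, given an assumed efficiency range."""
    return wall_w * eff_low, wall_w * eff_high


# Hypothetical readings in watts, one per second (illustrative only).
samples = [300.0, 310.0, 305.0, 295.0]

avg = average_power_w(samples)
print("average wall power (W):", avg)
print("energy over the log (J):", energy_joules(samples))
print("estimated DC-side power (W):", dc_power_range_w(avg))
```

The efficiency bounds are only an approximation, since the real PSU efficiency shifts with load level.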

We will revisit all of the benchmarks previously used for performance measurement, but look at the power draw during each benchmark. In conjunction with the existing performance numbers, we will quantitatively describe the energy or power efficiency. As a quick note, the performance numbers for Dempsey were generated using a slightly different system, although with the same processors. We have no reason to believe that the performance differences would be substantial.

Generally, most of the benchmarks used report one of two metrics: execution time (e.g. 73 seconds per simulation) or throughput (e.g. 60 transactions per second). Sungard ACR and SPECjbb2005 are examples of each type of benchmark, respectively. For throughput type benchmarks, the most appropriate efficiency metric is throughput/watt (e.g. 5 transactions/sec per watt). The easiest way to measure efficiency for a run to completion benchmark, like Sungard ACR, is to convert its execution time to a throughput measurement and then use throughput/watt. There are other ways, but they are somewhat more complex and beyond the scope of this article.
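The two efficiency calculations described above can be sketched as follows. This is a minimal illustration, not the exact procedure used for the results in this review: the function names are hypothetical, the choice of runs per hour as the throughput unit is an assumption, and the 300W average power figure is made up for the example.

```python
def throughput_per_watt(throughput, avg_power_w):
    """Efficiency for a throughput benchmark: work per second per watt."""
    return throughput / avg_power_w


def runs_per_hour(seconds_per_run):
    """Convert a run to completion time into a throughput figure."""
    return 3600.0 / seconds_per_run


def run_to_completion_efficiency(seconds_per_run, avg_power_w):
    """Efficiency for an execution time benchmark, as runs per hour per watt."""
    return runs_per_hour(seconds_per_run) / avg_power_w


# Hypothetical average wall power in watts (illustrative only).
avg_power_w = 300.0

# Throughput benchmark: 60 transactions per second.
print(throughput_per_watt(60.0, avg_power_w))

# Run to completion benchmark: 73 seconds per simulation.
print(run_to_completion_efficiency(73.0, avg_power_w))
```

Because throughput is the reciprocal of execution time, throughput/watt for a run to completion benchmark is inversely proportional to the energy consumed per run (average power multiplied by run time), so a higher score means less energy spent on each simulation.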
