The SPECpower_ssj2008 Benchmark
The first implementation of the SPECpower methodology is SPECpower_ssj2008. This workload is designed to measure server-side Java power efficiency and performance for small and medium-sized servers at graduated utilization levels (active idle, then in 10% steps up to full utilization), and is based on SPECjbb2005 with some necessary changes.
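The graduated load levels can be illustrated with a short sketch. The level fractions follow the description above; `calibrated_max_ops` is a hypothetical calibrated maximum throughput supplied for illustration, not a value from any published result:

```python
def load_level_targets(calibrated_max_ops):
    """Target throughput for each SPECpower_ssj2008 measurement interval:
    100% down to 10% of the calibrated maximum, plus active idle."""
    targets = [(f"{pct}%", round(calibrated_max_ops * pct / 100))
               for pct in range(100, 0, -10)]
    targets.append(("active idle", 0))  # no transactions, power still measured
    return targets

# Hypothetical calibrated maximum of 280,000 ssj_ops:
for name, ops in load_level_targets(280_000):
    print(f"{name:>11}: {ops:,} ssj_ops target")
```

This yields eleven measurement intervals in all, ten loaded plus active idle, which is what makes the benchmark sensitive to power management behavior at partial utilization rather than only at peak.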
SPECpower_ssj2008 is intended to exercise the CPU, caches, memory, system architecture, JVM and some OS components. Like its ancestor, ssj2008 is designed with a small I/O component. This ensures that the benchmark can easily scale from one to four sockets, without the expense of the massive storage arrays common in transactional or analytic database workloads, and can be run in a reasonable period of time. Ultimately, that is an implementation-specific choice, but one that is logical and encourages widespread use. The cheaper and easier it is to run a benchmark, the more likely it is to be routinely used.
Running the benchmark requires a server (the test system), a suitably accurate power analyzer, a temperature sensor and a control system (which could be a notebook). This is realistically about the smallest feasible configuration, since the control system must be separate from the workload it is driving and controlling. Figure 1 below shows the architecture of the test setup. The Control and Collection System (CCS) is a Java application that runs on the control system and orchestrates the testing, data collection and logging.
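The CCS's core job, pairing each measurement interval's throughput with the power-analyzer readings taken during it, can be sketched roughly as follows. The data structures and sample values here are hypothetical, and the overall ops/watt computation (total throughput over total average power, active idle included) reflects my understanding of the run rules rather than the actual CCS code:

```python
from statistics import mean

# Hypothetical per-interval records gathered by a CCS-like collector:
# (load level, ssj_ops completed, power-analyzer samples in watts).
intervals = [
    ("100%",        280_000, [245.0, 251.0, 248.0]),
    ("50%",         140_000, [180.0, 178.0, 182.0]),
    ("active idle",       0, [120.0, 121.0]),
]

def summarize(intervals):
    """Average power per interval, plus an overall ops/watt figure:
    sum of throughput divided by sum of average power, idle included."""
    per_level = [(level, ops, mean(watts)) for level, ops, watts in intervals]
    overall = (sum(ops for _, ops, _ in per_level)
               / sum(w for _, _, w in per_level))
    return per_level, overall

per_level, overall = summarize(intervals)
for level, ops, avg_w in per_level:
    print(f"{level:>11}: {avg_w:6.1f} W average")
print(f"overall: {overall:.1f} ssj_ops/watt")
```

Note that including active idle in the denominator penalizes systems that burn significant power while doing no work, which is precisely the behavior the benchmark was created to expose.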
Figure 1 – SPECpower_ssj2008 Architecture
SPECpower_ssj2008 has very rigorous run rules and requires that the full disclosure report contain comprehensive information about the system under test. The disclosure report describes pretty much every component of the system.
SPECpower_ssj2008 currently has 26 submissions from several different vendors. The vast majority used modern Intel CPUs, although one submission was for a 3.6GHz Xeon based on the Nocona core. Confirming what everyone knew, the older 90nm P4-based Xeon is about 1/8th as efficient as newer systems using Intel’s 45nm Xeons.
In general, SPECjbb tends to benefit strongly from cache (although ssj2008 was tuned to be less dependent on caching) and most JVMs prefer uniform memory latency – so it should be no surprise that Intel easily outperforms AMD. It is to AMD’s credit that they were willing to participate, given the degree to which their performance lags behind Intel’s (roughly a factor of 3X, 698 vs. 203). AMD’s sole submission uses dual-core Opterons, rather than the newer Barcelona – it would be nice to see tests for AMD’s newer server processors.
While SPECpower_ssj2008 is a great step forward, it is not perfect (no benchmark is!). The detailed system configuration disclosures are excellent, but a further step might be to require certain minimum system configurations, or a standardized configuration. For instance, Intel’s submissions all used 4 FB-DIMMs to minimize power consumption, while systems configured for maximum performance on SPECjbb2005 use 8 DIMMs or more. Similarly, every submission used a single hard drive to lower power consumption; in practice the vast majority of servers (even application servers) are equipped with at least two hard drives to support RAID1 or other redundancy schemes. On the other hand, it is fairly easy for an end-user to estimate how much additional power a hard drive or DIMM will consume, since their power draw is roughly constant. The only factor that varies much, the CPU’s power, is precisely what is being measured in SPECpower_ssj2008.
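Because DIMM and drive power is roughly constant across load levels, an end-user can sketch how a heavier configuration would shift a published score. The per-component wattages and the published result below are illustrative placeholders, not measured figures:

```python
# Illustrative constant draw per extra component, in watts (placeholders,
# not measured values).
EXTRA_WATTS = {"fb_dimm": 7.0, "hdd": 10.0}

def adjusted_efficiency(total_ops, total_watts, extra_dimms=0, extra_drives=0):
    """Rough ops/watt for a heavier configuration: since DIMM and drive
    draw is roughly load-independent, add it to every interval's average
    power and recompute the overall figure."""
    added = extra_dimms * EXTRA_WATTS["fb_dimm"] + extra_drives * EXTRA_WATTS["hdd"]
    # total_watts is the sum of average power over the 11 intervals,
    # so the constant extra draw is counted once per interval.
    return total_ops / (total_watts + 11 * added)

# Hypothetical published result, then the same system with 4 more
# FB-DIMMs (8 total) and a second hard drive for RAID1:
print(adjusted_efficiency(4_200_000, 6_000))
print(adjusted_efficiency(4_200_000, 6_000, extra_dimms=4, extra_drives=1))
```

The point of the exercise is that such back-of-the-envelope corrections are easy precisely because the non-CPU components draw nearly constant power, which is why the minimal configurations, while unrealistic, do not badly distort the comparison.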