XML Test 1.1
It is our pleasure to introduce a new benchmark, XML Test 1.1, for this review (and all of our future server reviews). XML Test was designed by Sun Microsystems to evaluate the performance of XML document processing using Java and .Net. While this may seem mundane, XML processing is quite interesting because it is an essential element of many web services implementations. XML is a platform-neutral, industry-standard document format. XML processing is also computationally taxing, and ubiquitous enough that some architects are considering incorporating hardware XML accelerators in the future. This benchmark models a Java server processing multiple XML documents in parallel; the documents are sample invoices for a company, of varying sizes.
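To give a concrete sense of the kind of work involved, the sketch below parses a small XML invoice with the standard Java DOM API (javax.xml.parsers) and counts its line items. The sample document, class name, and method name are hypothetical illustrations, not taken from the actual XML Test workload.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class InvoiceParseSketch {
    // Hypothetical sample invoice, standing in for the benchmark's documents.
    static final String SAMPLE =
        "<invoice id=\"42\"><item sku=\"A1\" qty=\"2\"/><item sku=\"B7\" qty=\"1\"/></invoice>";

    // Parse the XML into a DOM tree and return the number of <item> elements.
    static int countItems(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagName("item").getLength();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("items = " + countItems(SAMPLE));
    }
}
```

Even this trivial parse exercises character decoding, tokenizing, and object allocation, which is why throughput on document after document stresses the CPU and memory subsystem.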
There are nine sub-tests, each using a different processing method, which are shown below in Figure 7. Every test consists of a warm-up period followed by a steady state measurement period. The benchmark was run with all default settings, except that the heap size was set to 1024-2040MB; the 32-bit JRockit JVM cannot allocate more than 2040MB of heap, hence the somewhat odd upper bound. Due to the run time of the benchmark, only peak performance measurements were made. Perhaps in future reviews, data for scaling will be collected as well. The Dempsey system ran with 8 threads, while the Woodcrest system used 4.
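The warm-up/steady-state structure can be sketched as below. This is a minimal illustration of the measurement pattern (discard a warm-up window so the JIT and caches settle, then count completed transactions in a timed window), not the benchmark's actual harness; all names and timings are assumptions.

```java
public class SteadyStateSketch {
    // Run the task repeatedly for warmupMillis (results discarded), then for
    // measureMillis, returning completed iterations per second.
    static double measure(Runnable task, long warmupMillis, long measureMillis) {
        long end = System.currentTimeMillis() + warmupMillis;
        while (System.currentTimeMillis() < end) {
            task.run();  // warm-up: lets the JIT compile hot paths first
        }

        long count = 0;
        long start = System.currentTimeMillis();
        end = start + measureMillis;
        while (System.currentTimeMillis() < end) {
            task.run();  // steady state: these iterations are counted
            count++;
        }
        return count * 1000.0 / (System.currentTimeMillis() - start);
    }

    public static void main(String[] args) {
        // Stand-in task; the real benchmark would process one XML document here.
        double tps = measure(() -> { }, 200, 200);
        System.out.println("throughput ~ " + tps + " ops/sec");
    }
}
```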
Figure 7 – XML Test 1.1 Performance
According to its designers, XML Test generally scales quite well, and Woodcrest improves substantially over Dempsey: the performance increase is 25-45% across all nine tests. The gains shown in XML Test probably understate the advantage somewhat. Since the benchmark was designed for processing multiple documents in parallel, there should be little or no data sharing between different threads and processes. As a result, the benefits of Woodcrest's shared cache may not be readily apparent. Other commercial workloads may see more of a benefit from the shared cache (relative to shared-package designs like Dempsey).
XML Test is very well designed in terms of accuracy and repeatability: the standard deviations for each test were well under 3.5 transactions/sec. This is largely because the structure of the tests, with a warm-up period followed by a steady state measurement period, is superior to simple 'run once' benchmarks.
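Repeatability figures like these are typically the sample standard deviation of throughput across repeated runs. A minimal sketch of that computation, using hypothetical transactions/sec numbers, is:

```java
public class StdDevSketch {
    // Sample standard deviation (n-1 denominator) of repeated measurements.
    static double stdDev(double[] runs) {
        double mean = 0;
        for (double r : runs) mean += r;
        mean /= runs.length;

        double sumSq = 0;
        for (double r : runs) sumSq += (r - mean) * (r - mean);
        return Math.sqrt(sumSq / (runs.length - 1));
    }

    public static void main(String[] args) {
        // Hypothetical tx/sec results from four runs of one sub-test.
        double[] runs = {101.2, 99.8, 100.5, 100.1};
        System.out.printf("stddev = %.2f tx/sec%n", stdDev(runs));
    }
}
```

A spread this tight across runs is what lets small 25-45% deltas between platforms be read with confidence.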