MAQSIP-RT: An HPC Benchmark


MAQSIP-RT

One of the challenges with server reviews is finding the right benchmarks and workloads. Unsurprisingly, many server workloads involve providing a service to a client system, for instance serving up a webpage, executing a Java application, or running database queries. Multi-system test benches (i.e. using a client to drive the server) are more complicated to configure and require a larger time investment. Benchmarks like TPC-C represent the extreme case, where even a low-end configuration typically involves over a million dollars of hardware, concentrated in the storage system but also including clients and networking. The more complex the benchmark, the less likely it is to be used, and there is a very real risk that a complicated benchmark will fail to attract enough submissions to be interesting. A case in point is TPC-App, which was retired with only a single result.

A while back, one of our readers generously offered to help with a benchmark drawn from the realm of atmospheric science. Carlie Coats is the Chief Software Architect at Baron Advanced Meteorological Systems, which provides a variety of environmental modeling software. Applications of their software include real-time weather forecasting, air quality analysis and prediction, and high resolution wind analysis (e.g. assessing the suitability of a site for wind power). Chances are you’ve seen some of their handiwork on your local weather channel.

Carlie and BaronAMS graciously allowed us to use a version of the Multi-scale Air Quality Simulation Platform (MAQSIP-RT) for benchmarking. MAQSIP-RT is a flexible regional model of the chemical and physical processes responsible for the transport, transformation and deposition of particulate matter. For example, it has been used to predict ozone levels across much of the Northeastern US and to issue alerts as needed.

A given area of interest is represented with a uniform horizontal grid and a non-uniform vertical grid; the latter accounts for different layers of the atmosphere. The model is quite complicated and incorporates many factors such as mixing due to wind, the impact of clouds, and various chemical reactions. A paper by the scientists and developers describes MAQSIP-RT in considerable detail for those who are interested and have a sufficient background. For those not well versed in atmospheric chemistry, Carlie has a nice summary:

MAQSIP-RT uses a microprocessor/parallel-optimized QSSA (Quasi-Steady State Approximation) solver for a 40-species modified Carbon-Bond-IV atmospheric chemistry sub-model, Bott dynamics for transport, and Kain-Fritsch-McHenry clouds and aqueous chemistry.
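For readers unfamiliar with the term, the general idea behind a QSSA solver can be sketched in a few lines. This is only the textbook form of the technique, not the MAQSIP-RT implementation: it assumes that a species' production and loss rates are constant over a time step, so that dc/dt = P - L*c has a closed-form solution. The function name, the Euler fallback threshold and the numeric values are all hypothetical.

```python
import math

def qssa_step(c, production, loss_rate, dt):
    """Advance one species concentration by dt.

    Treats production (P) and loss_rate (L) as constant over the step
    and solves dc/dt = P - L*c exactly under that assumption.
    """
    if loss_rate * dt < 1e-8:
        # Negligible loss: avoid dividing by a tiny L, use forward Euler.
        return c + (production - loss_rate * c) * dt
    steady_state = production / loss_rate
    # Relax exponentially from the current value toward P / L.
    return steady_state + (c - steady_state) * math.exp(-loss_rate * dt)

# Hypothetical values, chosen only to show the behaviour.
c = 0.0
for _ in range(100):
    c = qssa_step(c, production=2.0, loss_rate=50.0, dt=0.1)
print(c)  # settles at production / loss_rate
```

A fast-reacting species (large loss rate) snaps to its steady-state value almost immediately, which is why this family of methods can take much larger time steps than a naive explicit integrator would tolerate on stiff chemistry.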

MAQSIP-RT primarily uses single precision floating point numbers, but some of the key constants for the chemical processes are denormals and require efficient handling. One of the advantages of MAQSIP-RT is that it has been specifically tuned for modern microprocessors. Some other prominent weather packages such as MM5 were originally developed for vector architectures (which lack caches), and have numerous artifacts of their origins in the code base. Consequently, the performance on modern systems can leave a bit to be desired.
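To make the denormal issue concrete, the short sketch below checks whether a value lands in the denormal range once it is stored in single precision. In IEEE 754 single precision, the smallest positive normal value is 2^-126 (about 1.18e-38); anything nonzero below that is stored with an all-zero exponent field, and many processors handle such values far more slowly than normal numbers. The sample constant is hypothetical and not taken from MAQSIP-RT.

```python
import struct

def is_float32_denormal(x):
    """True if x, stored as an IEEE 754 float32, is a denormal:
    nonzero, with an all-zero exponent field."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exponent = (bits >> 23) & 0xFF   # 8-bit exponent field
    mantissa = bits & 0x7FFFFF       # 23-bit fraction field
    return exponent == 0 and mantissa != 0

rate_constant = 3.0e-40  # hypothetical value, for illustration only
print(is_float32_denormal(rate_constant))
print(is_float32_denormal(1.0e-30))
```

The same value held in double precision would be an ordinary normal number, which is one reason the choice of single precision matters here.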

In this first piece, we will examine the performance of MAQSIP-RT on a standard Supermicro two-socket server. Hopefully, MAQSIP-RT will become an element of a Linux-based review process at Real World Tech.

Supermicro A+ Server

For our first demonstration of MAQSIP-RT, Supermicro graciously supplied us with one of their servers optimized for scientific workloads. The A+ Server 2021A-32R+F is a two-socket, Istanbul-based system. This is one of the first server systems based on a fully AMD platform, using the SR5690 chipset and the SP5100 south bridge in a 2U chassis.

The server has 16 DDR2 DIMM slots for up to 128GB, an 8-port SAS controller from LSI and two on-board Intel Gigabit Ethernet adapters. The AMD chipset also has a 6-port SATA 2.0 (3Gb/s) controller in the south bridge. For additional I/O, there are four PCI-e 2.0 slots: three configured with x8 signaling and one with x4. The physical slots are two x16 and two x8 in order to fit a wider range of cards, but the signaling limits the actual bandwidth. The aggregate PCI-e bandwidth is an impressive 28GB/s (counting both directions), matching half the I/O bandwidth of the CPU. Each x8 slot is a good match for up to two QDR x4 Infiniband ports or a single DDR x12 port, and can also support several 10GbE ports. To accommodate older devices, there are also two 133MHz PCI-X slots.
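The 28GB/s figure can be checked with some quick arithmetic. The sketch below assumes the standard PCI-e 2.0 rate of 5GT/s per lane with 8b/10b encoding, which leaves 500MB/s of data per lane in each direction; the slot widths are the ones listed above, and the variable names are ours.

```python
# PCI-e 2.0: 5 GT/s per lane, 8b/10b encoding -> 500 MB/s per lane, per direction.
MB_PER_LANE_PER_DIRECTION = 500

slot_lanes = [8, 8, 8, 4]  # electrical widths: three x8 slots and one x4

total_lanes = sum(slot_lanes)
per_direction_gb = total_lanes * MB_PER_LANE_PER_DIRECTION / 1000
bidirectional_gb = 2 * per_direction_gb

print(total_lanes, per_direction_gb, bidirectional_gb)  # 28 14.0 28.0
```

In other words, the headline number counts traffic in both directions at once; a one-way transfer across all four slots tops out at 14GB/s before protocol overhead.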

The chassis has 8 hot swap 3.5” drive bays on the SAS backplane, plus two fixed 3.5” drive bays. This is a generous allotment for direct attached storage, which is essential for high storage bandwidth and for minimizing the network bandwidth consumed by storage traffic (as opposed to communication with other cluster nodes or client systems).

On the power efficiency front, the 720W power supply is composed of two redundant modules and is certified for 80%+ efficiency. According to vendor testing, the efficiency is actually 90-93% at loads of 25% or higher.

The Supermicro system in question was configured with a pair of AMD’s 1.8GHz, 6-core microprocessors and 8 x 4GB DIMMs. The server booted SUSE 11 over the network and was configured without any local storage.

