Recent Articles

Parallelism at HotPar 2010

By Tarek Chammah | 2010.07.26
The 2010 HotPar workshop had a variety of papers focusing on the software and programming aspects of parallelism. Highlights include parallelization of the Firefox browser, Michael McCool's approach to parallel building blocks and Electronic Arts' Cascade system for handling state in video games. More hardware-centric topics include a controversial view that parallelism is irrelevant and the limits of GPU performance.
More >>

PhysX87: Software Deficiency

By David Kanter | 2010.07.05
PhysX is a key application that Nvidia uses to showcase the advantages of GPU computing (GPGPU) for consumers. PhysX executing on an Nvidia GPU can improve performance by 2-4X compared to running on a CPU from Intel or AMD. We investigated and discovered that CPU PhysX exclusively uses x87 rather than the faster SSE instructions. This hobbles the performance of CPUs, calling into question the real benefits of PhysX on a GPU.
More >>

MAQSIP-RT: An HPC Benchmark

By David Kanter | 2010.05.03
In this article, we test out a new HPC benchmark from one of our readers on an Istanbul server from Supermicro. MAQSIP-RT is a forecasting and analysis package that is commonly used throughout the weather and atmospheric chemistry communities. In our first run, we take a look at scalability and performance and find a benchmark that suits many of our needs.
More >>

Westmere Performance

By David Kanter | 2010.03.25
Westmere is a shrink to the 32nm process and has 50% more cores, 50% more last-level cache and several other improvements we detailed in our first article. In our second article on Westmere, we take a look at the performance of the Westmere-EP product, targeted at 2-socket servers. We compare the performance of Westmere to the socket-compatible, prior-generation Nehalem microprocessors, using the same server and same-frequency parts to see the actual benefits of Westmere.
More >>

Westmere Arrives

By David Kanter | 2010.03.17
Intel recently launched the 32nm Westmere microprocessor. In Intel's parlance, Westmere is a "Tick", re-using Nehalem's microarchitecture and moving to a new manufacturing process: the Xeon 56xx series is the result. We present a technical overview of Westmere, examining the improvements relative to the prior generation. In a follow-on review, we will show our own performance and power data.
More >>

8-Socket Commodity Servers: Flourish or Perish?

By David Kanter | 2010.03.08
In the past, 8-socket x86 servers were proprietary, expensive and unpopular. With the imminent release of Nehalem-EX, 8-socket commodity servers will be a reality. The software ecosystem has matured, leading to more scalable applications, but at the same time, core counts have climbed dramatically. Is there room for 8-socket x86 servers in the market?
More >>

NAND Flash: A Classic Disruptive Technology

By David Kanter | 2009.12.30
NAND flash is a welcome innovation in the storage market, but most analysis has centered on performance advantages for solid state drives (SSDs). This overlooks the equally or more important fact that NAND flash can be a less expensive solution than a hard disk for certain low-capacity applications. This is the classic hallmark of a disruptive technology.
More >>

Larrabee 1 Defers Graphics, Bins Rendering

By David Kanter | 2009.12.04
Larrabee is Intel's unique architecture for a family of throughput processors, developed for the graphics and HPC markets. We have recently learned that graphics products based on Larrabee 1, the first implementation, have been canceled and that it will instead be used as a software development vehicle. Larrabee's troubles lay in software, and now the question is what lies ahead for Larrabee and Intel's graphics products.
More >>

Inside Fermi: Nvidia's HPC Push

By David Kanter | 2009.09.30
In the last several years, the landscape for computing has become increasingly interesting and diverse. GPUs have gradually evolved to be less application-specific and slightly more generalized than their fixed-function ancestors. The changes started in the DirectX 9 time frame, with real floating point (FP) data types, but still fixed vertex, geometry and pixel processing. DX10 hardware was really the turning point, with unified shaders, relatively complete data types (i.e. integers were added) and slightly more flexible control flow. Today the high end is a four-horse race between AMD (née ATI), Intel's and AMD's integrated graphics and Larrabee, and Nvidia. All four face different goals and constraints, and hence have taken slightly different paths. It is in this context that Nvidia has announced a next-generation architecture, Fermi, which aims for even greater performance, reliability and programmability, unlocking even more software capabilities.
More >>