Recently, Ken Farmer of LinuxHPC.org and David Kanter of Real World Technologies had an opportunity to interview Jason Pettit of SGI. Jason is the Product Line Manager for the SGI Altix 3000 and has been working with various Linux-based systems since 1998.
The Origins of the Altix
RWT: What is the architecture of the Altix systems, and were any additional components used in the NASA installation?
Jason: The SGI Altix 3000 server uses a near-uniform memory access architecture that SGI first introduced in 1997 in the Origin 2000 server, which used the 64-bit MIPS processor and SGI’s UNIX-based IRIX operating system. SGI updated this architecture in 2000 when we introduced the Origin 3000 servers, which use a modular “brick”-based architecture that allows users to add memory, processors, and system I/O independently, giving them the flexibility to build systems that fit their applications’ needs exactly. In 2003, SGI introduced the SGI Altix 3000, which updated this well-proven architecture by offering the Intel Itanium 2 processor and scalable, production-ready Linux. Both the SGI Origin and SGI Altix systems are built around our memory channel interconnect, called NUMAlink, which creates a flexible switch-based fabric made up of routers and cables that acts as the system’s backplane. Each NUMAlink 3 connection has an aggregate bi-directional bandwidth of 3.2GB/second. This fabric of routers and cables allows SGI Altix 3000 hardware to be configured with up to 512 processors and 8 terabytes of cache-coherent memory today. This capability is part of the standard system design, so no special hardware was required for the NASA system.
Figure 1 below illustrates the architecture of each ‘node’ of an Altix system.
Figure 1 – Altix C-Brick Schematic
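As a rough back-of-the-envelope reading of the figures Jason quotes above, the short Python sketch below works out the average memory per processor at the maximum configuration and the per-direction share of a NUMAlink 3 connection. It is illustrative only: the function names are hypothetical, binary units (1 TB = 1024 GB) are assumed, and the even split of the 3.2GB/second aggregate across the two directions is an assumption not stated in the interview.

```python
# Figures quoted in the interview.
MAX_PROCESSORS = 512            # maximum processors in one NUMAlink fabric
MAX_MEMORY_TB = 8               # maximum cache-coherent memory, in terabytes
NUMALINK3_AGGREGATE_GB_S = 3.2  # aggregate bi-directional bandwidth per connection


def memory_per_processor_gb(total_tb: float, processors: int) -> float:
    """Average memory per processor in GB (assumes 1 TB = 1024 GB)."""
    return total_tb * 1024 / processors


def per_direction_bandwidth_gb_s(aggregate_gb_s: float) -> float:
    """Bandwidth in each direction (assumes the aggregate splits evenly)."""
    return aggregate_gb_s / 2


if __name__ == "__main__":
    print(memory_per_processor_gb(MAX_MEMORY_TB, MAX_PROCESSORS))
    print(per_direction_bandwidth_gb_s(NUMALINK3_AGGREGATE_GB_S))
```

Under these assumptions, the maximum configuration averages 16 GB of memory per processor, and each NUMAlink 3 connection carries about 1.6GB/second in each direction.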
RWT: How much of the system infrastructure is shared between the Origin and Altix systems?
Jason: The entire system infrastructure is shared, with the exception of the processors and the memory controller ASIC. Unfortunately, Altix and Origin hardware cannot be part of the same NUMAlink-connected fabric because of differences in processor endianness, hardware discovery, and operating systems.
RWT: What is the maximum number of processors that are used in a single system image (SSI)?
Jason: Today, the supported SSI on Altix is 64 processors, which is a wonderful technical achievement for the open source community and our engineers, but as you can see from our proof-of-concept demonstration with NASA, we hope to work with the Linux community to achieve much more. In fact, over the last five months SGI has been working with a set of our users to beta test the reliability of a standard 128-processor SSI. This beta test has exceeded our expectations, and we will start supporting at least 128-processor scalability for all Altix users when we release SGI ProPack 2.4 in February.