8-Socket Commodity Servers: Flourish or Perish?

This is a repost of an article first published for the Exectweets campaign.

Most servers today are 2-4 socket systems (2-4S) – this is the sweet spot for both Intel and AMD. Very few vendors can design the custom chipsets (the glue) needed to scale Intel systems up to 8 sockets (8S). The most notable is IBM, which has developed four generations of scalable chipsets and is about to release a fifth. We’ve previously discussed the X3 chipset, which is largely similar to the current-generation X4. Unisys and NEC also have 8S systems, albeit with lower performance. In theory, there are 8S AMD systems as well, but scalability suffers because most of AMD’s processors have only 3 HyperTransport links.

The 8S+ market is small, reportedly ~1800 servers per quarter, or roughly 57K processors per year. One factor is that few vendors sell such systems and they are very expensive; another is that the x86 software stack was considered immature until recently. Many IT shops are more comfortable running Oracle or SAP on a UNIX server than on Linux/x86.
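For those who want to sanity-check the arithmetic, the processor figure follows from the server figure under a simple assumption of roughly eight sockets per system (my assumption, since the 8S+ category starts at eight sockets, not a reported number):

# Back-of-the-envelope check on the 8S+ market size
servers_per_quarter = 1800
quarters_per_year = 4
sockets_per_server = 8   # assumed average; 8S+ systems have at least 8 sockets

processors_per_year = servers_per_quarter * quarters_per_year * sockets_per_server
print(processors_per_year)  # 57600, i.e. roughly 57K processors per year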

The landscape is changing substantially though, with the release of Intel’s Nehalem-EX microprocessor, which has 4 QPI links and can scale to 8 sockets in commodity servers. Previously, only a handful of vendors offered proprietary 8 socket servers; Intel has announced >15 designs from 8 OEMs for 8+ socket servers. The obvious candidates are IBM, Unisys, NEC, HP, Dell, Oracle (nee Sun), Fujitsu and probably Cisco or Supermicro. This will create an open and competitive market for 8S and above servers, which should bring prices down substantially. The fact that so many OEMs are on board strongly suggests that customers are interested in 8S commodity servers, but until there are real sales numbers, the jury is out.

The primary barriers to adoption for large x86 servers are software, maturity, and cost/benefit. Scalable applications that would benefit from 8S servers are not common. Classic examples include I/O-heavy workloads such as ERP and transactional or analytic databases, as well as select HPC workloads that favor shared memory over message passing. More recently, server consolidation using virtualization has emerged as an important workload. In 2010, there are simply more scalable workloads than there were previously.

Historically, Linux and Windows were not considered sufficiently scalable or reliable for high-end servers, but both Microsoft and the open source community have put in a lot of effort to change both perceptions and reality. Windows is limited to 256 logical CPUs, while SGI has spent considerable effort getting Linux to scale to thousands of processors. Reliability is essential for most scalable workloads; it’s one thing to have a web server go down, and quite another to lose an order processing system. Fortunately, Linux and Windows have substantially matured and been accepted as key enterprise ingredients. For example, Oracle now uses Linux extensively for high-end databases.

The cost/benefit analysis is still one of the trickiest areas for large servers. The chief benefits are management and scalability. Managing a single, highly reliable server is undoubtedly easier than trying to handle 2 or 3 or even a dozen smaller systems and the associated networking equipment. While management suites from OEMs make this easier, they are also used to lock in customers, which is an expensive proposition.

From the scalability side, many key workloads cannot be parallelized across a cluster (or only at great expense) and require shared memory. Many workloads like ERP or OLTP can be limited by memory, storage or networking, and scaling up I/O, memory and CPUs together is the biggest benefit of a large server. Even for workloads that can be parallelized (e.g. analytic databases), the cost needs to be compared to that of a single larger server. Currently, if a database needs 2TB of memory to run efficiently, the only x86 solution is a clustered database; a single 8S Nehalem-EX server could handle a 2TB database by itself. The flip side of the scalability question is the impact of multiple cores; Nehalem-EX is an 8-core processor. How many workloads truly need 64, 128 or 256 cores?
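To give a rough sense of where those capacity numbers come from, here is a sketch of the math for a fully loaded 8S Nehalem-EX system; the DIMM count and DIMM size are assumptions for a typical configuration (16 slots per socket, 16GB DDR3 DIMMs), not a vendor specification:

# Rough capacity math for an 8-socket Nehalem-EX server
sockets = 8
cores_per_socket = 8    # Nehalem-EX is an 8-core processor
threads_per_core = 2    # with Hyper-Threading enabled
dimms_per_socket = 16   # assumed DIMM slots per socket
gb_per_dimm = 16        # assumed DDR3 DIMM capacity

print(sockets * cores_per_socket)                     # 64 cores
print(sockets * cores_per_socket * threads_per_core)  # 128 hardware threads
print(sockets * dimms_per_socket * gb_per_dimm)       # 2048GB, i.e. 2TB of memory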

With all these factors coming into play, it is hard to predict whether 8-socket servers will find a lasting place with most businesses. Certainly, the trend towards multiple cores tends to reduce the need for high socket count servers, leaving reliability, I/O and memory as the key motivators for large servers. At the same time, Intel will enable high-performance, commodity 8-socket x86 systems for the first time in history. The commodity aspect of 8-socket servers is key, as prices should finally approach a level that businesses are comfortable with. If prices are in line with 4-socket servers, rather than dramatically higher, then it may very well make sense to buy larger servers. Fundamentally, Nehalem-EX will be the first processor to bring substantial benefits to the table while reducing the cost side of the equation. This could be the catalyst needed for customers to start seriously evaluating and purchasing 8-socket servers.

Bringing out the crystal ball, my guess is that 8-socket servers will become much more common, possibly doubling or tripling the number shipped each quarter. They will continue to be much rarer than 4-socket servers, simply due to price and need, but it shouldn’t be hard to find a place for an 8-socket server in a modern enterprise.