History of Niagara
Two years ago at Hot Chips 16, Sun Microsystems disclosed Niagara, an innovative microprocessor and system design that represented a radical departure from traditional computer architectures. The roots of Niagara lie in Hydra, a research project under Professor Kunle Olukotun that explored chip multiprocessing in the late 1990s. The Hydra project, much like the DEC Piranha, targeted workloads that are rich in thread-level parallelism (TLP) but poor in instruction-level parallelism (ILP), such as network processing or commercial server workloads. Both groups proposed sacrificing single-threaded performance in order to maximize the number of cores on a single die. After the research project concluded, Olukotun started Afara Websystems to commercialize the results of the Hydra project in a SPARC-based implementation. Like many startups in the early part of this decade, Afara experienced cash flow difficulties, and it was acquired by Sun Microsystems in 2002 for an undisclosed sum.
After the acquisition, the Afara design underwent minor adjustments to plug a hole in Sun’s product portfolio and to target a 90nm Texas Instruments process. Niagara came to market under the UltraSPARC T1 moniker with much fanfare in late 2005. While each processor core in a Niagara system is rather unimpressive on its own, collectively the cores provide good performance for highly parallel workloads. Niagara-based servers are marketed under the CoolThreads name, and run at low power by virtue of their low clock speed (1.0-1.2GHz) and high degree of integration. Moreover, system design is easier because temperature and power vary only slightly across different workloads, thanks to the simplicity and high utilization of each core.
While Niagara is a novel and highly efficient server MPU, the microarchitecture and underlying philosophy explicitly give up general purpose use in exchange for high performance on specific workloads. Niagara focuses on what many consider entry-level applications: dynamic web serving (and encryption), mail, Java, or lightweight database applications. While these target workloads account for a large proportion of server unit shipments, they are being encroached upon (or are already dominated) by x86-based servers running Windows or Linux. However, for many customers, the benefits that Niagara brings to the table, such as low power consumption and the popular, reliable and robust Solaris 10 operating system, are convincing. Niagara-based systems are selling very well, and quite a few customers are first-time Sun buyers rather than users upgrading their aging SPARC systems.

Figure 1 – High Level Comparison of Niagara I and II
This year at Hot Chips 18, Greg Grohoski of Sun revealed Niagara II, the successor in Sun’s line of highly threaded processors. Niagara II is designed for TI’s 65nm process and uses 1831 pins: 711 for I/O and the remainder for power and ground. According to Ashlee Vance of The Register, Niagara II will debut at 1.4GHz, a modest increase over the first generation. Niagara II is philosophically similar to its predecessor; however, the designers concentrated on using the additional space to alter the trade-offs in the microarchitecture and go after broader markets. To some extent this is a tacit acknowledgement that Niagara’s designers faced some very difficult decisions and opted to remove (or at least postpone until the next generation) some features. Given that Niagara I is a 378mm² chip (which was 38mm² over target, even after a diet) and dominated by logic, it is very likely that a much larger die would have caused yield problems, and hence some computational resources were removed or omitted.
The design objectives for Niagara II were to double the throughput and enhance single-threaded performance while maintaining or reducing the thermal and power envelope. These improvements largely came from doubling the thread count, increasing per-core execution resources, and overhauling the general system structure and integration.