Intel’s High Performance 32nm
As is customary, Intel’s technology and manufacturing team returned to IEDM 2009 with an update on their 32nm high performance bulk process. Paper 28.4 elaborates on the previously disclosed Intel 32nm process and contains newer performance results that show the benefit of additional time for optimization, which is simply due to the gap between IEDM paper submission and the ramp into production. Additionally, there is further discussion of how Intel’s performance on 32nm was achieved and some of the trade-offs between SRAM cell size and minimum operating voltage.
One of the issues with scaling to smaller nodes is that the source and drain area also shrinks, which increases resistance and can degrade transistor performance. The paper disclosed that Intel’s 32nm NMOS transistors have a raised source and drain, which reduces resistance and enhances performance. Figure 1 shows the raised source and drain on 32nm NFETs and PFETs. Note that the PMOS transistors already had a raised source and drain due to the use of embedded silicon-germanium (SiGe), so the novel aspect is that this optimization has been applied to the NFETs.
Figure 1 – Raised Source and Drain for 32nm NFETs and PFETs
The Ion performance reported in this paper is roughly 5% higher than in the previous paper, with 1.62mA/um and 1.37mA/um for NMOS and PMOS respectively, at 1.0V and 100nA/um Ioff. The drive currents in the linear region (as opposed to the saturated region) measure performance when the transistor is not fully switched ‘on’, and they changed substantially compared to the slightly older 32nm paper. The reported Idlin for NMOS and PMOS are 0.231mA/um and 0.240mA/um, which is the first time that linear PMOS drive strength has exceeded NMOS. In comparison, the numbers reported in the previous 32nm paper were 0.228mA/um and 0.198mA/um, so the PMOS linear performance improved by an impressive 20% or so over one year, and by 34% compared to the 45nm process. While not as widely reported, the linear drive strength is critical to overall circuit timing and performance because many transistors used in modern chips operate in the linear (rather than saturated) region.
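As a quick check of the arithmetic behind those comparisons, the short Python sketch below recomputes the year-over-year Idlin gains from the four values quoted above. The dictionary layout and the `percent_gain` helper are illustrative names chosen here, not anything taken from the paper.

```python
# Linear-region drive currents (mA/um) as quoted in the text:
# "previous" is the earlier 32nm paper, "current" is IEDM 2009 paper 28.4.
idlin = {
    "NMOS": {"previous": 0.228, "current": 0.231},
    "PMOS": {"previous": 0.198, "current": 0.240},
}


def percent_gain(previous: float, current: float) -> float:
    """Relative improvement of `current` over `previous`, in percent."""
    return (current - previous) / previous * 100.0


# Hypothetical helper structure: one gain figure per device type.
gains = {
    device: percent_gain(values["previous"], values["current"])
    for device, values in idlin.items()
}

for device, gain in gains.items():
    print(f"{device}: Idlin improved by {gain:.1f}%")
```

The NMOS gain comes out a little over 1%, while the PMOS gain is about 21%, consistent with the roughly 20% improvement cited in the text.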
The last part of Intel’s 32nm paper dealt with variation and the impact on SRAM design and power efficiency. Variation is an unavoidable, but often painful, aspect of semiconductor design. As process geometry scales down, tiny variations in doping and even in drawn features have a relatively larger impact on Vt. When designing an SRAM array, the supply voltage is set to ensure correct operation in the statistical worst-case scenario; since a larger array contains more cells and is therefore more likely to include an extreme outlier, larger SRAM arrays must have higher voltages.
Figure 2 – Impact of Cell and Array Size on SRAM Minimum Vcc in 32nm
One common tactic with SRAMs is using larger cells, which both have higher performance and are less sensitive to variation. This is readily visible in most CPUs, where the cells in the L1 or L2 cache may be much larger than those in the L3 cache. The paper compared the minimum operating voltage for the three different 32nm SRAM cells used at Intel: 0.171um2, 0.199um2 and 0.256um2, which respectively required 0.7V, 0.85V and 0.95V to achieve correct operation. Additionally, they showed that a 91Mbit array requires 0.86V versus 0.79V for a 3.25Mbit array using the same 0.199um2 cell, highlighting the fact that SRAM arrays of substantially different sizes cannot be directly compared. Finally, they concluded by observing that Intel’s 4.2Mbit/mm2 SRAM array density (which accounts for SRAM cells, sense amps and control logic) is superior to all reported 28nm and 32nm processes.
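To illustrate why array size matters for minimum voltage, here is a rough Python sketch under a simplifying assumption that is not from the paper: per-cell variation is treated as Gaussian, and the "worst-case" cell is taken to sit at the point in the tail where, on average, one cell in the whole array lies beyond it. The function name `worst_case_sigma` and the use of binary megabits are assumptions made for the example.

```python
from statistics import NormalDist


def worst_case_sigma(num_cells: int) -> float:
    """Sigma multiple beyond which about one cell in `num_cells` is expected.

    Assumes a standard normal distribution of per-cell variation
    (an illustrative model, not the method used in the paper).
    """
    # inv_cdf(1/N) is the lower-tail point; negate it for the upper tail.
    return -NormalDist().inv_cdf(1.0 / num_cells)


MBIT = 2**20  # assumption: "Mbit" read as binary megabits

# Array sizes quoted in the text for the 0.199um2 cell.
arrays = {"3.25Mbit": int(3.25 * MBIT), "91Mbit": 91 * MBIT}

sigmas = {name: worst_case_sigma(cells) for name, cells in arrays.items()}

for name, sigma in sigmas.items():
    print(f"{name}: design must tolerate roughly {sigma:.2f} sigma")
```

Under this model the smaller array has to cover roughly 5.0 sigma of variation and the larger one roughly 5.6 sigma. The sketch does not translate sigma into volts, but it shows the direction of the effect: more cells push the worst case further into the tail, which is in line with the higher minimum voltage reported for the 91Mbit array.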