Scaling Through Design Technology Co-optimization
Roughly a decade ago, Intel kicked off the FinFET era with the 22nm process, and roughly five years later foundries like TSMC followed suit. In some ways, the Intel 4 process marks the end of this era: it is the penultimate Intel node to use FinFETs before the shift to gate-all-around transistors, which Intel calls RibbonFETs.
The Intel 4 node is a process focused on high performance and the company's first to adopt EUV. The primary target for Intel 4 is the compute tile in Meteor Lake, which features both large Redwood Cove cores that maximize per-core and per-thread performance and smaller, more energy-efficient Crestmont cores. The Intel 4 process will not be used to manufacture graphics and omits certain features as a result. In particular, Intel 4 only includes tall standard cell libraries that are optimized for high performance, and omits the shorter standard cell libraries that emphasize high density. Intel 4 is therefore most directly comparable to the tall standard cell libraries on the Intel 7 node that were employed for the Golden Cove and Gracemont cores in the Alder Lake processor family.
Table 1 compares various density metrics across several generations of Intel process technology. Note that for the 22nm and Intel 16 nodes, the first metal layer is bi-directional and therefore serves the same role as M0 and M1 in other nodes. Additionally, most process nodes offer two gate pitches. For Intel 7, the 60nm gate pitch is emphasized because that is the transistor used for the high-performance libraries in the tall standard cells. Overall, Intel claims a 2X improvement in density for the high-performance logic library compared with the prior generation Intel 7, which is in line with traditional Moore’s Law scaling. However, that density is accomplished through a combination of physical shrinking and other scaling enhancements such as design technology co-optimization (DTCO), as illustrated in Figure 1.
As with all logic process nodes, critical dimension scaling has slowed considerably. The fin pitch, which partially determines the library height in the Y-axis, has scaled by 0.883X and is the same pitch as the M0 layer. In the orthogonal direction, the contacted poly pitch (CPP, sometimes referred to as contacted gate pitch) has scaled by 0.93X or 0.83X depending on the baseline. The Intel 7 process included both a denser 54nm CPP and a higher drive current 60nm CPP, and the latter is used for the CPU cores in Alder Lake. Multiplying the fin pitch scaling by the CPP scaling against the 60nm baseline, the physical area scaling for the Intel 4 process alone is 0.735X, a far cry from the claimed 0.5X scaling.
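The arithmetic behind the physical scaling figure can be sketched in a few lines. This is a minimal illustration that uses only the rounded scaling factors quoted above; the function and variable names are hypothetical, and the result differs slightly from 0.735X because the inputs are already rounded.

```python
# Scaling factors quoted in the text (Intel 4 relative to Intel 7).
# These are rounded values, so the products below are approximate.
FIN_PITCH_SCALE = 0.883      # Y-axis (library height direction)
CPP_SCALE_VS_60NM = 0.83     # X-axis, against the high-performance 60nm CPP
CPP_SCALE_VS_54NM = 0.93     # X-axis, against the denser 54nm CPP


def area_scale(y_scale: float, x_scale: float) -> float:
    """Area scaling is the product of the two orthogonal pitch scalings."""
    return y_scale * x_scale


hp_scale = area_scale(FIN_PITCH_SCALE, CPP_SCALE_VS_60NM)
dense_scale = area_scale(FIN_PITCH_SCALE, CPP_SCALE_VS_54NM)

print(f"Area scaling vs 60nm CPP baseline: {hp_scale:.3f}X")
print(f"Area scaling vs 54nm CPP baseline: {dense_scale:.3f}X")
```

Against the 60nm baseline the product comes out at roughly 0.73X, in line with the 0.735X quoted above. Against the 54nm baseline it is closer to 0.82X, which shows how much the choice of baseline matters.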
To achieve the full 2X density, Intel adopted DTCO in addition to sheer physical scaling. As Figure 1 shows, the height of the standard cell decreased from 12 fin pitches in Intel 7 to 8 fin pitches, a scaling factor of 0.667X on top of the intrinsic shrinking of the fin pitch. This compaction comes from both fin depopulation and tighter layout rules. The Intel 7 library is constructed with four fin pitches available for each of the NFETs and PFETs, while the Intel 4 library relies on higher transistor performance (on a per-fin basis) to reduce the number of fin pitches from four to three. Additionally, the Intel 7 library separates the NFETs and PFETs by two fin pitches, while the Intel 4 library allows for tighter single-fin end-to-end (ETE) spacing between the transistors.
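The way the cell height reduction combines with the physical shrink can be sketched as follows. This is an illustration under stated assumptions: the `CellLibrary` class and its field names are hypothetical, the 0.735X physical area scaling is taken as quoted in the text, and the "other" fin pitches are simply whatever remains of the cell height after the active fins and the N-to-P separation, which the text does not break out.

```python
from dataclasses import dataclass

# Physical area scaling from pitch shrinks alone, as quoted in the text.
PHYSICAL_AREA_SCALE = 0.735


@dataclass(frozen=True)
class CellLibrary:
    """Fin-pitch budget of a tall standard cell (hypothetical model)."""
    name: str
    height_fin_pitches: int   # total cell height, in fin pitches
    fins_per_device: int      # fin pitches for NFETs (same count for PFETs)
    n_p_separation: int       # fin pitches between NFETs and PFETs

    @property
    def other_pitches(self) -> int:
        # Remainder of the cell height not attributed in the text.
        return (self.height_fin_pitches
                - 2 * self.fins_per_device
                - self.n_p_separation)


intel7 = CellLibrary("Intel 7", height_fin_pitches=12,
                     fins_per_device=4, n_p_separation=2)
intel4 = CellLibrary("Intel 4", height_fin_pitches=8,
                     fins_per_device=3, n_p_separation=1)

# DTCO contribution: fewer fin pitches per cell.
height_scale = intel4.height_fin_pitches / intel7.height_fin_pitches

# Total area scaling combines the pitch shrink with the shorter cell.
total_area_scale = PHYSICAL_AREA_SCALE * height_scale
density_gain = 1 / total_area_scale

for lib in (intel7, intel4):
    print(f"{lib.name}: {lib.height_fin_pitches} pitches, "
          f"{lib.other_pitches} not attributed in the text")
print(f"Cell height scaling: {height_scale:.3f}X")
print(f"Total area scaling:  {total_area_scale:.3f}X")
print(f"Density gain:        {density_gain:.2f}X")
```

Under these inputs the total area scaling is about 0.49X, or a density gain of slightly more than 2X, which matches the headline claim.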
Transistor performance and density improved with a second-generation contact-over-active-gate (COAG) flow, and undisclosed improvements in mobility and external resistance boosted drive strength. The reduction in external resistance likely stems from optimizing the contacts, a topic discussed later. As Figure 2 shows, the Intel 4 process achieves impressive drive current exceeding 2mA/µm for both NFETs and PFETs at 0.7V and 20nA/µm leakage. The higher drive current transistors in Intel 4 enable reducing the number of fins in the high-performance libraries, thereby improving effective density.
As Figure 3 illustrates, the Intel 4 process operating at 0.65V delivers 21.5% greater performance at constant power compared with Intel 7, or alternatively 40% lower power at constant frequency. This frequency versus power curve was measured for a licensed CPU core and is a good proxy for the overall performance or efficiency gain offered by the new process technology.
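The two figures describe different points on the frequency versus power curve, and one way to compare them is as performance-per-watt ratios. The sketch below is only an illustration: performance per watt is an assumed metric that the text does not use, and the names are hypothetical.

```python
# Gains quoted in the text for Intel 4 relative to Intel 7.
PERF_GAIN_AT_CONSTANT_POWER = 0.215      # 21.5% higher performance
POWER_SAVING_AT_CONSTANT_FREQ = 0.40     # 40% lower power


def perf_per_watt_ratio(perf_ratio: float, power_ratio: float) -> float:
    """Relative performance per watt, where 1.0 means no change."""
    return perf_ratio / power_ratio


# Same power budget, higher performance.
iso_power = perf_per_watt_ratio(1 + PERF_GAIN_AT_CONSTANT_POWER, 1.0)

# Same frequency, lower power.
iso_freq = perf_per_watt_ratio(1.0, 1 - POWER_SAVING_AT_CONSTANT_FREQ)

print(f"Perf/W at constant power:     {iso_power:.3f}X")
print(f"Perf/W at constant frequency: {iso_freq:.3f}X")
```

The efficiency gain looks larger at constant frequency (about 1.67X) than at constant power (about 1.22X), which is why the two numbers are better read as alternatives than as the same improvement stated twice.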
Intel 4 offers four threshold voltage (Vt) options (eight in total across NFETs and PFETs) that cover a wide range from roughly 237mV to 450mV and -237mV to -390mV, as shown in Figure 4. The ultra-low Vt option boosts the top frequency by 5% at higher operating voltages compared with a three-Vt palette. It is possible that Intel 3 will offer an additional Vt level for an even wider operating range and greater frequencies.