Not All Transistors are Equal
Another challenge to using transistor count or density as a metric is that it is ambiguous and potentially misleading. Typically, we think of transistors as the physical implementation of logical blocks and circuits. For computation, this could be anything from a block as large as a CPU core or floating-point unit to one as small as an inverter. For storage, this might be a cache, a register file, a content-addressable memory (CAM), or an SRAM bit-cell. For analog or I/O, this could be a PLL or an off-chip transmitter or receiver. The transistors that physically implement these blocks are referred to as active transistors (which are distinct from schematic transistors). In reality, though, not all transistors are created equal, and modern chips are manufactured with many transistors that are not active. The transistors formed during the manufacturing process are descriptively known as layout transistors. The layout transistors include the active transistors, as discussed above, but also dummy transistors and transistors used as decoupling capacitors.
Dummy transistors are inserted into a design to improve yield. For example, certain annealing and etching steps in the manufacturing process work best on a relatively uniform surface, and inserting extra transistors into empty areas improves the uniformity and therefore the yield. For many analog circuits, these extra transistors are necessary to achieve the desired performance. As another example, modern FinFET performance varies based on the stress on the transistor, which is a function of the other nearby transistors. Achieving the right performance may require placing transistors nearby to obtain the right stress.
While dummy transistors are commonly used, they are not particularly numerous. In contrast, decoupling capacitors built from MOSFETs (or decap cells) are used extensively. Generally speaking, the logic in modern chip designs never achieves 100% areal efficiency. For all the marvel of modern design tools, there is typically empty whitespace between individual logic cells (e.g., a NAND gate), between functional units (e.g., the L1D cache), and between entire IP blocks (e.g., a CPU core). Whitespace is a consequence of the tools struggling to follow design rules that ensure yield and frequency, use the available resources (such as routing layers), and piece together an electrical engineering jigsaw puzzle of logic cells, functional units, and blocks. Whitespace can account for 10-25% of a design. To ensure yield, the die must be relatively uniform and the whitespace cannot be truly empty. Many designs will fill the whitespace with decap cells to provide decoupling capacitance for power delivery and thereby improve operating frequency. In addition, some designs will place decaps within standard cell libraries. Decap transistors are the dominant source of non-active layout transistors, but hard data is difficult to obtain.
Our friends at TechInsights perform circuit-level analysis that includes the number of active and layout transistors for small portions of a die. They were kind enough to share some of these analyses for a handful of 7nm SoCs. The data is based on a small number of sample locations within each SoC, typically the GPU, which will have the greatest transistor density. They found that in the small sampled regions, active transistors were 70-80% of the total, and the remaining 20-30% of layout transistors were decap and dummy devices. These numbers are based on limited samples, because this analysis is fairly expensive and time-consuming. To confirm and elaborate, we gathered numbers on several modern designs and found that active transistors are commonly 63-66% of the layout transistors and that 33-37% of the layout transistors are decap cells. The TechInsights figures for decap and dummy transistors are probably low because the samples come primarily from the densest logic portions of an SoC and exclude the whitespace, which would contain more decap transistors.
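One way to see why sampling only dense logic understates the decap share is a rough blending model. The sketch below is illustrative only: it assumes, purely for simplicity, that whitespace is filled entirely with decap cells at the same transistor density as the sampled logic, which none of the data above establishes. The function name `die_wide_active_fraction` is hypothetical.

```python
def die_wide_active_fraction(sampled_active_fraction, whitespace_fraction):
    """Estimate the die-wide active share of layout transistors.

    Assumption (not from the measured data): whitespace is filled with
    decap cells at the same transistor density as the sampled logic.
    """
    if not 0.0 <= sampled_active_fraction <= 1.0:
        raise ValueError("sampled_active_fraction must be in [0, 1]")
    if not 0.0 <= whitespace_fraction < 1.0:
        raise ValueError("whitespace_fraction must be in [0, 1)")
    # Only the non-whitespace share of the die holds the sampled logic mix.
    logic_share = 1.0 - whitespace_fraction
    return sampled_active_fraction * logic_share


# Corner cases built from the ranges quoted in the text:
# 70-80% active in sampled regions, 10-25% whitespace.
low = die_wide_active_fraction(0.70, 0.25)
high = die_wide_active_fraction(0.80, 0.10)
print(f"Die-wide active share: {low:.1%} to {high:.1%}")
```

Under these assumptions the die-wide active share lands between roughly 52% and 72%, a range that contains the 63-66% figure gathered from whole designs. The model is crude, but it shows the direction of the bias.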
The data makes it abundantly clear that there is often a big difference between the number of active and layout transistors in a chip. Unfortunately, many companies do not specify which number they are quoting. The data on AMD and Nvidia processors from Table 2 are all from technical papers. Based on discussion with these two vendors, the numbers are active transistors as described in the last column. Based on some informal discussions, it appears that the HiSilicon Kirin 990 5G number may actually be layout transistors, which would help explain the discrepancy between these designs. It is unclear whether Apple’s A13 is implemented using 8.5 billion active transistors or layout transistors. The former would be an impressive achievement in density.
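To make the ambiguity concrete, the sketch below converts a quoted count into an implied active count under the two possible readings. It uses the 8.5 billion figure for the A13 and the active-share ranges quoted earlier (63-66% and 70-80%). The helper name `implied_active_range` is hypothetical, and the result is a back-of-the-envelope bound rather than a measurement.

```python
def implied_active_range(quoted_count, active_share_low, active_share_high):
    """Return (low, high) active counts if quoted_count is a layout count."""
    if not 0.0 < active_share_low <= active_share_high <= 1.0:
        raise ValueError("shares must satisfy 0 < low <= high <= 1")
    return quoted_count * active_share_low, quoted_count * active_share_high


QUOTED = 8.5e9  # Apple A13 transistor count as quoted in the text

# Widest span across both data sets in the text: 63% up to 80% active.
low, high = implied_active_range(QUOTED, 0.63, 0.80)
print(f"If 8.5B is a layout count: {low / 1e9:.2f}B to {high / 1e9:.2f}B active")
print(f"If 8.5B is an active count: {QUOTED / 1e9:.2f}B active")
```

If the quoted number is a layout count, the implied active count is roughly 5.4 to 6.8 billion, so the two readings differ by well over a billion transistors.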
It doesn’t seem reasonable to count these dummy and decap transistors in the same way as active transistors. Active transistors implement the functions and features that customers value, whether it is CPU cores, power gating to improve idle power, neural network accelerators, or cache. In contrast, decap and dummy transistors are overhead that doesn’t directly add value, and in some cases they are worse than more sophisticated technologies. For example, IBM’s deep trench capacitors are far superior to MOSFET-based decoupling capacitors; they enable dense eDRAM and reduce system cost. Similarly, Intel’s FIVR boosts platform efficiency. It relies on MIM capacitors, virtually eliminates the need for decoupling capacitors in the package and board, and likely reduces on-die decap transistors as well. In both cases, reducing the number of decap transistors is a benefit. The central point of Moore’s Law is creating value for customers by using additional active transistors productively, and decap and dummy transistors don’t really contribute to that goal.