Moore’s Law has always been about scaling transistor density and count with advances in semiconductor process technology. As discussed in an earlier article from IEDM 2005, this is often conflated with higher performance and lower power consumption. Historically, the main driver for lower power consumption was reducing the operating voltage. The dynamic switching power of a chip is proportional to the square of the supply voltage, so a small 10% voltage reduction translates into 19% lower active power at the same frequency.
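The 19% figure follows directly from the quadratic relationship. A minimal sketch of the arithmetic, assuming the standard dynamic power relation P ∝ C·V²·f with capacitance held constant (the function name and parameters are illustrative, not taken from any Intel material):

```python
def dynamic_power_ratio(voltage_scale, frequency_scale=1.0):
    """Relative dynamic power after scaling supply voltage and clock frequency.

    Assumes P is proportional to C * V^2 * f with capacitance C unchanged.
    """
    return voltage_scale ** 2 * frequency_scale


# A 10% voltage reduction at constant frequency.
ratio = dynamic_power_ratio(0.9)
print(f"relative power: {ratio:.2f}")       # prints "relative power: 0.81"
print(f"power saved:    {1 - ratio:.0%}")   # prints "power saved:    19%"
```

If the frequency also has to drop along with the voltage, the savings compound: `dynamic_power_ratio(0.9, 0.9)` gives roughly 0.73, or about 27% lower power.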
For nearly 30 years the supply voltage scaled down, reaching 1V around 2005. At that point, voltage scaling slowed nearly to a standstill. The bandgap for silicon is about 1eV, meaning that it takes around 1V for silicon to switch between acting as an insulator and a conductor (where electrons move). Modern transistors use a variety of materials beyond silicon, but if the supply voltage is too far below 1V, transistors perform poorly and the vulnerability to errors increases.
Voltage has modestly decreased since 2005 and stands at around 0.8-0.9V, with some mobile chips operating as low as 0.65V (e.g. Sandy Bridge and Ivy Bridge). This was accomplished using careful design techniques that address the many problems of low voltage operation. One key challenge that was highlighted at ISSCC for 22nm and beyond is variability. Modern chips incorporate billions of transistors, and purely statistical effects dictate that some are slower or faster than others. Yet semiconductor manufacturers must produce large volumes of chips that function correctly over many years and a wide range of conditions. Dynamic operating conditions are a problem as well and become worse at low voltage. Large spikes or drops in current (known as dI/dt) can easily cause errors, similar in nature to brown-outs that hit overloaded power grids. Similarly, bit flips caused by alpha particles can corrupt data stored in circuits and large memory arrays.
The point where a transistor begins to turn ‘on’ and starts conducting a little current is called the threshold voltage, which is around 0.2-0.3V in modern process technology (although transistors do not reach the full saturation current until around 1V). The essence of Intel’s NTV research is the observation that the most energy efficient supply voltage is just slightly higher than the threshold. A supply voltage that is substantially higher than the threshold is a convenience, but one that increases frequency at the cost of power. The big challenge for NTV is managing variability while ensuring correct functionality and high yields.
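To see why an energy minimum can sit just above the threshold, consider a toy model. It is a rough sketch, not Intel's methodology: it assumes switching energy scales as V², gate delay follows the textbook alpha-power law V / (V − Vt)^α, and leakage energy per operation is leakage power multiplied by that delay. The constants `V_T`, `ALPHA`, and `LEAK` are invented for illustration, and the model ignores sub-threshold operation entirely.

```python
# Toy model only: every constant below is an illustrative assumption,
# not a measurement from Intel's NTV papers.
V_T = 0.3      # assumed threshold voltage in volts (the article cites 0.2-0.3V)
ALPHA = 1.5    # assumed alpha-power-law exponent
LEAK = 0.05    # assumed leakage weight relative to switching energy


def delay(v):
    """Relative gate delay from the alpha-power law; only valid for v > V_T."""
    return v / (v - V_T) ** ALPHA


def energy_per_op(v):
    """Relative energy per operation: switching term plus leakage term."""
    switching = v ** 2               # C * V^2 with C normalized to 1
    leakage = LEAK * v * delay(v)    # leakage power (~V) times time per operation
    return switching + leakage


# Scan supply voltages from 0.31V to 1.00V in 10mV steps.
candidates = [V_T + 0.01 * i for i in range(1, 71)]
v_opt = min(candidates, key=energy_per_op)

print(f"minimum-energy supply in this toy model: {v_opt:.2f}V")
print(f"energy at 1.00V vs. minimum: {energy_per_op(1.0) / energy_per_op(v_opt):.1f}x")
print(f"delay at minimum vs. 1.00V:  {delay(v_opt) / delay(1.0):.1f}x")
```

With these made-up constants the minimum lands a little above the threshold: lowering the voltage further makes the circuit so slow that leakage dominates, while raising it pays the quadratic switching cost. The last line shows the price, which is a much longer delay and therefore a lower frequency.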
The goal of NTV techniques is to enable extremely low supply voltages by using circuits that are extremely robust: tolerant of variability and resilient against errors. While this sounds quite challenging, it is eminently feasible. Consider the problem of dI/dt, where a sudden demand for current (e.g. a floating point unit that wakes up and starts executing) causes voltage to drop across the chip. Chips using NTV tend to use less power, meaning that the current consumed is much smaller. Moreover, NTV circuits at low voltage will run at lower frequencies, meaning that the supply has more time to stabilize before errors start to occur.
Modern chips are largely composed of digital logic and memory, along with portions of analog and I/O circuitry. At a high level, CPUs and GPUs resemble large islands of memory for storing data (caches) and logic for computation (cores). Even logic blocks such as a floating point unit contain memory elements, such as register files and latches that hold intermediate data values. But logic is largely dominated by combinational circuits and wiring, rather than data storage elements.
Generally, memory is much more difficult to scale to low voltage than logic. At lower operating voltages, structures like SRAM cells are less stable and become very vulnerable to errors when reading or writing data. In contrast, the impact of low voltage on logic tends to be an increase in delay, which reduces frequency. While lower performance is undesirable, it is also far more tolerable than data corruption. To illustrate, mobile and server SoCs often have separate voltages for the CPU cores and the large caches. For example, Intel’s Medfield operates the L2 cache at 1.05V, while the CPU core can dip down to 0.7V.
Intel’s research on NTV relies on novel circuit techniques that enhance the robustness of both logic and memory. The results presented at ISSCC come from several papers. The first describes an entire 32nm Pentium core implemented using NTV. The second paper focuses on NTV techniques in a 22nm SIMD permute unit. A third paper on a 32nm variable precision floating point unit does not use NTV, but helps in understanding potential applications. Based on our analysis of these papers, Near-Threshold Voltage computing techniques are most applicable to highly parallel workloads. Generally, NTV is an ideal fit for HPC workloads and works very well for graphics, but it is not well suited to general purpose CPUs.