ISSCC 2005 Coverage: Day 1

Forum 3 – High Performance Embedded DRAM and Structures for Test

John Barth – IBM Systems & Technology Group

For embedded purposes, SRAM is the de facto choice for discerning designers. Embedded SRAM provides the fastest cycle times while operating well in a semiconductor logic process. However, one bit of SRAM storage typically requires 6 transistors, whereas a DRAM cell only needs 1 transistor plus one capacitor. Hence, the common argument in favour of eDRAM is that of the 4x density advantage relative to eSRAM.

Without dismissing this point, the presenter approached the problem from another perspective. He conceded that eSRAM provides the fastest random access cycle times, but argued that eDRAM can come close, and that the remaining performance gap can be mitigated through architectural choices if they are considered early enough in the design cycle.

The speaker went on to argue that most high-end designs are built more around the memory hierarchy than around the logic circuits themselves. Further, he gave an example of where eDRAM may be superior to eSRAM in a conventional logic design. The floorplan for the Itanium2 9M processor was displayed, as can be seen in Figure 2. The furthest L3 subarray was estimated to be 23mm away from the cache controller in Intel’s layout. The floorplan for a hypothetical Itanium2 9M which used eDRAM for the L3 cache array was then shown (Figure 3). In this floorplan, the furthest subarray would be only about 14mm away from the cache controller. Delay approximations were made for the hypothesized array, and the results can be found in Table 1 below.

Figure 2 – Comparison of Madison 9M implementing L3 with eSRAM (432mm²) and eDRAM (245mm²), not to scale

        L3 Tag     L3 Cache    Wire Delay   Total
SRAM    5 cycles   5 cycles    10 cycles    20 cycles
eDRAM   5 cycles   10 cycles   6 cycles     21 cycles
Table 1 – Approximation of L3 cache delays in real and hypothetical Itanium 2 Microprocessors

Thus, while the actual eDRAM cells are slower than the corresponding eSRAM cells, the increased density of eDRAM leads to shorter wires in the L3 cache array. The worst-case wire length fell from 23mm to 14mm (about 39%), and the estimated wire delay fell from 10 to 6 cycles, which nearly offsets the slower array access and leaves the total only one cycle longer. It should be noted that the speaker emphasized that he took certain liberties when deriving these figures.
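The arithmetic behind Table 1 can be checked with a few lines of Python. This is only a sketch that re-adds the speaker's approximations; the dictionary layout and function names are illustrative and not from the presentation.

```python
# Per-stage L3 latencies in cycles, copied from Table 1.
L3_DELAYS = {
    "SRAM":  {"tag": 5, "cache": 5,  "wire": 10},
    "eDRAM": {"tag": 5, "cache": 10, "wire": 6},
}

def total_cycles(stages):
    """Sum the tag, array and wire latencies for one implementation."""
    return sum(stages.values())

def percent_reduction(before, after):
    """Relative reduction from 'before' to 'after', in percent."""
    return 100.0 * (before - after) / before

totals = {name: total_cycles(stages) for name, stages in L3_DELAYS.items()}

# Worst-case wire length (mm) and wire delay (cycles), SRAM vs. eDRAM.
wire_length_cut = percent_reduction(23, 14)
wire_delay_cut = percent_reduction(10, 6)

print(totals)
print(f"wire length: -{wire_length_cut:.1f}%, wire delay: -{wire_delay_cut:.1f}%")
```

The totals differ by a single cycle, which is the point of the example: the shorter wires recover almost all of the latency lost in the slower array.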

During the question session, it was asked what additional costs were involved with fabricating chips using eDRAM. It was stated that the eDRAM process adds 3 extra mask stages before any of the other logic process steps, and that the typical cost adder is on the order of 20%. Thus, there is a cross-over point between the additional cost of eDRAM processing and the increased density of eDRAM. Presently, this cross-over tends to exist around the 8-16Mb mark.
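The cross-over argument can be sketched with a deliberately simple cost model. Everything here beyond the 20% cost adder and the 4x density figure quoted earlier is an assumption: die cost is taken as proportional to area, and the logic area and eSRAM area per Mb are hypothetical placeholders, so the resulting cross-over size says nothing about any real process.

```python
def die_cost(logic_mm2, mem_mbit, mm2_per_mbit, cost_per_mm2=1.0):
    """Die cost under the assumption that cost scales linearly with area."""
    return (logic_mm2 + mem_mbit * mm2_per_mbit) * cost_per_mm2

def crossover_mbit(logic_mm2, sram_mm2_per_mbit,
                   density_gain=4.0, cost_adder=0.20):
    """Memory size (Mb) at which eDRAM and eSRAM dies cost the same.

    Solves  logic + M*a_sram = (1 + adder) * (logic + M*a_edram)  for M.
    """
    edram_mm2_per_mbit = sram_mm2_per_mbit / density_gain
    k = 1.0 + cost_adder
    denom = sram_mm2_per_mbit - k * edram_mm2_per_mbit
    if denom <= 0:
        raise ValueError("eDRAM never becomes cheaper with these parameters")
    return cost_adder * logic_mm2 / denom

# Hypothetical inputs: 50 mm^2 of logic, 1.5 mm^2 per Mb of eSRAM.
LOGIC_MM2 = 50.0
SRAM_MM2_PER_MBIT = 1.5
m_cross = crossover_mbit(LOGIC_MM2, SRAM_MM2_PER_MBIT)

sram_cost = die_cost(LOGIC_MM2, m_cross, SRAM_MM2_PER_MBIT)
edram_cost = die_cost(LOGIC_MM2, m_cross, SRAM_MM2_PER_MBIT / 4.0,
                      cost_per_mm2=1.2)
print(f"cross-over at about {m_cross:.1f} Mb")
```

Below the cross-over the process adder dominates and eSRAM is cheaper; above it the area saved by the denser array outweighs the adder. A larger logic area pushes the cross-over toward larger memories, because the adder is paid on the whole die.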

Forum 4 – DRAM Design for Mobile Applications

G. Alexander – Infineon Technologies

As mentioned in the High Speed DRAM I/O presentation, the standard DRAM market focuses on the primary goal of price/performance. This ratio acts as a major constraint on array architecture and packaging. That being said, the industry has shown acceptance of custom DRAM technology for specialized applications. This acceptance has manifested itself in the success of custom DRAM for graphics processors (e.g. GDDR3). The speaker posited that the mobile market is sufficiently large and price-insensitive to support a customized DRAM architecture.

In the case of graphics DRAM, the architecture emphasizes bandwidth. Mobile applications instead need low power, compact packaging, and a very wide operating temperature range. Taking each point in turn: low power is a key enabler for battery-operated devices. Additionally, the continuing trend of miniaturization for hand-held devices makes every square millimeter of board space valuable. Hence, BGA and TSOP packaging should be ruled out in favour of chip-scale packages. Lastly, while PCs and servers tend to operate indoors, DRAMs for portable devices must be able to withstand a wider temperature range.

Two major architectural choices were posed as potential solutions for a mobile DRAM: a modified SDRAM, or a pseudo-SRAM (PSRAM). The key feature of PSRAM (a DRAM with an SRAM interface) is a refresh cycle which is transparent to the system designer. The speaker then went on to discuss circuit features which can be used to reduce power in SDRAM and PSRAM arrays, such as low-current self-refresh circuits, a deep power-down mode, turning the DLLs off, and reducing the voltage levels of input signals.

Forum 5 – Sub-1V DRAM Design

Takayuki Kawahara – Central Research Laboratory, Hitachi Ltd.

This presentation discussed several issues that a circuit designer should keep in mind when designing DRAMs in the sub-1V regime. First, it was noted that while certain parts of a memory array can and should scale to sub-1V supplies in the near future, other parts shouldn’t be scaled down so rapidly. Examples of low-supply blocks would be the array decoders and periphery, whereas boosted-supply circuits include the line drivers and the memory array itself. Because multiple supply voltages are required, it was recommended that DRAM designers study charge pumps as a method to provide the voltages required in different parts of the memory circuit.
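To give a feel for how a charge pump produces a boosted rail from a sub-1V supply, here is a minimal sketch using the simplified textbook model of a Dickson-type pump with parasitic capacitance ignored. The presentation did not specify a pump topology, so the topology, the function name and all component values below are assumptions for illustration only.

```python
def dickson_output_voltage(v_in, v_clk, stages, i_load=0.0,
                           freq_hz=1.0, c_stage=1.0, v_drop=0.0):
    """Steady-state output of an idealized Dickson charge pump.

    Each stage adds the clock swing, minus the switch/diode drop and the
    droop caused by delivering i_load from c_stage once per clock period.
    Parasitic (stray) capacitance is ignored in this simplified model.
    """
    per_stage_gain = v_clk - v_drop - i_load / (freq_hz * c_stage)
    return v_in + stages * per_stage_gain - v_drop

# Hypothetical case: 0.9 V supply, 2 stages, ideal switches, no load.
v_unloaded = dickson_output_voltage(v_in=0.9, v_clk=0.9, stages=2)

# Same pump delivering 10 uA at 50 MHz from 1 pF stage capacitors.
v_loaded = dickson_output_voltage(v_in=0.9, v_clk=0.9, stages=2,
                                  i_load=10e-6, freq_hz=50e6, c_stage=1e-12)
print(f"unloaded: {v_unloaded:.2f} V, loaded: {v_loaded:.2f} V")
```

The loaded case shows why pump design matters for sub-1V parts: with only 0.9 V of clock swing per stage, load droop and switch drops consume a large share of the available gain.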

Next, the spectre of signal integrity was addressed. The speaker discussed how one of the best methods to maintain a high signal-to-noise ratio is to merely keep the signal voltages high. While four fundamental causes of signal voltage deterioration were identified (voltage drop, leakage current, array noise, and the sense amps), corresponding solutions and work-arounds are known for each source of voltage degradation.

Lastly, it was recommended that DRAM designers learn from SRAM and logic designs. It was shown that thin-BOX (buried oxide) FD-SOI and double-gate transistor structures can be of great benefit to DRAM designs as well. One benefit of FD-SOI is that V_T variations are roughly 1/5 those of a bulk silicon process. Double-gate structures allow for fine-tuning of drain-to-source currents via adaptive gate/back-gate biasing techniques. These candidate structures show significant promise for future DRAM implementations.

Forum 6 – DRAM in the Nanoscale Era

Tomoyuki Ishii – Central Research Laboratory, Hitachi Ltd.

There are intrinsic problems with scaling a traditional DRAM, a 1-transistor/1-capacitor (1T-1C) device, to the sub-100nm regime. The two fundamental problems go hand-in-hand; the first issue is maintaining at least ~20fF of capacitance per cell, and the second issue is minimizing leakage current through the pass-transistor in each cell. These problems must be solved, or refresh cycles will consume too much of the total timing budget in future DRAMs.
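The link between cell capacitance, leakage and refresh overhead can be illustrated with a first-order estimate. This sketch assumes a constant leakage current and a fixed tolerable voltage droop; apart from the ~20fF figure, every number below (droop, leakage, row count, row cycle time) is a hypothetical placeholder rather than data from the talk.

```python
def retention_time_s(cell_cap_f, tolerable_droop_v, leakage_a):
    """Time for a constant leakage current to discharge the cell by the
    tolerable droop: t = C * dV / I."""
    return cell_cap_f * tolerable_droop_v / leakage_a

def refresh_busy_fraction(rows, row_cycle_s, retention_s):
    """Fraction of time a bank spends refreshing if every row must be
    refreshed once per retention period."""
    return rows * row_cycle_s / retention_s

# Hypothetical cell: 20 fF, 0.2 V tolerable droop, 50 fA leakage.
t_ret = retention_time_s(20e-15, 0.2, 50e-15)

# Hypothetical bank: 8192 rows, 60 ns row cycle.
busy = refresh_busy_fraction(8192, 60e-9, t_ret)

# Same bank if scaling halves the cell capacitance to 10 fF.
t_ret_half = retention_time_s(10e-15, 0.2, 50e-15)
busy_half = refresh_busy_fraction(8192, 60e-9, t_ret_half)

print(f"retention {t_ret * 1e3:.0f} ms, refresh overhead {busy:.2%}")
print(f"with half the capacitance: {busy_half:.2%}")
```

In this model the overhead is inversely proportional to capacitance and directly proportional to leakage, which is why the two problems go hand-in-hand.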

Rather than discuss all the potential ways of circumventing these problems, this presentation acted as a survey of memory technologies which may replace conventional DRAM in certain applications in the future. Perhaps the most interesting technology is magnetic RAM (MRAM). The storage device for MRAM is the magnetic tunnel junction. This junction is composed of a thin insulating layer which sits between two ferromagnetic electrodes (one soft magnet and one hard magnet). Depending on the magnetic field applied to the MRAM device, the magnetizations in the two magnets will become either parallel or anti-parallel. These two binary states have clearly different resistances, so the effective device resistance can be sensed with ease. Given the non-volatility of MRAM, the short read and write cycles, and low operating power, it is a favoured technology for the future.
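The read operation described above can be sketched as a comparison between the cell current and a reference current. This is a toy model, not a description of any real MRAM sense amplifier: the resistance value, the resistance ratio, the read voltage, and the choice of which magnetic state encodes 0 or 1 are all assumptions made for illustration.

```python
def mtj_resistance(parallel, r_parallel_ohm=2000.0, tmr=0.4):
    """Resistance of a magnetic tunnel junction in this toy model.

    The parallel state has the lower resistance; the anti-parallel state
    is higher by the (hypothetical) magnetoresistance ratio 'tmr'.
    """
    if parallel:
        return r_parallel_ohm
    return r_parallel_ohm * (1.0 + tmr)

def read_bit(cell_resistance_ohm, v_read=0.1, r_parallel_ohm=2000.0, tmr=0.4):
    """Sense a stored bit by comparing cell current with a reference.

    The reference sits midway between the currents of the two states.
    Assumed encoding: parallel (high current) = 0, anti-parallel = 1.
    """
    i_cell = v_read / cell_resistance_ohm
    i_parallel = v_read / r_parallel_ohm
    i_antiparallel = v_read / (r_parallel_ohm * (1.0 + tmr))
    i_reference = 0.5 * (i_parallel + i_antiparallel)
    return 0 if i_cell > i_reference else 1

stored_zero = read_bit(mtj_resistance(parallel=True))
stored_one = read_bit(mtj_resistance(parallel=False))
print(stored_zero, stored_one)
```

Because the read only measures resistance and does not disturb the magnetization, the stored value survives both the read and a loss of power, which is the non-volatility the speaker highlighted.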
