ARM Stacks the DEC against Competitors
In early 1995, Advanced RISC Machines, Apple Computer, and Digital Equipment Corporation (DEC) announced that they would develop a family of ultra-high-performance, low-power processors for PDAs based on the ARM architecture. DEC was able to bring to bear many of its experienced MPU designers, who had helped create multiple generations of microVAX and Alpha processors. A year later the StrongARM SA-110 appeared and set a new standard for embedded control processors.
Once described as a processor successfully hiding in an SRAM, the 0.35 µm SA-110 packed 2.5 million transistors into a 50 mm² die. Only 115,000 of them, under 5% of the total, were in the CPU itself; most of the rest went into separate 16 KB, 32-way set-associative instruction and data caches, the MMUs, and a 128-byte write buffer. The floorplan of the SA-110 is shown in Figure 3.
Figure 3 Floorplan of StrongARM SA-110
The SA-110 could operate at up to 200 MHz while typically consuming less than a watt of power. Very rarely does a new MPU dominate its competitors in both high performance and low power in the fashion of the StrongARM. This technical dominance can be seen in the performance versus power graph in Figure 4.
Figure 4 StrongARM Greets The Competition, 1996
The StrongARM SA-110 was a great leap ahead in both clock rate and performance for several reasons. It was manufactured in the same high-performance 0.35 µm process as the Alpha EV56 and EV6. The SA-110 also extended the execution pipeline from the three stages used in earlier ARM implementations to five stages. It was implemented mostly with static circuits such as complementary gates and differential logic. Pseudostatic circuit techniques using weak feedback or self-timed circuits were employed in certain high fan-in critical paths to achieve high performance. The SA-110 designers also benefited from DEC's proprietary IC CAD tools, created for Alpha processor development, as well as many of the accompanying design methodologies.
Low power dissipation was achieved by minimizing the CV²f (switched capacitance × supply voltage squared × frequency) component of power through reduced supply voltage, extensive use of clock gating, replacement of transparent latches by edge-triggered flip-flops, and careful logic design. For example, in the ARM instruction set, one operand to an ALU operation can be shifted by an arbitrary amount, but in most cases this option isn't used. The StrongARM datapath included logic to check for a zero shift amount, in which case the shifter is disabled and bypassed to reduce power. Power due to transistor leakage was minimized by making transistor gates 12% longer than the minimum length in slower sections of the processor. This reduced leakage current by up to 50%.
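The two dynamic-power ideas above can be sketched in a few lines of Python. This is only an illustrative model, not DEC's design: the function names, the capacitance and voltage figures, and the activity counters are all hypothetical, and only a logical left shift is modeled, whereas the real ARM shifter supports several shift types. The first part shows why lowering the supply voltage pays off quadratically in the CV²f term; the second mimics a datapath that skips the shifter whenever the shift amount is zero.

```python
def dynamic_power(c_eff, vdd, freq, activity=1.0):
    """Switching power in watts: activity * C * V^2 * f.

    c_eff is the switched capacitance in farads, vdd the supply voltage
    in volts, freq the clock in hertz. activity (0..1) is a hypothetical
    stand-in for the effect of clock gating.
    """
    return activity * c_eff * vdd ** 2 * freq


def relative_dynamic_power(v_new, v_old):
    """Ratio of switching power after a supply-voltage change,
    with capacitance and frequency held constant."""
    return (v_new / v_old) ** 2


def shifter_operand(value, shift_amount, stats):
    """Return the 32-bit shifted operand, bypassing the shifter when
    shift_amount is zero. stats counts how often each path is taken."""
    if shift_amount == 0:
        stats["bypassed"] += 1          # shifter stays idle
        return value & 0xFFFFFFFF
    stats["shifted"] += 1               # shifter is exercised
    return (value << shift_amount) & 0xFFFFFFFF


if __name__ == "__main__":
    # Assumed example: halving the supply voltage from 3.0 V to 1.5 V.
    print(relative_dynamic_power(1.5, 3.0))

    # Assumed instruction mix: most ALU operands use no shift.
    stats = {"bypassed": 0, "shifted": 0}
    for amount in [0, 0, 0, 4, 0, 0, 2, 0]:
        shifter_operand(0x1234, amount, stats)
    print(stats)
```

In this toy model, halving the supply voltage cuts switching power to a quarter at a fixed clock rate, and the counters show how rarely the shifter needs to be active when most operands are unshifted.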