The Silicon Factor
Why does Elbrus claim such superlative performance and physical attributes for their ‘paper’ chip? Is it a cynical and calculated effort to stand out from the crowd to attract funding? A simpler and less presumptuous explanation is the ‘silicon factor’. Dr. Babaian and his team at Elbrus honed their craft designing the architecture and implementation of Cold War-era supercomputers. The E2k itself is an outgrowth of the work on the Elbrus-3 supercomputer, an intriguing monster of a processor consuming 15 million transistors subdivided among about 3000 separate logic chips.
There is little doubt that Dr. Babaian and his engineers have made many noteworthy contributions to innovation in computer architecture, as demonstrated by their growing list of patents assigned to Elbrus International Ltd. In all likelihood there has been a strong Russian influence on many new Western computer architecture developments since the early 1980s, including HP’s work on EPIC leading to IA-64, Transmeta’s software and hardware scheme for execution of x86 binaries on a specialized VLIW processor, and Sun’s MAJC processor with its ‘space time computing’ concept for speculative execution. But the Elbrus team’s direct experience with monolithic processor implementations in deep submicron CMOS process technologies is virtually non-existent.
The engineering techniques and methodologies involved in designing complex systems in a multi-million transistor CMOS integrated circuit are vastly different from designing complex systems using thousands of discrete chips on multiple circuit boards. In board-based design, logic gates are considered expensive and part counts must be minimized, while even irregular wiring is relatively cheap. In chip design the exact opposite is true. Most large scale ICs, like MPUs, are limited in size and complexity by the very difficult problem of arranging transistors so as to organize thousands of global nets into manageable routing channels. Chip designers prize physical regularity, low fan-in/fan-out, and locality of dataflow, and when necessary to get them they will often duplicate logic and functionality in a manner that might seem profligate to a board designer. Routing irregularities and design practices that are easily handled in board level design can create horrendous headaches in silicon.
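The gates-versus-wires tradeoff above can be sketched with a toy cost model. All of the unit costs and distances below are illustrative numbers chosen for the sake of the example, not measured data:

```python
# Toy cost model for the tradeoff described above: one shared logic
# block driving long global wires to N consumers, versus a private
# duplicated copy placed next to each consumer. Unit costs are
# purely illustrative.

def total_cost(n_consumers, gate_cost, wire_cost,
               dist_shared=10, dist_local=1):
    """Return (shared_cost, duplicated_cost) in arbitrary units."""
    # One block, but each consumer needs a long global net to reach it.
    shared = gate_cost + n_consumers * dist_shared * wire_cost
    # N blocks, but each consumer's wiring is short and local.
    duplicated = n_consumers * (gate_cost + dist_local * wire_cost)
    return shared, duplicated

# Board-like regime: gates (part count) expensive, wiring cheap.
print(total_cost(8, gate_cost=20, wire_cost=0.1))  # sharing wins

# Chip-like regime: gates cheap, global wiring expensive.
print(total_cost(8, gate_cost=2, wire_cost=5))     # duplication wins
```

With board-like costs the single shared block is cheaper, while with chip-like costs the seemingly profligate duplicated copies win overall, which is the inversion the paragraph describes.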
Specific features in computer instruction set architectures can affect their ease of implementation as monolithic integrated circuits. Computer architects who aren’t intimately aware of how their decisions map to silicon and the true costs of different functional components often make poor choices that negatively affect the performance, cost, and competitiveness of their designs when put into the form of a chip. For example, the use of a unified register file has architectural appeal because it allows the compiler complete flexibility in allocating the available registers between integer, address, and FP computations. In some circumstances it is conceivable that this could either save unnecessary register spills to memory, or permit a greater degree of software unrolling of the code within a computationally intensive loop. To an architect this provides a potential opportunity to reduce the path count (i.e. the number of instructions executed to complete a program) and increase the opportunity for instruction level parallelism (ILP). Unfortunately the use of a unified register file puts tremendous upward pressure on the number of read and write ports that have to be physically implemented in silicon. Another nice property of split integer and FP register files that goes away with unified registers is that the separate files can be located closer to their respective integer and FP execution units and thus minimize bus wire lengths.
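The port-pressure argument can be made concrete with a back-of-the-envelope calculation. The sketch below assumes each operation reads two operands and writes one result, and uses the common first-order approximation that register file area grows with entries × width × (ports)²; the issue widths and file sizes are hypothetical numbers, not E2k specifics:

```python
# Back-of-the-envelope model of register file port pressure.
# Assumptions (illustrative, not taken from any real design):
#   - each op needs 2 read ports and 1 write port,
#   - area scales roughly as entries * width * ports^2, a common
#     first-order model for multiported SRAM cells.

def rf_area(entries, width_bits, read_ports, write_ports):
    """Relative area of a register file (arbitrary units)."""
    ports = read_ports + write_ports
    return entries * width_bits * ports ** 2

# Hypothetical wide-issue machine: 4 integer + 2 FP ops per cycle.
int_issue, fp_issue = 4, 2

# Unified file: one large file must feed every execution unit.
unified = rf_area(entries=128, width_bits=64,
                  read_ports=2 * (int_issue + fp_issue),
                  write_ports=int_issue + fp_issue)

# Split files: each smaller file only serves its own units.
split = (rf_area(64, 64, 2 * int_issue, int_issue) +
         rf_area(64, 64, 2 * fp_issue, fp_issue))

print(f"unified : {unified}")
print(f"split   : {split}")
print(f"ratio   : {unified / split:.2f}x")
```

Even though the unified file holds the same total number of registers, the quadratic dependence on port count makes it several times larger in this model, before accounting for the longer wires to physically distant execution units.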