The third unusual aspect of the i860 was the most intriguing, but also the most difficult to use: the pipelined floating point and load and store instructions. The i860 FPU supported both conventional scalar mode floating point instructions and special pipelined instructions, which directly exposed the pipeline latencies within the i860’s floating point hardware to instruction scheduling. For example, the pipelined floating-point multiply instruction, pfmul.p src1, src2, dest, would compute the product of the two operands found in the floating point registers denoted by src1 and src2. But the value stored in the floating point register denoted by dest would be the result of the 2nd or 3rd earlier pfmul.p instruction executed, depending on the precision. The product of the current src1 and src2 values would in turn be deposited in the dest register specified by the 2nd or 3rd pfmul.p instruction executed in the future. This pipelining scheme is shown in Figure 2 for pipelined floating-point add instructions, which have a latency of 3 clock cycles. Notice that 7 pipelined floating point add instructions are used to perform 4 additions; the last three add instructions are used only to flush results out of the FP pipeline.
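The delayed-destination behavior described above can be easier to follow in a small simulation. The Python sketch below is a toy model, not i860 code and not cycle-accurate: the class name `PipelinedAdder`, the register names, and the input values are all hypothetical, and it assumes a register `f0` that reads as zero and discards writes, so that it can serve as a dummy operand and destination. It issues 7 pipelined adds to complete 4 additions with a 3-deep pipeline.

```python
from collections import deque

PIPELINE_DEPTH = 3  # latency of the pipelined add, as stated in the text


class PipelinedAdder:
    """Toy model of a pipelined FP adder (hypothetical, not cycle-accurate)."""

    def __init__(self, registers):
        self.regs = dict(registers)
        # Assumption for this sketch: f0 reads as zero and ignores writes.
        self.regs["f0"] = 0.0
        # Sums still inside the pipeline, oldest first; None means "empty stage".
        self.in_flight = deque([None] * PIPELINE_DEPTH)

    def pfadd(self, src1, src2, dest):
        # Read both sources before any write happens.
        new_sum = self.regs[src1] + self.regs[src2]
        # The oldest in-flight sum leaves the pipeline and lands in THIS
        # instruction's dest, not in the dest of the instruction that issued it.
        retired = self.in_flight.popleft()
        self.in_flight.append(new_sum)
        # Real hardware would write stale pipeline contents here; the model
        # simply skips the write when the stage was empty.
        if dest != "f0" and retired is not None:
            self.regs[dest] = retired


# Seven instructions for four additions: the last three only flush the pipeline.
PROGRAM = [
    ("a1", "b1", "f0"),  # issue a1+b1, nothing useful to retire yet
    ("a2", "b2", "f0"),  # issue a2+b2
    ("a3", "b3", "f0"),  # issue a3+b3
    ("a4", "b4", "r1"),  # issue a4+b4, r1 receives a1+b1
    ("f0", "f0", "r2"),  # flush: r2 receives a2+b2
    ("f0", "f0", "r3"),  # flush: r3 receives a3+b3
    ("f0", "f0", "r4"),  # flush: r4 receives a4+b4
]


def run_four_adds():
    inputs = {"a1": 1.0, "a2": 2.0, "a3": 3.0, "a4": 4.0,
              "b1": 10.0, "b2": 20.0, "b3": 30.0, "b4": 40.0}
    fpu = PipelinedAdder(inputs)
    for src1, src2, dest in PROGRAM:
        fpu.pfadd(src1, src2, dest)
    return [fpu.regs[r] for r in ("r1", "r2", "r3", "r4")]


print(run_four_adds())  # [11.0, 22.0, 33.0, 44.0]
```

The point to notice is that each result is named by an instruction issued three steps after the one that supplied its operands. That bookkeeping is the burden the pipelined instructions placed on the programmer or compiler.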
The pipelined floating point instructions permitted carefully written code to hide the latency of the floating point pipelines and achieve higher performance than the conventional instructions found in competing RISC processors. However, support for pipelined instructions was non-existent in i860 compilers for years after the processor was introduced, and its potential performance could only be exploited using hand-written assembly language routines that were time-consuming and error-prone to create and difficult to debug.
In the end the Intel i860 turned out to be a major flop in its intended market. The new performance enhancement features of the i860, such as dual instruction mode and pipelined floating point instructions, proved very difficult for compilers to exploit and were largely irrelevant for the initial market push. The million-transistor i860 in its fastest speed grade, 40 MHz, turned in Linpack MFLOP/s and SPEC89 performance similar to that of the approximately one-hundred-thousand-transistor MIPS R3000A processor at 33 MHz. Of the several system vendors reported to be readying i860-based systems, only Oki Electric Industry Co Ltd. actually released a workstation family, the Oki 7300 series. It had poor market acceptance due to average performance and lack of applications, and Oki quickly exited the workstation market.
Microsoft also initially used the i860 as a target for its portable Windows NT operating system. In fact, some people claim that the letters “NT”, widely understood as an acronym for “new technology”, actually originated from the code name the i860 carried while it was under development, N10 (i.e. N Ten). Reportedly, Microsoft programmers hated the i860 because it was awkward to program and debug due to its sparse instruction set and clumsy exception processing. Early in the development of Windows NT the i860 was dropped and the MIPS processor family was adopted instead as the secondary target architecture to complement the primary x86 CISC platform.
The i860 did ultimately have some measure of success in high-end graphics cards and floating point digital signal processing boards. Intel itself incorporated the i860 in its Paragon series of massively parallel supercomputers, although the company eventually exited this market. Intel also developed a second version of the i860, with 2.55 million transistors in a 0.8 µm three-level-metal CMOS process, that could run as fast as 50 MHz. The second version of the i860 was named the i860XP, while the original became known as the i860XR.