By: Paul DeMone (pdemone.delete@this.igs.net), July 10, 2006 3:26 pm
Room: Moderated Discussions
Tzvetan Mikov (tmikov@gmail.com) on 7/10/06 wrote:
---------------------------
>The table on page 2 is very interesting for me. The Pentium Pro had fewer transistors
>than the 21164, slower clock rate and yet it had a competitive specfp and higher specint (!!).
>
>What is the accepted explanation for this? I figure it is the OoO, which obviously favors integer code.
At the same process feature size (0.35 um), the in-order
21164 easily outperformed the PPro on integer benchmarks:
EV56:
433 MHz - 13.3 SPECint95
500 MHz - 15.4 SPECint95
600 MHz - 19.3 SPECint95
PPro/512:
166 MHz - 7.28 SPECint95
200 MHz - 8.71 SPECint95
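For anyone who wants to check the gap, here is a minimal sketch that only re-does arithmetic on the scores listed above. The per-MHz figure and the helper names (`per_mhz`, `top_ratio`) are my own illustration and not a standard SPEC metric.

```python
# SPECint95 scores quoted above, keyed by clock speed in MHz.
ev56 = {433: 13.3, 500: 15.4, 600: 19.3}
ppro = {166: 7.28, 200: 8.71}


def per_mhz(scores):
    """Return SPECint95 per MHz for each clock speed (illustrative metric)."""
    return {mhz: score / mhz for mhz, score in scores.items()}


def top_ratio(a, b):
    """Ratio of the best score in a to the best score in b."""
    return max(a.values()) / max(b.values())


if __name__ == "__main__":
    print(f"Fastest EV56 vs fastest PPro: {top_ratio(ev56, ppro):.2f}x")
    for name, scores in (("EV56", ev56), ("PPro/512", ppro)):
        for mhz, ratio in per_mhz(scores).items():
            print(f"{name} {mhz} MHz: {ratio:.4f} SPECint95 per MHz")
```

Dividing by clock rate separates how much of the absolute lead comes from frequency and how much from work done per cycle.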
>
>However: Why did an OoO implementation of the x86 ISA need fewer transistors than
>an in-order implementation of Alpha? Am I misreading the numbers (e.g. most of
>the transistors in the Alpha could have been in the L3 cache, etc)?
The 21164A had 9.8m transistors, of which cache consumed
over 7m.
The PPro needed 5.5m transistors, of which cache (L1 only
on chip) consumed about 1m.
That leaves <3m logic transistors in the EV5 core vs >4m
logic transistors in the P6 core.
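The subtraction behind those two bounds, as a small sketch. It uses only the rounded figures above; since the Alpha cache count is "over 7m" and the PPro cache count is "about 1m", the results are rough bounds rather than exact numbers. The function name is just for illustration.

```python
def logic_transistors(total_m, cache_m):
    """Logic transistors (millions) = total minus on-chip cache transistors."""
    return total_m - cache_m


# Alpha: cache is "over 7m", so this is an upper bound on the logic count.
ev5_logic_max = logic_transistors(9.8, 7.0)

# PPro: cache is "about 1m" (L1 only on chip), so this is approximate.
p6_logic_approx = logic_transistors(5.5, 1.0)

if __name__ == "__main__":
    print(f"EV5 core logic: under {ev5_logic_max:.1f}m transistors")
    print(f"P6 core logic: roughly {p6_logic_approx:.1f}m transistors")
```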