By: someone (someone.delete@this.somewhere.com), November 23, 2010 11:42 am
Room: Moderated Discussions
David Kanter (dkanter@realworldtech.com) on 11/23/10 wrote:
---------------------------
>
>I don't believe that OOOE is inherently less power efficient than InO. OOOE lets
>you overlap more cache misses, which can reduce the amount of time the process is stalled.
>
>DK
You are talking about one narrow aspect of architectural
performance, not the full performance vs design cost
trade-off or computational power efficiency.
The OOOE vs in-order (IOE) issue is highly workload
dependent. For a given issue width and frequency, the
performance gain from OOOE over in-order ranges from as
high as 30-50% for branchy scalar code with highly
unstructured memory accesses down to less than 5% for
code dominated by predictable control flow and memory
accesses. Those figures are for implementations of
non-EPIC ISAs.
Published research into the benefit of OOOE for EPIC ISAs
is limited and tends to focus on novel simplified dynamic
scheduling schemes that aren't full classic OOOE.
Of course OOOE is not free. It adds complexity and power
consumption (dynamic and static). The power/area cost is
20 to 40% depending on the issue width and degree of
OOOE aggressiveness (window size, degree of speculation, etc.).
Given a fixed amount of resources (silicon area, Watts),
an in-order implementation can devote more transistors
and Watts to other CPU functionality, more cache and/or
higher clock frequency.
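
To put rough numbers on that trade-off, here is a back-of-envelope sketch
using the ranges quoted above. It assumes (my simplification, not a measured
result) that the 20-40% cost applies to whole-core power and that the in-order
baseline does not reinvest the savings in cache or clock. The function name
and scenario labels are made up for illustration.

```python
def perf_per_watt_ratio(perf_gain: float, power_cost: float) -> float:
    """Perf/W of an OOOE core relative to an in-order baseline.

    perf_gain  -- fractional speedup from OOOE (0.30 means +30%)
    power_cost -- fractional extra power for OOOE (0.20 means +20%)
    A result above 1.0 means OOOE is more power efficient; below 1.0, less.
    """
    return (1.0 + perf_gain) / (1.0 + power_cost)


# Corner cases built from the ranges in the post (labels are hypothetical).
scenarios = {
    "branchy code, cheap OOOE (+50% perf, +20% power)": (0.50, 0.20),
    "branchy code, costly OOOE (+30% perf, +40% power)": (0.30, 0.40),
    "regular code, cheap OOOE (+5% perf, +20% power)": (0.05, 0.20),
    "regular code, costly OOOE (+5% perf, +40% power)": (0.05, 0.40),
}

for label, (gain, cost) in scenarios.items():
    print(f"{label}: {perf_per_watt_ratio(gain, cost):.2f}x")
```

Under those assumptions only the best case (1.25x) comes out ahead on
perf/W. The costly-OOOE branchy case lands at about 0.93x, and the regular
code cases at roughly 0.88x and 0.75x. That is before the in-order design
spends the saved area and power on anything else.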
Is OOOE worth the cost for general purpose MPUs (i.e.
those intended for a wide range of applications)? The
answer is yes for most high performance implementations
of CISC and RISC ISAs, although the appearance of modern
in-order processors like Atom and Power6 suggests the
question isn't nearly as settled as some like to claim.
What about OOOE for implementations of EPIC ISAs? A
comparison of McKinley vs EV6 vs Power4 suggests that
what EPIC brings to the table, combined with the extra
CPU resources that not going OOOE frees up, makes the
question a lot more debatable than with non-EPIC ISAs.
My guess is Fort Collins looked carefully at OOOE but
stayed with an in-order design for Poulson to maximize
performance within its die size and power budget.