OOO hw vs SW&in-order hw

By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), September 29, 2009 6:44 am
Room: Moderated Discussions
Anders Jensen (@.) on 9/29/09 wrote:
>
>Putting instructions in parallel however is probably
>cheaper in software and extremely expensive in hardware,
>sure you will miss out on the last 10% of optimization
>or so, but that is the healthy sign you should look for
>in complex optimization.

You're overcomplicating the thing.

You don't even need any complex parallel decoding (which
really is quite complex for something like x86): OoO can
be done (and has been done) on a much simpler scale.

For example, you might be much better off with a single
instruction per cycle decoder tied into an OoO core, than
with a two-instruction pairable "UV pipe" like Intel has
in the Pentium (and Atom?) that has some fairly strict
pairability rules.

Of course, I wouldn't actually expect Intel to ever do
anything like that. A more likely situation is the kind
of traditional Intel decoder, which does one complex
instruction per cycle, but can do several if they
are simple.
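
To make that decoder shape concrete, here is a rough sketch of
how decode cycles could be counted. The rule it uses is an
assumption for illustration, not a description of any real
Intel part: up to `width` instructions decode per cycle in
program order, and a complex instruction can only take the
first slot of a cycle. The function name and the width of 3
are hypothetical.

```python
def decode_cycles(kinds, width=3):
    """Count decode cycles for a sequence of "simple"/"complex" instructions.

    Assumed toy rule: each cycle decodes up to `width` instructions in
    program order, and only slot 0 can handle a complex instruction.
    """
    cycles = 0
    i = 0
    while i < len(kinds):
        cycles += 1
        slots_used = 0
        while i < len(kinds) and slots_used < width:
            if kinds[i] == "complex" and slots_used > 0:
                break  # complex instruction waits for slot 0 of the next cycle
            i += 1
            slots_used += 1
    return cycles


print(decode_cycles(["simple"] * 6))                               # 2
print(decode_cycles(["complex"] * 3))                              # 3
print(decode_cycles(["complex", "simple", "simple", "complex"]))   # 2
```

Simple code streams through at full width, while a run of
complex instructions falls back to one per cycle.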

And whatever you do at the front-end, you certainly don't
need to do any complex and "extremely expensive"
parallelism anywhere else. Tomasulo is neither very complex
nor extremely expensive.
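
As a rough illustration of how little is needed, here is a toy
timing model that compares a blocking in-order core with a
Tomasulo-style core that has a single reservation station per
unit. Everything in it is an assumption made for the sketch:
one instruction issued per cycle, non-pipelined units, perfect
register renaming (only true dependencies count), and made-up
latencies. The names `run`, `true_deps` and `PROG` are
hypothetical.

```python
def true_deps(prog):
    """For each instruction, list the indices of earlier producers of its sources."""
    last_writer = {}
    deps = []
    for idx, (unit, dest, srcs, latency) in enumerate(prog):
        deps.append([last_writer[s] for s in srcs if s in last_writer])
        last_writer[dest] = idx
    return deps


def run(prog, out_of_order):
    """Return the cycle at which the last result is produced."""
    deps = true_deps(prog)
    done = []          # completion cycle of each instruction
    busy_until = {}    # unit (and its single station) is occupied until this cycle
    next_issue = 0     # earliest cycle the next instruction may issue
    for (unit, dest, srcs, latency), d in zip(prog, deps):
        operands_ready = max((done[i] for i in d), default=0)
        unit_free = busy_until.get(unit, 0)
        if out_of_order:
            # Issue as soon as the station is free; wait for operands there.
            issue = max(next_issue, unit_free)
            start = max(issue, operands_ready)
        else:
            # Stall issue (and everything behind it) until operands are ready.
            issue = start = max(next_issue, unit_free, operands_ready)
        done.append(start + latency)
        busy_until[unit] = start + latency
        next_issue = issue + 1
    return max(done, default=0)


# (unit, destination, sources, latency): a slow load, a dependent add,
# and an independent multiply.
PROG = [
    ("mem", "r1", [], 10),
    ("alu", "r2", ["r1"], 1),
    ("mul", "r3", [], 3),
]

print(run(PROG, out_of_order=False))  # 14
print(run(PROG, out_of_order=True))   # 11
```

In the in-order run the multiply sits behind the stalled add.
With one station per unit the add waits in its station and
the multiply goes ahead, which is the whole point.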

>Still I'm not sure it is worth doing Tomasulo just to get
>this last part. Doing multipass pipelining will probably
>get you some gains that runahead leaves behind in this
>respect, but it will definitively cost you some
>throughput/watt.

I doubt that a superscalar highly pipelined approach is
all that much simpler than Tomasulo with just a
reservation station per unit. I also think your "corner
cases" are oddly chosen:

>Complex problems never has simple solutions and I have yet
>to find a complex question where the best solution is a
>corner case. For CPU uarch typical corner cases would be
>in-order and Out-of-order.

That's just total bullsh*t.

The corner cases aren't "in-order" vs "out-of-order" at all.

There are a lot of details you are skipping, and the
complexity of in-order easily overlaps with the complexity
of out-of-order when you start looking at those details.

What you trivially just call in-order is a whole spectrum
of possible complexities, with ranges of pipeline depth
(and the inevitable forwarding) and super-scalar. Add to
that blocking vs nonblocking cache accesses etc etc.

Similarly, what you just dismiss as the "corner case" of
being OoO is not a corner case at all, but another whole
spectrum of implementations, ranging from some fairly
simple Tomasulo with just one reservation station per unit
to having some rather extreme instruction window depths of
tens (or hundreds) of instructions etc.

And then you have the whole "speculative execution" which
you can do in both cases, and the question is just how
far you push it (in OoO you can push it much further - but
you don't have to).

So while I agree with you that the solution is never the
extremes ("corner cases") I fundamentally think you then
totally went off the deep end by saying that "in-order" vs
"out-of-order" are some kind of corner cases. You can't
make that kind of insane simplification.

And the thing is, in-order hits a huge complexity and
performance wall. I can't recall anybody ever having done
more than two-way superscalar, even on things that are
much easier to decode than x86. There's just not enough of
an upside to the complexity (you'd do VLIW to avoid the
complexity, but that has its own downsides, both in future
designs and in I$ costs).

I also suspect that things like SMT (which Atom supports)
are a whole lot more natural in an OoO environment. The
pipeline just isn't as rigid. So you can't just compare
some unnamed in-order implementation with some other OoO
one and say that the in-order one is simpler - you have to
state what the performance requirements are.

Is a single-scalar in-order CPU without HT much simpler than
even the simplest OoO core? Oh, sure. Nobody will claim
otherwise. But try to make it perform better, and you'll
start seeing huge complexities - to the point where you
simply have to either say "no more performance", or you'd
create a monster that is much more complex than the
equivalently performing out-of-order implementation.

See? Nobody makes those insane in-order ones, because at
some point it just becomes much simpler to do out-of-order,
and get better performance much more naturally than by
trying to push the in-order thing.

Sure, you can push in-order. You can try to push it with
run-ahead threads, you can do SMT, you can do a lot of
those things. But in the end, at some point you'll either
find that your chip is more complex than a simple OoO
implementation would have been at equivalent performance,
or you'll just say "I'll stop here, and suck".

Calling out-of-order some "corner case" is ludicrous.

Linus