Separate the OoO-ness from speculative-ness

Article: PhysX87: Software Deficiency
By: ? (0xe2.0x9a.0x9b.delete@this.gmail.com), July 21, 2010 7:05 am
Room: Moderated Discussions
anon (anon@anon.com) on 7/21/10 wrote:
---------------------------
>? (0xe2.0x9a.0x9b@gmail.com) on 7/21/10 wrote:
>---------------------------
>>slacker (s@lack.er) on 7/20/10 wrote:
>>---------------------------
>>The term "branch prediction" seems too constrained for me in this context. To generalize,
>>it is about the assumptions a pipelined CPU makes about the address of the next
>>instruction to be executed. Those assumptions can go wrong in pathological cases.
>>And these assumptions are present there even if the code contains *no* branch instructions
>>at all. Technically, a CPU does not need any branch instructions in order to be
>>a universal Turing machine, it only needs instructions for writing to memory from
>>which the CPU reads the code. A jump instruction is in fact a highly specialized memory write instruction.
>>
>>It is pure speculation for an x86 CPU to think that "if the address of the current
>>(non-branch) memory write instruction is ADDR, then the address of the next instruction
>>will be ADDR+1".
>
>I think I can see what you are getting at, but you are wrong too. The address of
>the next instruction definitely will be ADDR+1, but its contents may change.

Yes, I meant the contents of address ADDR+1. The sentence should have read: pipelining makes the assumption that "if ADDR contains a memory write instruction I0 and ADDR+1 contains the value I1, then the instruction executed after I0 will be I1".
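
To make that assumption concrete, here is a minimal sketch of a toy machine in Python. Everything in it (the three-instruction "ISA", the `run` function, the `pipelined` flag) is made up for illustration and does not model any real x86 pipeline. With `pipelined=True` the machine fetches the instruction at ADDR+1 before it executes the one at ADDR, so a store that overwrites ADDR+1 goes unnoticed.

```python
def run(program, pipelined):
    """Run a toy program whose code and data share one memory.

    Hypothetical instructions:
      ("add", n)             add n to the accumulator
      ("store", addr, value) write value (itself an instruction) to mem[addr]
      ("halt",)              stop
    """
    mem = list(program)
    acc = 0
    pc = 0
    current = mem[pc]                  # fetch the first instruction
    while current[0] != "halt":
        if pipelined:
            nxt = mem[pc + 1]          # fetch ADDR+1 BEFORE executing ADDR
        if current[0] == "add":
            acc += current[1]
        elif current[0] == "store":
            _, addr, value = current
            mem[addr] = value
        if not pipelined:
            nxt = mem[pc + 1]          # fetch ADDR+1 AFTER executing ADDR
        pc += 1
        current = nxt
    return acc


program = [
    ("store", 1, ("add", 100)),  # I0 at ADDR=0 overwrites the contents of ADDR+1
    ("add", 1),                  # I1, the value at ADDR+1 before I0 executes
    ("halt",),
]

print(run(program, pipelined=False))  # 100: the rewritten instruction is executed
print(run(program, pipelined=True))   # 1: the stale copy of I1 is executed
```

The two runs disagree only because the program violates the assumption; for code that never writes to the next instruction, the early fetch is harmless.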

>But you're wrong about this as an example of how pipelining requires speculation
>because CPUs disallow such sequences without cache and pipeline draining instructions.

There are two ways to design a pipelined version of a CPU:

1. Maintain full compatibility between the pipelined and non-pipelined versions, even for all the unlikely corner-cases such as intense self-modifying code. The two CPUs give mathematically identical results in all possible cases.

2. Do not maintain full compatibility. One example of this approach is the series of transitions from the 8086 to the 286 to the 386 to the 486, and so on.

I was mainly talking about case (1).
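
As a rough sketch of what case (1) costs, the toy model below compares a non-pipelined reference machine with a pipelined one that checks every store against the address it has already prefetched, and refetches on a match. The instruction format and the names `run_reference` and `run_pipelined_snooping` are invented for this example; real hardware is far more involved.

```python
def run_reference(program):
    """Non-pipelined machine: fetch, execute, then fetch the next one."""
    mem = list(program)
    acc = 0
    pc = 0
    while mem[pc][0] != "halt":
        op = mem[pc]
        if op[0] == "add":
            acc += op[1]
        elif op[0] == "store":
            mem[op[1]] = op[2]
        pc += 1
    return acc


def run_pipelined_snooping(program):
    """Pipelined machine that prefetches ADDR+1, but compares every store
    address with the prefetched address and refetches on a match."""
    mem = list(program)
    acc = 0
    pc = 0
    flushes = 0
    current = mem[pc]
    while current[0] != "halt":
        prefetched = mem[pc + 1]           # the speculative early fetch
        if current[0] == "add":
            acc += current[1]
        elif current[0] == "store":
            mem[current[1]] = current[2]
            if current[1] == pc + 1:       # the store hit the prefetched slot
                prefetched = mem[pc + 1]   # discard the stale copy and refetch
                flushes += 1
        pc += 1
        current = prefetched
    return acc, flushes


program = [
    ("store", 1, ("add", 100)),  # overwrites the very next instruction
    ("add", 1),
    ("halt",),
]

print(run_reference(program))           # 100
print(run_pipelined_snooping(program))  # (100, 1): same result, one flush
```

The results match in the corner case, but only because the machine first guessed and then checked and repaired the guess, which is the point being argued here.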

From a historical perspective, highly pipelined x86 CPUs appeared after the non-pipelined ones. The idea you are advocating contradicts this, because the instruction set of the older CPUs was designed (in the past) with no notion of pipelining. Of course, now (in the present) that we can see "the bigger picture", anyone designing a non-pipelined CPU would be wise to add some pipeline-management instructions to the instruction set, just in case a pipelined version follows. But for old CPU designs and old code, you cannot travel back in time to the 1970s.

>Exceptions are another one, but again it could have imprecise exceptions and not require speculative execution.

Well, but you cannot go back to 1970 and fix the 8086 so that it knows about the concept of imprecise exceptions.

>> Writes to registers are OK from this point of view, since the CPU
>>never fetches an instruction from there. In CPUs which are able to do data speculation,
>>even writing a register might cause partial pipeline stalls. (I don't know why I
>>am writing this here, because it seems obvious.)
>>
>>If you think pipelining in a universal-computation CPU has nothing to do with speculation,
>>you are simply wrong.
>
>On the contrary, I think your assertion that pipelining requires speculative execution is wrong.

I am not saying that. I am saying this: pipelining, the way it is implemented in current CPUs (e.g., Core2), involves certain assumptions about what patterns will appear when the code is actually executed. Those assumptions are made without any prior attempt to check whether the code actually matches those patterns. The conversation between the CPU and the code goes like this:

Code: Hi there, CPU, my friend. I want you to execute me.

CPU: No problem. Hand me the first couple of your instructions.

Code: Only a couple of them? You mean you don't want to see all of them?

CPU: That's right.

Code: But but, then, you aren't going to know what I am going to do.

CPU: Don't worry. I will manage.

Code: How?

CPU: Look pal, I know your kind. You will tell me to execute a bunch of moderately long sequences of adjacent instructions. Plus some branches here and there.

Code: Maybe, but what if you are wrong about me? Your assumptions seem to me like some kind of speculation about my nature.

CPU: Don't be so shy and hand me the initial couple of instructions!

Code: What if I don't ...

CPU: You don't have a choice ...

>> On the other hand, non-speculative pipelining *is* possible,
>>but only if the CPU is able to mathematically prove that a particular piece of code
>>is never violating any assumptions made by the pipelined architecture. But how many
>>existing CPUs are able to do such proofs?
>
>Anecdotal evidence does not make your claim correct.
>
>>Similarly, L1/L2 caches without any traces of speculative-ness whatsoever are also
>>possible - provided the CPU is able to actually prove that the memory access patterns
>>in a particular piece of code are fully known in advance. But how many existing
>>CPUs are able to do such proofs? (Considering the design of the x86 ISA, I cannot
>>say I blame them for this inability.)
>
>Caches have nothing to do with speculative execution that you were talking about.

What are you saying? That if a contemporary CPU (Core2 or whatever) decides to allocate a cache line for data at address 0x1230, it is not speculating about future uses of that piece of data?

(Note: I am *not* against caches)
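
A small sketch of what I mean by a cache line being a bet on future use. This is a toy direct-mapped cache written in Python; the sizes (4 lines of 16 bytes) and the function name `hit_rate` are arbitrary assumptions for the example, not the parameters of any real CPU.

```python
def hit_rate(addresses, num_lines=4, line_size=16):
    """Toy direct-mapped cache: on every miss, allocate the line.

    Allocation is the bet: it only pays off if the same block is
    touched again before another block evicts it.
    """
    lines = [None] * num_lines          # block number held by each line
    hits = 0
    for addr in addresses:
        block = addr // line_size
        index = block % num_lines
        if lines[index] == block:
            hits += 1
        else:
            lines[index] = block        # allocate on miss
    return hits / len(addresses)


# 16 consecutive bytes starting at 0x1230: one block, touched repeatedly.
friendly = [0x1230 + i for i in range(16)]

# 16 accesses 64 bytes apart: every access is a new block in the same line.
hostile = [0x1230 + 64 * i for i in range(16)]

print(hit_rate(friendly))  # 0.9375
print(hit_rate(hostile))   # 0.0
```

In the second pattern every allocation is wasted, yet the cache has no way of knowing that when it decides to allocate the line.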