Article: PhysX87: Software Deficiency
By: ? (0xe2.0x9a.0x9b.delete@this.gmail.com), July 21, 2010 7:05 am
Room: Moderated Discussions
anon (anon@anon.com) on 7/21/10 wrote:
---------------------------
>? (0xe2.0x9a.0x9b@gmail.com) on 7/21/10 wrote:
>---------------------------
>>slacker (s@lack.er) on 7/20/10 wrote:
>>---------------------------
>>The term "branch prediction" seems too constrained for me in this context. To generalize,
>>it is about the assumptions a pipelined CPU makes about the address of the next
>>instruction to be executed. Those assumptions can go wrong in pathological cases.
>>And these assumptions are present there even if the code contains *no* branch instructions
>>at all. Technically, a CPU does not need any branch instructions in order to be
>>a universal Turing machine, it only needs instructions for writing to memory from
>>which the CPU reads the code. A jump instruction is in fact a highly specialized memory write instruction.
>>
>>It is pure speculation for an x86 CPU to think that "if the address of the current
>>(non-branch) memory write instruction is ADDR, then the address of the next instruction
>>will be ADDR+1".
>
>I think I can see what you are getting at, but you are wrong too. The address of
>the next instruction definitely will be ADDR+1, but its contents may change.
Yes, I meant the contents of address ADDR+1. What I should have written is that pipelining makes the assumption: "If ADDR contains a memory write instruction I0 and ADDR+1 contains the value I1, then the instruction executed after I0 will be I1".
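To make that assumption concrete, here is a minimal sketch in C of a toy two-slot "pipeline" versus a strictly sequential interpreter. Everything in it (the three-opcode ISA, the PATCH/ADD/HALT names, the run_sequential/run_prefetching functions) is invented for illustration; it is not a model of any real x86 part, only of the stale-fetch problem I am describing:

/* Toy model of the assumption "the instruction executed after ADDR is
 * whatever ADDR+1 contained at fetch time".  The ISA and all names here
 * are made up for this example.
 *   HALT          - stop
 *   ADD   k       - acc += k
 *   PATCH a, v    - mem[a].op0 = v   (a write that may hit code)
 */
#include <stdio.h>

enum { HALT, ADD, PATCH };
struct insn { int op, op0, op1; };

static struct insn mem[4];

static void load_program(void) {
    mem[0] = (struct insn){ PATCH, 1, 100 };  /* rewrite operand of the next insn */
    mem[1] = (struct insn){ ADD,   5,   0 };  /* should become ADD 100            */
    mem[2] = (struct insn){ HALT,  0,   0 };
}

/* Non-pipelined model: each instruction is fetched only after the
 * previous one has fully executed, so the PATCH is always visible. */
static int run_sequential(void) {
    int acc = 0, pc = 0;
    for (;;) {
        struct insn i = mem[pc++];            /* fetch */
        if (i.op == HALT)  return acc;
        if (i.op == ADD)   acc += i.op0;
        if (i.op == PATCH) mem[i.op0].op0 = i.op1;
    }
}

/* Two-stage "pipelined" model: the next instruction is fetched while the
 * current one executes, i.e. before any write it performs is visible. */
static int run_prefetching(void) {
    int acc = 0, pc = 0;
    struct insn cur = mem[pc++];              /* fill the pipeline */
    for (;;) {
        struct insn next = mem[pc++];         /* speculative fetch of ADDR+1 */
        if (cur.op == HALT)  return acc;
        if (cur.op == ADD)   acc += cur.op0;
        if (cur.op == PATCH) mem[cur.op0].op0 = cur.op1;  /* too late for 'next' */
        cur = next;
    }
}

int main(void) {
    load_program();
    printf("sequential:  acc = %d\n", run_sequential());   /* prints 100 */
    load_program();
    printf("prefetching: acc = %d\n", run_prefetching());  /* prints 5   */
    return 0;
}

The sequential model prints 100, the prefetching model prints 5, because the second model fetched ADDR+1 before the write at ADDR became visible. A real CPU that wants option (1) below has to detect this case and redo the fetch.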
>But you're wrong about this as an example of how pipelining requires speculation
>because CPUs disallow such sequences without cache and pipeline draining instructions.
There are two ways to design a pipelined version of a CPU:
1. Maintain full compatibility between the pipelined and non-pipelined versions, even for unlikely corner cases such as heavy self-modifying code. The two CPUs give mathematically identical results in all possible cases.
2. Do not maintain full compatibility. One example of this approach is the series of transitions from the 8086 to the 286, 386, 486, and so on.
I was mainly talking about case (1).
From a historical perspective, highly pipelined x86 CPUs appeared after the non-pipelined ones. The idea you are advocating here contradicts that history, because the instruction set of the older CPUs had (=past) no notion of pipelining. Of course, now (=present) that we can see the bigger picture, anyone designing a non-pipelined CPU would be wise to add some pipeline-management instructions to the instruction set, just in case a pipelined version appears later. But for old CPU designs and old code, you cannot travel back in time to the 1970s.
>Exceptions are another one, but again it could have imprecise exceptions and not require speculative execution.
Well, but you cannot go back to 1970 and fix the 8086 so that it knows about the concept of imprecise exceptions.
>> Writes to registers are OK from this point of view, since the CPU
>>never fetches an instruction from there. In CPUs which are able to do data speculation,
>>even writing a register might cause partial pipeline stalls. (I don't know why I
>>am writing this here, because it seems obvious.)
>>
>>If you think pipelining in a universal-computation CPU has nothing to do with speculation,
>>you are simply wrong.
>
>On the contrary, I think your assertion that pipelining requires speculative execution is wrong.
I am not saying that. I am saying this: pipelining, as implemented in current CPUs (e.g. Core2), involves certain assumptions about what patterns there will be when the code is actually executed. Those assumptions are made without any prior attempt to check whether the code actually matches those patterns. The conversation between the CPU and the code looks something like this:
Code: Hi there, CPU, my friend. I want you to execute me.
CPU: No problem. Hand me the first couple of your instructions.
Code: Only a couple of them? You mean you don't want to see all of them?
CPU: That's right.
Code: But, but... then you aren't going to know what I am going to do.
CPU: Don't worry. I will manage.
Code: How?
CPU: Look pal, I know your kind. You will tell me to execute a bunch of moderately long sequences of adjacent instructions. Plus some branches here and there.
Code: Maybe, but what if you are wrong about me? Your assumptions seem to me like some kind of speculation about my nature.
CPU: Don't be so shy and hand me the initial couple of instructions!
Code: What if I don't ...
CPU: You don't have a choice ...
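For the record, the "I know your kind" guess can be written down as a toy model too. This sketch (the trace and the DEPTH value are made up by me; DEPTH just stands for how far the front end runs ahead) only counts how many sequentially-fetched instructions get thrown away whenever a taken branch redirects the stream:

/* A minimal sketch (my own illustration) of sequential-fetch speculation:
 * the front end keeps fetching PC+1, PC+2, ... and discards everything
 * fetched down the wrong path when a taken branch redirects it. */
#include <stdio.h>

#define DEPTH 4   /* how many instructions the front end runs ahead (arbitrary) */

int main(void) {
    /* A pretend dynamic instruction stream: 0 = falls through to the
     * next address, 1 = taken branch to some other address.          */
    int taken[] = { 0,0,0,1, 0,0,1, 0,0,0,0,0,1, 0 };
    int n = (int)(sizeof taken / sizeof taken[0]);

    long executed = 0, wasted = 0;
    for (int i = 0; i < n; i++) {
        executed++;
        if (taken[i]) {
            /* The DEPTH instructions fetched beyond this point were
             * guessed sequentially and must be discarded (a flush). */
            wasted += DEPTH;
        }
    }
    printf("executed %ld, fetched-and-discarded %ld (%.0f%% of all fetches)\n",
           executed, wasted, 100.0 * wasted / (executed + wasted));
    return 0;
}

The guess is usually right, which is exactly why it is worth making; but it is still a guess made before the CPU has seen the code, which is the only point I am arguing.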
>> On the other hand, non-speculative pipelining *is* possible,
>>but only if the CPU is able to mathematically prove that a particular piece of code
>>is never violating any assumptions made by the pipelined architecture. But how many
>>existing CPUs are able to do such proofs?
>
>Anecdotal evidence does not make your claim correct.
>
>>Similarly, L1/L2 caches without any traces of speculative-ness whatsoever are also
>>possible - provided the CPU is able to actually prove that the memory access patterns
>>in a particular piece of code are fully known in advance. But how many existing
>>CPUs are able to do such proofs? (Considering the design of the x86 ISA, I cannot
>say I blame them for this inability.)
>
>Caches have nothing to do with speculative execution that you were talking about.
What are you saying? That if a contemporary CPU (Core2 or whatever) decides to allocate a cache line for data at address 0x1230, it is not speculating about future uses of that piece of data?
(Note: I am *not* against caches)
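To be clear about what I mean by "speculation" here: allocating a line is a bet that nearby addresses will be touched soon. A toy direct-mapped cache makes the point; the sizes (64 lines of 64 bytes) and the two access patterns below are arbitrary choices of mine, not a model of any particular CPU:

/* A minimal sketch (my own, invented for this post) of cache-line
 * allocation as a bet on spatial locality. */
#include <stdio.h>

#define LINE_BYTES 64
#define NUM_LINES  64

static long tags[NUM_LINES];
static int  valid[NUM_LINES];

static int access_addr(long addr) {           /* returns 1 on a hit */
    long line = addr / LINE_BYTES;
    int  set  = (int)(line % NUM_LINES);
    if (valid[set] && tags[set] == line) return 1;
    valid[set] = 1;                           /* allocate: the "speculation" */
    tags[set]  = line;
    return 0;
}

static void run(const char *name, long stride) {
    long hits = 0, n = 100000;
    for (int i = 0; i < NUM_LINES; i++) valid[i] = 0;
    for (long i = 0; i < n; i++)
        hits += access_addr(i * stride);
    printf("%-12s stride %4ld: %5.1f%% hits\n", name, stride, 100.0 * hits / n);
}

int main(void) {
    run("sequential", 1);     /* the bet pays off: most accesses hit   */
    run("line-sized", 64);    /* every access misses: the bet is lost  */
    return 0;
}

With stride 1 the bet pays off (most accesses hit the freshly allocated line); with a line-sized stride every allocation is wasted work. Either way the CPU made the bet without knowing which pattern was coming.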