Article: PhysX87: Software Deficiency
By: Gabriele Svelto (gabriele.svelto.delete@this.gmail.com), July 26, 2010 2:50 am
Room: Moderated Discussions
? (0xe2.0x9a.0x9b@gmail.com) on 7/22/10 wrote:
---------------------------
>Notice that you cannot reorder the ADD and CMP because it works with the-same-and-only-one
>FLAGS register. (If you force the reordering and save the FLAGS into memory, it will do more harm than good.)
I am not sure if the current x86 implementation do it but there's an architectural way of solving this problem: renaming the FLAGS register so WAW hazards go away and you can actually execute the relevant instruction way ahead of the branch. Recent POWER processors do rename their condition registers in spite of having 8 of them so I guess x86 is also doing it even though I have no hard data to back my guess.
---------------------------
>Notice that you cannot reorder the ADD and CMP because it works with the-same-and-only-one
>FLAGS register. (If you force the reordering and save the FLAGS into memory, it will do more harm than good.)
I am not sure if the current x86 implementation do it but there's an architectural way of solving this problem: renaming the FLAGS register so WAW hazards go away and you can actually execute the relevant instruction way ahead of the branch. Recent POWER processors do rename their condition registers in spite of having 8 of them so I guess x86 is also doing it even though I have no hard data to back my guess.
Topic | Posted By | Date |
---|---|---|
A bit off base | John Mann | 2010/07/07 07:04 AM |
A bit off base | David Kanter | 2010/07/07 11:28 AM |
SSE vs x87 | Joel Hruska | 2010/07/07 12:53 PM |
SSE vs x87 | Michael S | 2010/07/07 01:07 PM |
SSE vs x87 | hobold | 2010/07/08 05:12 AM |
SSE vs x87 | David Kanter | 2010/07/07 02:55 PM |
SSE vs x87 | Andi Kleen | 2010/07/08 02:43 AM |
80 bit FP | Ricardo B | 2010/07/08 07:35 AM |
80 bit FP | David Kanter | 2010/07/08 11:14 AM |
80 bit FP | Kevin G | 2010/07/08 02:12 PM |
80 bit FP | Ian Ollmann | 2010/07/19 12:49 AM |
80 bit FP | David Kanter | 2010/07/19 11:33 AM |
80 bit FP | Anil Maliyekkel | 2010/07/19 04:49 PM |
80 bit FP | rwessel | 2010/07/19 05:41 PM |
80 bit FP | Matt Waldhauer | 2010/07/21 11:11 AM |
80 bit FP | Emil Briggs | 2010/07/22 09:06 AM |
A bit off base | John Mann | 2010/07/08 11:06 AM |
A bit off base | David Kanter | 2010/07/08 11:27 AM |
A bit off base | Ian Ameline | 2010/07/09 10:10 AM |
A bit off base | Michael S | 2010/07/10 02:13 PM |
A bit off base | Ian Ameline | 2010/07/11 07:51 AM |
A bit off base | David Kanter | 2010/07/07 09:46 PM |
A bit off base | Anon | 2010/07/08 12:47 AM |
A bit off base | anon | 2010/07/08 02:15 AM |
A bit off base | Gabriele Svelto | 2010/07/08 04:11 AM |
Physics engine history | Peter Clare | 2010/07/08 04:49 AM |
Physics engine history | Null Pointer Exception | 2010/07/08 06:07 AM |
Physics engine history | Ralf | 2010/07/08 03:09 PM |
Physics engine history | David Kanter | 2010/07/08 04:16 PM |
Physics engine history | sJ | 2010/07/08 11:36 PM |
Physics engine history | Gabriele Svelto | 2010/07/09 12:59 AM |
Physics engine history | sJ | 2010/07/13 06:35 AM |
Physics engine history | David Kanter | 2010/07/09 09:25 AM |
Physics engine history | sJ | 2010/07/13 06:49 AM |
Physics engine history | fvdbergh | 2010/07/13 07:27 AM |
A bit off base | John Mann | 2010/07/08 11:11 AM |
A bit off base | David Kanter | 2010/07/08 11:31 AM |
150 GFLOP/s measured? | anon | 2010/07/08 07:10 PM |
150 GFLOP/s measured? | David Kanter | 2010/07/08 07:53 PM |
150 GFLOP/s measured? | Aaron Spink | 2010/07/08 09:05 PM |
150 GFLOP/s measured? | anon | 2010/07/08 09:31 PM |
150 GFLOP/s measured? | Aaron Spink | 2010/07/08 10:43 PM |
150 GFLOP/s measured? | David Kanter | 2010/07/08 11:27 PM |
150 GFLOP/s measured? | Ian Ollmann | 2010/07/19 01:14 AM |
150 GFLOP/s measured? | anon | 2010/07/19 06:39 AM |
150 GFLOP/s measured? | hobold | 2010/07/19 07:26 AM |
Philosophy for achieving peak | David Kanter | 2010/07/19 11:49 AM |
150 GFLOP/s measured? | Linus Torvalds | 2010/07/19 07:36 AM |
150 GFLOP/s measured? | Richard Cownie | 2010/07/19 08:42 AM |
150 GFLOP/s measured? | Aaron Spink | 2010/07/19 08:56 AM |
150 GFLOP/s measured? | hobold | 2010/07/19 09:30 AM |
150 GFLOP/s measured? | Groo | 2010/07/19 02:31 PM |
150 GFLOP/s measured? | hobold | 2010/07/19 04:17 PM |
150 GFLOP/s measured? | Groo | 2010/07/19 06:18 PM |
150 GFLOP/s measured? | Anon | 2010/07/19 06:18 PM |
150 GFLOP/s measured? | Mark Roulo | 2010/07/19 11:47 AM |
150 GFLOP/s measured? | slacker | 2010/07/19 12:55 PM |
150 GFLOP/s measured? | Mark Roulo | 2010/07/19 01:00 PM |
150 GFLOP/s measured? | anonymous42 | 2010/07/25 12:31 PM |
150 GFLOP/s measured? | Richard Cownie | 2010/07/19 12:41 PM |
150 GFLOP/s measured? | Linus Torvalds | 2010/07/19 02:57 PM |
150 GFLOP/s measured? | Richard Cownie | 2010/07/19 04:10 PM |
150 GFLOP/s measured? | Richard Cownie | 2010/07/19 04:10 PM |
150 GFLOP/s measured? | hobold | 2010/07/19 04:25 PM |
150 GFLOP/s measured? | Linus Torvalds | 2010/07/19 04:31 PM |
150 GFLOP/s measured? | Richard Cownie | 2010/07/20 06:04 AM |
150 GFLOP/s measured? | jrl | 2010/07/20 01:18 AM |
150 GFLOP/s measured? | anonymous42 | 2010/07/25 12:00 PM |
150 GFLOP/s measured? | David Kanter | 2010/07/25 12:52 PM |
150 GFLOP/s measured? | Anon | 2010/07/19 06:15 PM |
150 GFLOP/s measured? | Linus Torvalds | 2010/07/19 07:27 PM |
150 GFLOP/s measured? | Anon | 2010/07/19 09:54 PM |
150 GFLOP/s measured? | anon | 2010/07/19 11:45 PM |
150 GFLOP/s measured? | hobold | 2010/07/19 09:14 AM |
150 GFLOP/s measured? | Linus Torvalds | 2010/07/19 11:56 AM |
150 GFLOP/s measured? | a reader | 2010/07/21 08:16 PM |
150 GFLOP/s measured? | Linus Torvalds | 2010/07/21 09:05 PM |
150 GFLOP/s measured? | anon | 2010/07/22 02:09 AM |
150 GFLOP/s measured? | a reader | 2010/07/22 07:53 PM |
150 GFLOP/s measured? | gallier2 | 2010/07/23 05:58 AM |
150 GFLOP/s measured? | a reader | 2010/07/25 08:35 AM |
150 GFLOP/s measured? | David Kanter | 2010/07/25 11:49 AM |
150 GFLOP/s measured? | a reader | 2010/07/26 07:03 PM |
150 GFLOP/s measured? | Michael S | 2010/07/28 01:38 AM |
150 GFLOP/s measured? | Gabriele Svelto | 2010/07/28 01:44 AM |
150 GFLOP/s measured? | anon | 2010/07/23 04:55 PM |
150 GFLOP/s measured? | slacker | 2010/07/24 12:48 AM |
150 GFLOP/s measured? | anon | 2010/07/24 02:36 AM |
150 GFLOP/s measured? | Vincent Diepeveen | 2010/07/27 05:37 PM |
150 GFLOP/s measured? | ? | 2010/07/27 11:42 PM |
150 GFLOP/s measured? | slacker | 2010/07/28 05:55 AM |
Intel's clock rate projections | AM | 2010/07/28 02:03 AM |
nostalgia ain't what it used to be | someone | 2010/07/28 05:38 AM |
Intel's clock rate projections | AM | 2010/07/28 10:12 PM |
Separate the OoO-ness from speculative-ness | ? | 2010/07/20 07:19 AM |
Separate the OoO-ness from speculative-ness | Mark Christiansen | 2010/07/20 02:26 PM |
Separate the OoO-ness from speculative-ness | slacker | 2010/07/20 06:04 PM |
Separate the OoO-ness from speculative-ness | Matt Sayler | 2010/07/20 06:10 PM |
Separate the OoO-ness from speculative-ness | slacker | 2010/07/20 09:37 PM |
Separate the OoO-ness from speculative-ness | ? | 2010/07/20 11:51 PM |
Separate the OoO-ness from speculative-ness | anon | 2010/07/21 02:16 AM |
Separate the OoO-ness from speculative-ness | ? | 2010/07/21 07:05 AM |
Software conventions | Paul A. Clayton | 2010/07/21 08:52 AM |
Software conventions | ? | 2010/07/22 05:43 AM |
Speculation | David Kanter | 2010/07/21 10:32 AM |
Pipelining affects the ISA | ? | 2010/07/22 10:58 PM |
Pipelining affects the ISA | ? | 2010/07/22 11:14 PM |
Pipelining affects the ISA | rwessel | 2010/07/23 12:03 AM |
Pipelining affects the ISA | ? | 2010/07/23 05:50 AM |
Pipelining affects the ISA | ? | 2010/07/23 06:10 AM |
Pipelining affects the ISA | Thiago Kurovski | 2010/07/23 02:59 PM |
Pipelining affects the ISA | anon | 2010/07/24 07:35 AM |
Pipelining affects the ISA | Thiago Kurovski | 2010/07/24 11:12 AM |
Pipelining affects the ISA | Gabriele Svelto | 2010/07/26 02:50 AM |
Pipelining affects the ISA | IlleglWpns | 2010/07/26 05:14 AM |
Pipelining affects the ISA | Michael S | 2010/07/26 03:33 PM |
Separate the OoO-ness from speculative-ness | anon | 2010/07/21 05:53 PM |
Separate the OoO-ness from speculative-ness | ? | 2010/07/22 04:15 AM |
Separate the OoO-ness from speculative-ness | anon | 2010/07/22 04:27 AM |
Separate the OoO-ness from speculative-ness | slacker | 2010/07/21 07:45 PM |
Separate the OoO-ness from speculative-ness | anon | 2010/07/22 01:57 AM |
Separate the OoO-ness from speculative-ness | ? | 2010/07/22 05:26 AM |
Separate the OoO-ness from speculative-ness | Dan Downs | 2010/07/22 08:14 AM |
Confusing and not very useful definition | David Kanter | 2010/07/22 12:41 PM |
Confusing and not very useful definition | ? | 2010/07/22 10:58 PM |
Confusing and not very useful definition | Ungo | 2010/07/24 12:06 PM |
Confusing and not very useful definition | ? | 2010/07/25 10:23 PM |
Separate the OoO-ness from speculative-ness | someone | 2010/07/20 08:02 PM |
Separate the OoO-ness from speculative-ness | Thiago Kurovski | 2010/07/21 04:13 PM |
You are just quoting SINGLE precision flops? OMG what planet do you live? | Vincent Diepeveen | 2010/07/19 10:26 AM |
The prior poster was talking about SP (NT) | David Kanter | 2010/07/19 11:34 AM |
All FFT's need double precision | Vincent Diepeveen | 2010/07/19 02:02 PM |
All FFT's need double precision | David Kanter | 2010/07/19 02:09 PM |
All FFT's need double precision | Vincent Diepeveen | 2010/07/19 04:06 PM |
All FFT's need double precision - not | Michael S | 2010/07/20 01:16 AM |
All FFT's need double precision - not | Ungo | 2010/07/21 12:04 AM |
All FFT's need double precision - not | Michael S | 2010/07/21 02:35 PM |
All FFT's need double precision - not | EduardoS | 2010/07/21 02:52 PM |
All FFT's need double precision - not | Anon | 2010/07/21 05:23 PM |
All FFT's need double precision - not | Ricardo B | 2010/07/26 07:46 AM |
I'm on a boat! | anon | 2010/07/22 11:42 AM |
All FFT's need double precision - not | Vincent Diepeveen | 2010/07/24 11:39 PM |
All FFT's need double precision - not | slacker | 2010/07/25 03:27 AM |
All FFT's need double precision - not | Ricardo B | 2010/07/26 07:40 AM |
All FFT's need double precision - not | EduardoS | 2010/07/25 08:37 AM |
All FFT's need double precision - not | Michael S | 2010/07/25 10:43 AM |
All FFT's need double precision - not | Vincent Diepeveen | 2010/07/24 11:19 PM |
A bit off base | EduardoS | 2010/07/08 04:08 PM |
A bit off base | Groo | 2010/07/08 06:11 PM |
A bit off base | john mann | 2010/07/08 06:58 PM |
All right...let's cool it... | David Kanter | 2010/07/08 07:54 PM |
A bit off base | Vincent Diepeveen | 2010/07/19 03:36 PM |