Pipelining affects the ISA

Article: PhysX87: Software Deficiency
By: ? (0xe2.0x9a.0x9b.delete@this.gmail.com), July 23, 2010 5:50 am
Room: Moderated Discussions
rwessel (robertwessel@yahoo.com) on 7/23/10 wrote:
>? (0xe2.0x9a.0x9b@gmail.com) on 7/22/10 wrote:
>>David Kanter (dkanter@realworldtech.com) on 7/21/10 wrote:
>>>>Well, but you cannot go back to 1970 and fix 8086 so that >it knows about the concept of imprecise exceptions.
>>>What pipeline management instructions would you want?
>>Maybe this isn't going to exactly answer the question you are asking, but it is fairly close in general:
>>Well, for example it seems clear to me that the ISA of a pipelined CPU should have
>>two flags registers. One for doing computations which are feeding results into branch
>>instructions and the other one for doing computations unrelated to branching. Take this code for example:

>>for(int i=0; i<1000; i++)
>>a += array[i]

>>The corresponding x86 code (one flags-register) is:

>>EAX := array
>>EBX := a
>>CMP ECX,1000
>>JMP_smaller NEXT

>>Notice that you cannot reorder the ADD and CMP because it works with the-same-and-only-one
>>FLAGS register. (If you force the reordering and save the FLAGS into memory, it will do more harm than good.)
>>If you have two flag registers (F1,F2):

>>CMP[F1] ECX,1000
>>JMP_smaller[F1] NEXT

>>The distinction is that in the latter case CMP happens one cycle sooner than in
>>the former case. This may not sound like much, but in a pipelined CPU it means that
>>the jump's target will be resolved (as in: known for a fact) one cycle sooner.
>>Intel's Core2 with CMP+JMP fusion? That seems like a bad joke. Well, it speeds
>>up x86 code that was compiled for the model with only one FLAGS register, no doubts
>>about that. But as can be seen in the latter asm code, CMP+JMP fusion does not make
>>much sense once you have multiple FLAGS registers.
>>What about AMD's Bulldozer? It seems they are going to just blindly copy the CMP+JMP
>>fusion concept (to speed up existing codes). But are they going to e.g. introduce
>>a new instruction prefix for selecting the FLAGS register, bring some innovation into this field and beat Intel? Nooo.
>>When Intel was designing the 386, they most likely already knew it is going to
>>be a pipelined CPU. They could have put two FLAGS registers in there. Call me a
>>skeptic, but I think the reason why the ISA has only one such register was that
>>the ISA designers had no idea what they were doing (at least in this particular case).
>>One of the main consequences is that having two flags registers lowers the pressure
>>put on the branch prediction engine. There would even exist cases in which you can
>>keep the pipeline fully utilized in the presence of conditional branch instructions
>>in the code - without resorting to any speculations about the most likely branch targets ...
>Well, PPC already does that, and it does help code in some cases, but it doesn't
>seem like a major win, and no one is really rushing to copy that. Of course the
>RISC approach of not having a flags register at all, is another solution.
>Of more general use than a duplicate flags register would be more instructions
>that don't set the flags. And within the existing scope of x86, you can do that
>yourself (by replacing the add with a mov/lea sequence), or minimize the effect
>by unrolling. Of course those techniques might not apply to any particular case,
>but they do to this one, which goes to show that short code snippets are not worth
>too much when it comes to making architectural decisions.
< Previous Post in ThreadNext Post in Thread >
TopicPosted ByDate
A bit off baseJohn Mann2010/07/07 07:04 AM
  A bit off baseDavid Kanter2010/07/07 11:28 AM
    SSE vs x87Joel Hruska2010/07/07 12:53 PM
      SSE vs x87Michael S2010/07/07 01:07 PM
        SSE vs x87hobold2010/07/08 05:12 AM
      SSE vs x87David Kanter2010/07/07 02:55 PM
        SSE vs x87Andi Kleen2010/07/08 02:43 AM
          80 bit FPRicardo B2010/07/08 07:35 AM
            80 bit FPDavid Kanter2010/07/08 11:14 AM
              80 bit FPKevin G2010/07/08 02:12 PM
                80 bit FPIan Ollmann2010/07/19 12:49 AM
                  80 bit FPDavid Kanter2010/07/19 11:33 AM
                    80 bit FPAnil Maliyekkel2010/07/19 04:49 PM
                      80 bit FPrwessel2010/07/19 05:41 PM
                    80 bit FPMatt Waldhauer2010/07/21 11:11 AM
            80 bit FPEmil Briggs2010/07/22 09:06 AM
    A bit off baseJohn Mann2010/07/08 11:06 AM
      A bit off baseDavid Kanter2010/07/08 11:27 AM
        A bit off baseIan Ameline2010/07/09 10:10 AM
          A bit off baseMichael S2010/07/10 02:13 PM
            A bit off baseIan Ameline2010/07/11 07:51 AM
  A bit off baseDavid Kanter2010/07/07 09:46 PM
    A bit off baseAnon2010/07/08 12:47 AM
      A bit off baseanon2010/07/08 02:15 AM
        A bit off baseGabriele Svelto2010/07/08 04:11 AM
          Physics engine historyPeter Clare2010/07/08 04:49 AM
            Physics engine historyNull Pointer Exception2010/07/08 06:07 AM
              Physics engine historyRalf2010/07/08 03:09 PM
                Physics engine historyDavid Kanter2010/07/08 04:16 PM
                  Physics engine historysJ2010/07/08 11:36 PM
                    Physics engine historyGabriele Svelto2010/07/09 12:59 AM
                      Physics engine historysJ2010/07/13 06:35 AM
                    Physics engine historyDavid Kanter2010/07/09 09:25 AM
                      Physics engine historysJ2010/07/13 06:49 AM
                      Physics engine historyfvdbergh2010/07/13 07:27 AM
    A bit off baseJohn Mann2010/07/08 11:11 AM
      A bit off baseDavid Kanter2010/07/08 11:31 AM
        150 GFLOP/s measured?anon2010/07/08 07:10 PM
          150 GFLOP/s measured?David Kanter2010/07/08 07:53 PM
            150 GFLOP/s measured?Aaron Spink2010/07/08 09:05 PM
              150 GFLOP/s measured?anon2010/07/08 09:31 PM
                150 GFLOP/s measured?Aaron Spink2010/07/08 10:43 PM
                  150 GFLOP/s measured?David Kanter2010/07/08 11:27 PM
                    150 GFLOP/s measured?Ian Ollmann2010/07/19 01:14 AM
                      150 GFLOP/s measured?anon2010/07/19 06:39 AM
                        150 GFLOP/s measured?hobold2010/07/19 07:26 AM
                          Philosophy for achieving peakDavid Kanter2010/07/19 11:49 AM
                      150 GFLOP/s measured?Linus Torvalds2010/07/19 07:36 AM
                        150 GFLOP/s measured?Richard Cownie2010/07/19 08:42 AM
                          150 GFLOP/s measured?Aaron Spink2010/07/19 08:56 AM
                            150 GFLOP/s measured?hobold2010/07/19 09:30 AM
                              150 GFLOP/s measured?Groo2010/07/19 02:31 PM
                                150 GFLOP/s measured?hobold2010/07/19 04:17 PM
                                  150 GFLOP/s measured?Groo2010/07/19 06:18 PM
                              150 GFLOP/s measured?Anon2010/07/19 06:18 PM
                            150 GFLOP/s measured?Mark Roulo2010/07/19 11:47 AM
                              150 GFLOP/s measured?slacker2010/07/19 12:55 PM
                                150 GFLOP/s measured?Mark Roulo2010/07/19 01:00 PM
                              150 GFLOP/s measured?anonymous422010/07/25 12:31 PM
                            150 GFLOP/s measured?Richard Cownie2010/07/19 12:41 PM
                              150 GFLOP/s measured?Linus Torvalds2010/07/19 02:57 PM
                                150 GFLOP/s measured?Richard Cownie2010/07/19 04:10 PM
                                150 GFLOP/s measured?Richard Cownie2010/07/19 04:10 PM
                                  150 GFLOP/s measured?hobold2010/07/19 04:25 PM
                                  150 GFLOP/s measured?Linus Torvalds2010/07/19 04:31 PM
                                    150 GFLOP/s measured?Richard Cownie2010/07/20 06:04 AM
                                150 GFLOP/s measured?jrl2010/07/20 01:18 AM
                            150 GFLOP/s measured?anonymous422010/07/25 12:00 PM
                              150 GFLOP/s measured?David Kanter2010/07/25 12:52 PM
                          150 GFLOP/s measured?Anon2010/07/19 06:15 PM
                            150 GFLOP/s measured?Linus Torvalds2010/07/19 07:27 PM
                              150 GFLOP/s measured?Anon2010/07/19 09:54 PM
                                150 GFLOP/s measured?anon2010/07/19 11:45 PM
                        150 GFLOP/s measured?hobold2010/07/19 09:14 AM
                          150 GFLOP/s measured?Linus Torvalds2010/07/19 11:56 AM
                            150 GFLOP/s measured?a reader2010/07/21 08:16 PM
                              150 GFLOP/s measured?Linus Torvalds2010/07/21 09:05 PM
                                150 GFLOP/s measured?anon2010/07/22 02:09 AM
                                  150 GFLOP/s measured?a reader2010/07/22 07:53 PM
                                    150 GFLOP/s measured?gallier22010/07/23 05:58 AM
                                      150 GFLOP/s measured?a reader2010/07/25 08:35 AM
                                        150 GFLOP/s measured?David Kanter2010/07/25 11:49 AM
                                          150 GFLOP/s measured?a reader2010/07/26 07:03 PM
                                            150 GFLOP/s measured?Michael S2010/07/28 01:38 AM
                                              150 GFLOP/s measured?Gabriele Svelto2010/07/28 01:44 AM
                                    150 GFLOP/s measured?anon2010/07/23 04:55 PM
                                      150 GFLOP/s measured?slacker2010/07/24 12:48 AM
                                        150 GFLOP/s measured?anon2010/07/24 02:36 AM
                                    150 GFLOP/s measured?Vincent Diepeveen2010/07/27 05:37 PM
                                      150 GFLOP/s measured??2010/07/27 11:42 PM
                                        150 GFLOP/s measured?slacker2010/07/28 05:55 AM
                                      Intel's clock rate projectionsAM2010/07/28 02:03 AM
                                        nostalgia ain't what it used to besomeone2010/07/28 05:38 AM
                                          Intel's clock rate projectionsAM2010/07/28 10:12 PM
                        Separate the OoO-ness from speculative-ness?2010/07/20 07:19 AM
                          Separate the OoO-ness from speculative-nessMark Christiansen2010/07/20 02:26 PM
                          Separate the OoO-ness from speculative-nessslacker2010/07/20 06:04 PM
                            Separate the OoO-ness from speculative-nessMatt Sayler2010/07/20 06:10 PM
                              Separate the OoO-ness from speculative-nessslacker2010/07/20 09:37 PM
                                Separate the OoO-ness from speculative-ness?2010/07/20 11:51 PM
                                  Separate the OoO-ness from speculative-nessanon2010/07/21 02:16 AM
                                    Separate the OoO-ness from speculative-ness?2010/07/21 07:05 AM
                                      Software conventionsPaul A. Clayton2010/07/21 08:52 AM
                                        Software conventions?2010/07/22 05:43 AM
                                      SpeculationDavid Kanter2010/07/21 10:32 AM
                                        Pipelining affects the ISA?2010/07/22 10:58 PM
                                          Pipelining affects the ISA?2010/07/22 11:14 PM
                                          Pipelining affects the ISArwessel2010/07/23 12:03 AM
                                            Pipelining affects the ISA?2010/07/23 05:50 AM
                                            Pipelining affects the ISA?2010/07/23 06:10 AM
                                              Pipelining affects the ISAThiago Kurovski2010/07/23 02:59 PM
                                                Pipelining affects the ISAanon2010/07/24 07:35 AM
                                                  Pipelining affects the ISAThiago Kurovski2010/07/24 11:12 AM
                                          Pipelining affects the ISAGabriele Svelto2010/07/26 02:50 AM
                                            Pipelining affects the ISAIlleglWpns2010/07/26 05:14 AM
                                              Pipelining affects the ISAMichael S2010/07/26 03:33 PM
                                      Separate the OoO-ness from speculative-nessanon2010/07/21 05:53 PM
                                        Separate the OoO-ness from speculative-ness?2010/07/22 04:15 AM
                                          Separate the OoO-ness from speculative-nessanon2010/07/22 04:27 AM
                                      Separate the OoO-ness from speculative-nessslacker2010/07/21 07:45 PM
                                        Separate the OoO-ness from speculative-nessanon2010/07/22 01:57 AM
                                        Separate the OoO-ness from speculative-ness?2010/07/22 05:26 AM
                                          Separate the OoO-ness from speculative-nessDan Downs2010/07/22 08:14 AM
                                          Confusing and not very useful definitionDavid Kanter2010/07/22 12:41 PM
                                            Confusing and not very useful definition?2010/07/22 10:58 PM
                                              Confusing and not very useful definitionUngo2010/07/24 12:06 PM
                                                Confusing and not very useful definition?2010/07/25 10:23 PM
                            Separate the OoO-ness from speculative-nesssomeone2010/07/20 08:02 PM
                              Separate the OoO-ness from speculative-nessThiago Kurovski2010/07/21 04:13 PM
            You are just quoting SINGLE precision flops? OMG what planet do you live? Vincent Diepeveen2010/07/19 10:26 AM
              The prior poster was talking about SP (NT)David Kanter2010/07/19 11:34 AM
                All FFT's need double precisionVincent Diepeveen2010/07/19 02:02 PM
                  All FFT's need double precisionDavid Kanter2010/07/19 02:09 PM
                    All FFT's need double precisionVincent Diepeveen2010/07/19 04:06 PM
                  All FFT's need double precision - notMichael S2010/07/20 01:16 AM
                    All FFT's need double precision - notUngo2010/07/21 12:04 AM
                      All FFT's need double precision - notMichael S2010/07/21 02:35 PM
                      All FFT's need double precision - notEduardoS2010/07/21 02:52 PM
                        All FFT's need double precision - notAnon2010/07/21 05:23 PM
                          All FFT's need double precision - notRicardo B2010/07/26 07:46 AM
                        I'm on a boat!anon2010/07/22 11:42 AM
                        All FFT's need double precision - notVincent Diepeveen2010/07/24 11:39 PM
                          All FFT's need double precision - notslacker2010/07/25 03:27 AM
                            All FFT's need double precision - notRicardo B2010/07/26 07:40 AM
                          All FFT's need double precision - notEduardoS2010/07/25 08:37 AM
                            All FFT's need double precision - notMichael S2010/07/25 10:43 AM
                    All FFT's need double precision - notVincent Diepeveen2010/07/24 11:19 PM
      A bit off baseEduardoS2010/07/08 04:08 PM
        A bit off baseGroo2010/07/08 06:11 PM
          A bit off basejohn mann2010/07/08 06:58 PM
            All right...let's cool it...David Kanter2010/07/08 07:54 PM
    A bit off baseVincent Diepeveen2010/07/19 03:36 PM
Reply to this Topic
Body: No Text
How do you spell avocado?