Move elimination can be a µop fusion

Article: Intel's Haswell CPU Microarchitecture
By: Stubabe (nospam.delete@this.nospam.com), November 14, 2012 12:43 pm
Room: Moderated Discussions
Paul A. Clayton (paaronclayton.delete@this.gmail.com) on November 14, 2012 6:41 am wrote:
> anon (anon.delete@this.anon.com) on November 14, 2012 5:10 am wrote:
> [snip]
> > When it is said that the front-end handles simple reg,reg moves
> > and saves OOOE resources, what does this mean exactly?
> >
> > Presumably such instruction has to be at least tracked in the ROB somehow. So it may save a physical
> > register and an execution unit, but it's not entirely eliminated from OOOE part. Or am I way off base?
>
> A move that occurs immediately before an instruction which uses the destination of the move
> as a source/destination can effectively be fused with the later instruction. E.g.:
>
> mov R9 add R10
> can be transformed into:
>
> add R10
> which could occupy a single ROB entry. (Note that each instruction need not maintain a unique
> ROB entry; ISTR that POWER4 hold up to four instructions plus a branch in each ROB entry.)
>
> Unrestricted move elimination would be more complex. However, even if such requires a ROB entry, it removes
> the move from the dependence chain--zero-cycle move--(as well as saving a register, temporary use of a scheduler
> slot, execution unit use, and possibly register file reads). Avoiding execution saves power, of course.
>
>

Surely it's just handled by the Register Alias Table?

i.e. suppose PR100 (a physical register) holds R9, PR90 holds R15, and PR101 is the next free register.

Without MOV elimination
-----------------------
MOV R10 <- R9
The destination R10 would allocate PR101 and the source R9 renames to PR100
ADD R10 <- R10, R15
The destination R10 would allocate PR102; the source R10 renames to PR101 and R15 to PR90

so the sequence becomes:
uMOV PR101, PR100
uADD PR102, PR101, PR90

With MOV elimination
--------------------
MOV R10 <- R9
R10 would now rename to PR100, the same physical register as R9, and nothing is allocated
ADD R10 <- R10, R15
The destination R10 would allocate PR101; the source R10 renames to PR100 and R15 to PR90

so the sequence becomes:
uADD PR101, PR100, PR90

This way there is no need to fuse adjacent instructions, R10 can have multiple dependents, and the MOV itself can get NOPed.
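
To make the two cases concrete, here is a toy renamer in Python. It is only a sketch of the idea above, not a description of what Haswell actually does: the function name `rename`, the tuple format for instructions, and the `PRn` naming are all made up for illustration, and it ignores the ROB and the freeing of physical registers entirely.

```python
# Toy register renamer. Everything here is a hypothetical illustration of
# RAT-based move elimination, not a model of any real core.

def rename(program, rat, next_free, eliminate_moves):
    """Rename (op, dest, sources) instructions into physical-register uops.

    rat maps architectural register names to physical register names.
    next_free is the number of the next free physical register.
    Returns the list of emitted uops and the final RAT.
    """
    rat = dict(rat)  # work on a copy so the caller's table is untouched
    uops = []
    for op, dest, srcs in program:
        # Sources are looked up before the destination is remapped.
        phys_srcs = [rat[s] for s in srcs]
        if op == "MOV" and eliminate_moves:
            # Eliminated move: point dest at the source's physical
            # register and emit no uop for execution.
            rat[dest] = phys_srcs[0]
            continue
        # Normal case: the destination gets a fresh physical register.
        rat[dest] = f"PR{next_free}"
        next_free += 1
        uops.append((f"u{op}", rat[dest], *phys_srcs))
    return uops, rat


# The example from this post: PR100 holds R9, PR90 holds R15, PR101 is free.
PROGRAM = [("MOV", "R10", ["R9"]), ("ADD", "R10", ["R10", "R15"])]
RAT = {"R9": "PR100", "R15": "PR90"}

for flag in (False, True):
    uops, _ = rename(PROGRAM, RAT, 101, eliminate_moves=flag)
    print("with elimination" if flag else "without elimination")
    for uop in uops:
        print("  ", uop[0], ", ".join(uop[1:]))
```

One thing the sketch leaves out: once two architectural registers share a physical register, that physical register cannot be freed until both mappings have been overwritten, so a real implementation needs some way to track the sharing.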