Move elimination can be a µop fusion

Article: Intel's Haswell CPU Microarchitecture
By: Stubabe (nospam.delete@this.nospam.com), November 14, 2012 12:43 pm
Room: Moderated Discussions
Paul A. Clayton (paaronclayton.delete@this.gmail.com) on November 14, 2012 6:41 am wrote:
> anon (anon.delete@this.anon.com) on November 14, 2012 5:10 am wrote:
> [snip]
> > When it is said that the front-end handles simple reg,reg moves
> > and saves OOOE resources, what does this mean exactly?
> >
> > Presumably such instruction has to be at least tracked in the ROB somehow. So it may save a physical
> > register and an execution unit, but it's not entirely eliminated from OOOE part. Or am I way off base?
>
> A move that occurs immediately before an instruction which uses the destination of the move
> as a source/destination can effectively be fused with the later instruction. E.g.:
>
> mov R9 add R10
> can be transformed into:
>
> add R10
> which could occupy a single ROB entry. (Note that each instruction need not maintain a unique
> ROB entry; ISTR that POWER4 holds up to four instructions plus a branch in each ROB entry.)
>
> Unrestricted move elimination would be more complex. However, even if such requires a ROB entry, it removes
> the move from the dependence chain--zero-cycle move--(as well as saving a register, temporary use of a scheduler
> slot, execution unit use, and possibly register file reads). Avoiding execution saves power, of course.
>
>

Surely it's just handled by the Register Alias Table?

i.e.
if PR100 (physical register) holds R9, PR90 holds R15 and PR101 is the next free reg

Without MOV elimination
-----------------------
MOV R10 <- R9
R10 would allocate PR101 and R9 renames to PR100
ADD R10 <- R10, R15
destination R10 would allocate PR102; source R10 renames to PR101 and R15 to PR90

so the sequence becomes:
uMOV PR101, PR100
uADD PR102, PR101, PR90

With MOV elimination
--------------------
MOV R10 <- R9
R10 would now rename to PR100, the same physical register as R9 (nothing is allocated)
ADD R10 <- R10, R15
destination R10 would allocate PR101; source R10 renames to PR100 and R15 to PR90

so the sequence becomes:
uADD PR101, PR100, PR90

This way there is no need to fuse adjacent instructions, R10 can have multiple dependent instructions, and the MOV itself can get NOPed.
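
For anyone who wants to play with it, here is a minimal Python sketch of the two cases above. It is a toy model only: the Renamer class, its method names and the rename() helper are made up for illustration, and it says nothing about how any real core implements its RAT. In particular it ignores how a physical register shared by two architectural names eventually gets freed.

```python
from dataclasses import dataclass, field


@dataclass
class Renamer:
    """Toy register alias table: architectural name -> physical register number."""
    rat: dict
    next_free: int
    eliminate_moves: bool = False
    uops: list = field(default_factory=list)

    def _alloc(self):
        # Hand out the next free physical register (no free list, no recycling).
        pr = self.next_free
        self.next_free += 1
        return pr

    def mov(self, dst, src):
        src_pr = self.rat[src]
        if self.eliminate_moves:
            # Alias dst to the source's physical register; emit no uop.
            self.rat[dst] = src_pr
            return
        dst_pr = self._alloc()
        self.rat[dst] = dst_pr
        self.uops.append(f"uMOV PR{dst_pr}, PR{src_pr}")

    def add(self, dst, a, b):
        # Sources are looked up before the destination mapping is overwritten.
        a_pr, b_pr = self.rat[a], self.rat[b]
        dst_pr = self._alloc()
        self.rat[dst] = dst_pr
        self.uops.append(f"uADD PR{dst_pr}, PR{a_pr}, PR{b_pr}")


def rename(eliminate_moves):
    # Starting state from the post: PR100 holds R9, PR90 holds R15, PR101 is next free.
    r = Renamer(rat={"R9": 100, "R15": 90}, next_free=101,
                eliminate_moves=eliminate_moves)
    r.mov("R10", "R9")
    r.add("R10", "R10", "R15")
    return r.uops


if __name__ == "__main__":
    print(rename(False))  # ['uMOV PR101, PR100', 'uADD PR102, PR101, PR90']
    print(rename(True))   # ['uADD PR101, PR100, PR90']
```

Note that nothing in mov() cares whether the consumer is the very next instruction, which is the point: the aliasing happens in the map, not by pairing up neighbours.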