CMOV is 3 operand given register renaming

Article: AMD's Jaguar Microarchitecture
By: Paul A. Clayton (paaronclayton.delete@this.gmail.com), April 2, 2014 9:11 am
Room: Moderated Discussions
SHK (nomail.delete@this.mail.com) on April 2, 2014 6:45 am wrote:
[snip]
> Yes, in all the AMD processors I'm aware of, cmovcc has 1 cycle latency, which
> is the "natural" latency for a cmov/select type of instruction.
> What I find strange is that on all Intel CPUs (except P4, which was way worse) cmovcc
> is 2 cycles. Agner's explanation is that all instructions with more than 2 sources are cracked
> into two u-ops and cmov has 3 sources (something like dst:=select(src1,src2,eflags)),
> which seems reasonable, but doesn't explain why AMD's CPUs don't have this problem.
> My 2 cents is that cmovcc could be implemented with only 2 reg sources (dst:=cmov(src1,eflags))
> and, if the condition fails, turned into a no-op. But I have no proof of this.

With ordinary register renaming using a Register Alias Table, this would not be possible, since dependent operations need to read the name of the destination register at rename time, before the condition is known. If the old name is kept in the RAT (to support the no-op case) and the move is performed, all dependent operations would have the wrong name; if a new name is inserted (as for other instructions with a register destination) and the move is not performed, all dependent operations would likewise have the wrong name. The conditional move therefore has to read the old destination value as a third source and always write the new name.
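To make the renaming argument concrete, here is a minimal toy sketch of a RAT. The class and function names (`Renamer`, `rename_cmov`, `execute_cmov`) and the "p0, p1, ..." physical register naming are my own illustrative assumptions, not a description of any actual AMD or Intel design. It only shows that the RAT must be updated before the condition is known, so the old destination value becomes a third source.

```python
# Toy model of RAT-based renaming. All names are hypothetical and chosen
# for illustration; real renamers also handle free lists, recovery, etc.

class Renamer:
    def __init__(self, arch_regs):
        # Each architectural register starts mapped to its own physical register.
        self.rat = {r: f"p{i}" for i, r in enumerate(arch_regs)}
        self.next_phys = len(arch_regs)

    def alloc(self):
        # Hand out a fresh physical register name.
        name = f"p{self.next_phys}"
        self.next_phys += 1
        return name

    def rename_cmov(self, dst, src):
        """Rename `cmovcc dst, src`. Returns (new_dst, sources)."""
        old_dst = self.rat[dst]    # value to keep if the condition fails
        src_phys = self.rat[src]   # value to take if the condition holds
        new_dst = self.alloc()
        # The RAT is updated now, before the condition is known, so every
        # later reader of dst will look at new_dst regardless of the outcome.
        self.rat[dst] = new_dst
        return new_dst, (old_dst, src_phys, "flags")

    def rename_read(self, reg):
        # A dependent instruction reads whatever name the RAT holds now.
        return self.rat[reg]


def execute_cmov(cond, old_val, src_val):
    # The result is written to the new physical register either way,
    # which is why the old value is needed as a third input.
    return src_val if cond else old_val


r = Renamer(["eax", "ebx"])
new_dst, sources = r.rename_cmov("eax", "ebx")
dependent_reads = r.rename_read("eax")
```

In this sketch `sources` has three entries and `dependent_reads` equals `new_dst`, which is the situation described above: the dependent operation has already committed to the new name, so the conditional move cannot simply do nothing when the condition fails.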

Renaming using a priority CAM would avoid this problem. While this kind of design scales very poorly for general renaming, it might not be unthinkable for just conditional moves. However, I suspect that such an irregularity would not be worthwhile (in terms of complexity/area/power compared to benefit).

Special casing the handling of flags might allow some simplification of issuing conditional operations.

Prediction of which of the three operands will not be the last to become available would also allow a two-source issue queue to handle such operations. I suspect that the common case for conditional move and add with carry is for the old value to be available no later than the later of the flags and the other operand, so misprediction would not be common.
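The operand prediction idea can be sketched as a small timing model. This is only an illustration under assumptions of my own: the function name `issue_cycle`, the one-cycle `REPLAY_PENALTY`, and the replay-on-misprediction policy are hypothetical and not taken from any documented design.

```python
# Toy timing model of a two-tag issue queue entry holding a three-source
# operation. REPLAY_PENALTY is an assumed, illustrative value.
REPLAY_PENALTY = 1

def issue_cycle(ready_at, predicted_early):
    """ready_at: dict mapping operand name -> cycle it becomes available.
    predicted_early: the operand predicted NOT to be the last to arrive,
    so the entry does not track its tag.
    Returns (cycle the operation issues, whether the prediction was wrong)."""
    watched = [op for op in ready_at if op != predicted_early]
    # The entry wakes up once both tracked tags have been broadcast.
    wakeup = max(ready_at[op] for op in watched)
    if ready_at[predicted_early] <= wakeup:
        # Prediction held: the untracked operand was ready in time.
        return wakeup, False
    # Misprediction: the untracked operand arrived last, so (in this model)
    # the operation waits for it and pays a replay penalty.
    return ready_at[predicted_early] + REPLAY_PENALTY, True

# Common case suggested above: the old destination value is ready early.
common = issue_cycle({"old": 1, "src": 3, "flags": 4}, "old")
# Uncommon case: the old value is the last to arrive.
rare = issue_cycle({"old": 5, "src": 3, "flags": 4}, "old")
```

Here `common` issues at cycle 4 with no misprediction, while `rare` is flagged as mispredicted and issues at cycle 6 under the assumed penalty.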

Using virtual physical register renaming (where the name in the RAT can be a non-physical register name, which is then translated a second time) would allow conditional operations to be treated as no-ops when the condition fails. However, making the failed-condition case a no-op removes the result forwarding benefit that would come from reading both the old and alternative values for the conditional move instruction, requiring dependent operations to read the register file. Also, without ordinary result forwarding, instruction issue would become more complex, since only in this special case is the register to read not known until after the instruction providing the input operands has completed. (Virtual physical registers were conceived to reduce the need for physical registers, since renamed but uncompleted operations would not need physical registers. They can also be used to exploit banking in the register file, since bank selection for writes can be done at instruction completion and so generally avoid bank conflicts.)
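A rough sketch of the two-level mapping may help. Everything here is a simplifying assumption on my part: the class `VirtualPhysical`, its method names, and the choice to alias the new virtual tag to the old physical register on a failed condition. Freeing aliased physical registers (which would need something like reference counting) is ignored.

```python
# Toy model of virtual physical register renaming. All names are
# hypothetical; register reclamation is deliberately left out.

class VirtualPhysical:
    def __init__(self):
        self.rat = {}      # architectural register -> virtual tag
        self.vmap = {}     # virtual tag -> physical register (set at completion)
        self.regfile = {}  # physical register -> value
        self.next_v = 0
        self.next_p = 0

    def _new_virtual(self):
        tag = f"v{self.next_v}"
        self.next_v += 1
        return tag

    def _new_physical(self):
        reg = f"p{self.next_p}"
        self.next_p += 1
        return reg

    def write(self, arch, value):
        """Ordinary instruction: rename, then complete with a result."""
        v = self._new_virtual()
        self.rat[arch] = v
        p = self._new_physical()   # physical register chosen at completion
        self.vmap[v] = p
        self.regfile[p] = value
        return p

    def rename_cmov(self, dst, src):
        """Rename `cmovcc dst, src`; dependents will read the new virtual tag."""
        old_v, src_v = self.rat[dst], self.rat[src]
        new_v = self._new_virtual()
        self.rat[dst] = new_v
        return new_v, old_v, src_v

    def complete_cmov(self, new_v, old_v, src_v, cond):
        if cond:
            # Move performed: allocate a physical register and copy the value.
            p = self._new_physical()
            self.regfile[p] = self.regfile[self.vmap[src_v]]
            self.vmap[new_v] = p
        else:
            # Condition failed: no data movement, the new virtual tag is
            # simply pointed at the old value's physical register.
            self.vmap[new_v] = self.vmap[old_v]
        return self.vmap[new_v]


vp = VirtualPhysical()
p_old = vp.write("eax", 10)
p_src = vp.write("ebx", 20)
p_fail = vp.complete_cmov(*vp.rename_cmov("eax", "ebx"), cond=False)
p_taken = vp.complete_cmov(*vp.rename_cmov("eax", "ebx"), cond=True)
```

In this model `p_fail` is the same physical register as `p_old`, so no result is produced to forward, and a dependent operation cannot know which physical register to read until `complete_cmov` has run. That is the issue complexity noted above.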

Agner Fog indicates that since AMD uses macro-operations, cmove is only one operation: "A macro-operation can have any number of input dependencies. This means that instructions with more than two input dependencies, such as MOV [EAX+EBX],ECX, ADC EAX,EBX and CMOVBE EAX,EBX, generate only one macro-operation, while they require two micro-operations on Intel processors." (p. 163, The microarchitecture of Intel, AMD and VIA CPUs, 2013-09-04 version)
