By: ⚛ (0xe2.0x9a.0x9b.delete@this.gmail.com), June 8, 2022 8:07 am
Room: Moderated Discussions
Anon (no.delete@this.spam.com) on June 7, 2022 9:58 pm wrote:
> Brett (ggtgp.delete@this.yahoo.com) on June 7, 2022 6:04 pm wrote:
> > Add/sub from memory would be the biggest gain in code size reduction combining instructions, and x86 has it.
>
> The problem with add/sub in x86 is that it's destructive, a value in register was loaded there likely because
> it was going to be reused, by destroying it add reg, [mem] can only be used when it's the last usage of
> reg, in the case of add [mem], reg it is used, but the nature of this sequence make it rare anyway.
In some, but not all, recent x86 designs, register-to-register MOV instructions are executed early (after the µop dispatch stage and before the µop scheduling stage), so the concrete sequence "MOV reg2 := reg1; ADD reg2 += [mem]" should have the same effective latency as the hypothetical instruction "ADD reg2 := reg1 + [mem]", provided the two concrete instructions are dispatched in the same clock cycle.
As far as I know, there is no evidence suggesting that an "ADD reg2 := reg1 + [mem]" µop exists in AMD/Intel CPUs.
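To make the latency argument concrete, here is a toy dependency model in Python. It is only a sketch: the cycle counts (4-cycle load, 1-cycle ADD, 1-cycle non-eliminated MOV) and the function names are illustrative assumptions, not measurements of any particular AMD/Intel core, and it ignores dispatch width, port contention and everything else a real pipeline does.

```python
# Toy latency model. All cycle counts below are illustrative assumptions,
# not measured values for any real CPU.
LOAD_LATENCY = 4  # assumed load-to-use latency of [mem]
ADD_LATENCY = 1   # assumed ALU ADD latency
MOV_LATENCY = 1   # assumed latency of a MOV that is NOT executed early


def mov_then_add(reg1_ready, addr_ready, mov_executed_early):
    """Cycle at which reg2 is ready after: MOV reg2 := reg1; ADD reg2 += [mem]."""
    # A MOV that is executed early adds no latency to the dependency chain.
    mov_done = reg1_ready + (0 if mov_executed_early else MOV_LATENCY)
    load_done = addr_ready + LOAD_LATENCY
    # The ADD waits for both of its inputs: the copied register and the load.
    return max(mov_done, load_done) + ADD_LATENCY


def three_operand_add(reg1_ready, addr_ready):
    """Cycle at which reg2 is ready after the hypothetical ADD reg2 := reg1 + [mem]."""
    load_done = addr_ready + LOAD_LATENCY
    return max(reg1_ready, load_done) + ADD_LATENCY


if __name__ == "__main__":
    # reg1 becomes ready late (cycle 10), the address is ready at cycle 0.
    print(mov_then_add(10, 0, mov_executed_early=True))
    print(mov_then_add(10, 0, mov_executed_early=False))
    print(three_operand_add(10, 0))
```

In this model the MOV only costs anything when it is not executed early and reg1 is the last input to arrive; when the load is the slower input, the MOV is hidden behind the load in either case.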
-atom