By: ⚛ (0xe2.0x9a.0x9b.delete@this.gmail.com), June 8, 2022 12:07 pm
Room: Moderated Discussions
Anon (no.delete@this.spam.com) on June 8, 2022 10:25 am wrote:
> ⚛ (0xe2.0x9a.0x9b.delete@this.gmail.com) on June 8, 2022 8:07 am wrote:
> > Register-register MOV instructions in some, but not in all, recent x86 designs are executed early (are
> > executed after µop dispatch stage and before µop scheduling stage), so the concrete sequence "MOV reg2
> > := reg1; ADD reg2 += [mem]" should have the same effective latency as the hypothetical instruction "ADD
> > reg2 := reg1 + [mem]" if the two concrete instructions are dispatched in the same clock cycle.
> >
> > As far as I know, there is no evidence which would be suggesting that
> > there exists an "ADD reg2 := reg1 + [mem]" µop in AMD/Intel CPUs.
>
> None of the above matter, the question here is about code density, not about execution.
Just some notes:
- The µop "ADD reg2 := reg1 + [mem]" would improve density in this case.
- If an instruction set is 37 years old and has stayed binary-compatible for all of those 37 years, then the ISA described in the CPU manuals is only an approximation of the CPU's internal ISA.
- The density of x86 assembly code (both 32-bit and 64-bit) is at least 50% lower than what is achievable. If you are interested in very high code density, x86 isn't a good example.
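
To put a rough number on the density point for the concrete sequence, here is a minimal sketch that hand-assembles "MOV reg2 := reg1; ADD reg2 += [mem]" and counts the bytes. The register choice (rax, rbx, rcx), the 64-bit operand size, and the helper name modrm are my own assumptions for illustration; other registers or addressing modes change the byte count.

```python
def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the x86 ModRM byte: mod (2 bits), reg (3 bits), rm (3 bits)."""
    return (mod << 6) | (reg << 3) | rm

# Standard x86 register numbers; the choice of registers is illustrative.
RAX, RCX, RBX = 0, 1, 3
REX_W = 0x48  # REX prefix selecting 64-bit operand size

# mov rax, rbx   -> REX.W 89 /r (MOV r/m64, r64), register-direct (mod=3)
mov_reg_reg = bytes([REX_W, 0x89, modrm(3, RBX, RAX)])

# add rax, [rcx] -> REX.W 03 /r (ADD r64, r/m64), register-indirect (mod=0)
add_reg_mem = bytes([REX_W, 0x03, modrm(0, RAX, RCX)])

sequence = mov_reg_reg + add_reg_mem
print(sequence.hex(" "), "->", len(sequence), "bytes")
# prints: 48 89 d8 48 03 01 -> 6 bytes
```

So the two-instruction form costs 6 bytes with 64-bit operands (4 bytes with 32-bit operands, since the REX.W prefixes drop out). How many bytes a single "ADD reg2 := reg1 + [mem]" instruction would take depends entirely on its hypothetical encoding, so I am not putting a number on that.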
-atom