By: Paul A. Clayton (paaronclayton.delete@this.gmail.com), November 15, 2012 7:15 am
Room: Moderated Discussions
Felid (Felid.delete@this.mailinator.com) on November 15, 2012 12:49 am wrote:
[snip]
> It doesn't make sense. There can be many reads of mov's destination, so every one of these
> µops should get its source register replaced with (a link to) the original. This can't be done
> with fusion (2 instructions -> 1 µop), but applies perfectly to renaming logic.
If the fused operations are adjacent, there can be no additional uses of the mov's destination, given typical destructive x86 instructions (source and destination the same): the second operation overwrites the copy before anything else can read it. This, of course, means that preserving a register value by moving it to another location that is used much later would not allow this optimization, but that practice has been suboptimal for a while, since one generally wants to exploit result forwarding.
Even with move elimination in the renamer (which allows more cases to be handled), doing limited move elimination in the decoder can be beneficial (especially if one has a µop cache).
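To make the adjacent case concrete, here is a rough sketch in Python of what such a decoder-level fusion might look like. Everything in it is an assumption for illustration: the `Insn` and `Uop` types, the `decode` function, and the simplified two-operand instruction model (no flags, memory operands, or partial registers) are hypothetical and do not describe any actual decoder. The idea is only that a mov followed immediately by a destructive operation on the mov's destination can become a single non-destructive three-operand µop.

```python
from typing import List, NamedTuple, Optional


class Insn(NamedTuple):
    """Hypothetical two-operand x86-style instruction: dst = dst OP src."""
    op: str
    dst: str
    src: str


class Uop(NamedTuple):
    """Hypothetical internal three-operand µop: dst = src1 OP src2."""
    op: str
    dst: str
    src1: str
    src2: Optional[str] = None


def decode(insns: List[Insn]) -> List[Uop]:
    """Decode instructions, fusing an adjacent mov + destructive op pair."""
    uops = []
    i = 0
    while i < len(insns):
        cur = insns[i]
        nxt = insns[i + 1] if i + 1 < len(insns) else None
        fusible = (
            cur.op == "mov"
            and nxt is not None
            and nxt.op != "mov"
            and nxt.dst == cur.dst   # next op overwrites the copy
            and nxt.src != cur.dst   # kept simple: skip self-referencing ops
        )
        if fusible:
            # The copy is dead once nxt writes dst, so read the mov's
            # source directly and drop the mov.
            uops.append(Uop(nxt.op, nxt.dst, cur.src, nxt.src))
            i += 2
        elif cur.op == "mov":
            uops.append(Uop("mov", cur.dst, cur.src))
            i += 1
        else:
            # Ordinary destructive op: dst is also the first source.
            uops.append(Uop(cur.op, cur.dst, cur.dst, cur.src))
            i += 1
    return uops
```

With this sketch, `mov rbx, rax; add rbx, rcx` decodes to one µop computing rbx = rax + rcx, and any later read of rbx sees the add's result, so nothing else needs its source relinked. If another instruction sits between the mov and the add, the pair is left alone, which is where move elimination in the renamer would still be needed.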



