By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), April 21, 2012 12:26 pm
Room: Moderated Discussions
JJB (jjb@example.com) on 4/21/12 wrote:
>
>Now it makes sense! The AGUs can't affect the flags. That's
>actually a moderately sane separation, since the hazards there are so complex, but
>given the x86's flags-heavy architecture, it severely limits the instructions that can be issued.
That makes no sense. Sure, the flags are special, but they
aren't that special.
Besides, if an architect really thinks that flags access
is a major issue, any sane architect will do the trivial
WAW detection, and simply kill the redundant flags. Because
while it is true that in theory almost every single
x86 operation writes the flags, in practice the
flags register is trivially dead for the vast majority of
those same instructions, and you can see that with very
small instruction windows.
In fact, the instruction window could be small enough that
you can just do it at decode time, and if you decode more
than one instruction per cycle, you notice when an earlier
instruction's eflags write is covered by the later ones.
With three or four decoders, I bet that you'd get rid of
half the eflags writes that way.
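
To make the decode-time idea concrete, here is a minimal sketch of the WAW check over one decode group. It is a toy model, not a description of any real decoder: the `Insn` class, the six-letter flag set, and the `dead_flag_writes` helper are all made up for illustration. A flags write counts as dead only if every bit it writes is overwritten by a later instruction in the same group before anything in the group reads it.

```python
from dataclasses import dataclass

# Hypothetical model: the six arithmetic status flags, one letter each
# (carry, parity, adjust, zero, sign, overflow).
ALL_FLAGS = frozenset("CPAZSO")


@dataclass(frozen=True)
class Insn:
    name: str
    writes: frozenset = frozenset()  # flag bits this instruction writes
    reads: frozenset = frozenset()   # flag bits this instruction reads


def dead_flag_writes(group):
    """Return indices of instructions whose flag writes are provably dead
    using only the instructions in the same decode group."""
    dead = []
    for i, insn in enumerate(group):
        if not insn.writes:
            continue
        pending = set(insn.writes)      # bits not yet overwritten
        for later in group[i + 1:]:
            if pending & later.reads:
                break                   # a later instruction consumes a bit
            pending -= later.writes
            if not pending:
                dead.append(i)          # all bits overwritten before any read
                break
    return dead


# Four instructions decoded together (an assumed example sequence).
group = [
    Insn("add", writes=ALL_FLAGS),
    Insn("sub", writes=ALL_FLAGS),
    Insn("cmp", writes=ALL_FLAGS),
    Insn("jz", reads=frozenset("Z")),
]
killed = dead_flag_writes(group)
```

In this example the flag writes of `add` and `sub` are killed, and only the `cmp` write survives because `jz` reads it. A write whose fate is not settled inside the group is conservatively kept.
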
If you do it with a bigger instruction window (and basically
do it as a register renaming thing), you get rid of almost
all of them. In fact, that's one of the main reasons to
avoid the "inc/dec" instructions, because they don't write
the full set of eflags, so you need to rename things
one bit at a time.
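
The inc/dec point can be illustrated with a similarly small sketch. It tracks, per flag bit, which instruction last produced that bit, which is the toy equivalent of renaming each bit separately. The `flag_sources` helper and the tuple encoding of instructions are hypothetical, and real designs may group the bits differently; the only architectural facts assumed are that inc leaves the carry flag untouched and that jbe reads both carry and zero.

```python
FLAGS = "CPAZSO"  # carry, parity, adjust, zero, sign, overflow


def flag_sources(program):
    """program: list of (name, flags_written, flags_read) tuples.
    Returns, for each instruction that reads flags, the sorted indices of
    the instructions that produced the bits it reads."""
    producer = {f: -1 for f in FLAGS}   # -1 = state from before the sequence
    sources = []
    for idx, (name, writes, reads) in enumerate(program):
        if reads:
            sources.append((name, sorted({producer[f] for f in reads})))
        for f in writes:
            producer[f] = idx           # per-bit rename table update
    return sources


# inc writes everything except the carry flag.
with_inc = [
    ("add", "CPAZSO", ""),
    ("inc", "PAZSO", ""),
    ("jbe", "", "CZ"),
]
# Same shape, but the middle instruction writes the full flag set.
with_add = [
    ("add", "CPAZSO", ""),
    ("add", "CPAZSO", ""),
    ("jbe", "", "CZ"),
]

inc_deps = flag_sources(with_inc)
add_deps = flag_sources(with_add)
```

With the inc in the middle, the branch depends on two different producers (carry from the first add, zero from the inc), so a single rename tag for the whole flags register is not enough. With a full-width writer in the middle, the branch depends on exactly one producer.
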
Sure, dropping eflags writes does affect your instruction
completion model, and that can be very complicated
depending on how you handle that whole thing. But if eflags
is a major bottleneck for your uarch, somebody did something
wrong, I think.
Linus