Article: AMD's Mobile Strategy
By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), December 18, 2011 4:21 pm
Room: Moderated Discussions
Michael S (already5chosen@yahoo.com) on 12/18/11 wrote:
>
>So I took older manual 24896609.pdf. Don't remember the exact date, but likely 2003 Q2 or Q3.
>
>The only suggestion that remotely resembles what you're saying is this one: "Avoid
>complex instructions that require more than 4 uops". Obviously, it has nothing to
>do with our discussion, because the complex instructions we are talking about consist
>of either 2 uops (load-op) or 4 uops (read-modify-write).
Hmm.. You're right, I can't find it. But it was pretty
generic knowledge at the time that you should avoid the
"load-op" instructions and complex addressing modes. I
know I'm not the only one who has that memory (see just
this thread).
But maybe this is one of those anecdotal stories that just
grew in the telling.
Or maybe it's from the horrible P4 "load after store"
behavior that just made people really worried about the
memop instructions. On many other x86 uarchs it's
reasonably ok to keep things in memory and then operate on
it with r-m-w cycles repeatedly: on the P4 there are some
nasty load-after-store cases that can trigger
delays of tens of cycles if you load from a location that
was stored to a few cycles earlier.
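
To make the pattern concrete, here is a minimal C sketch of the two styles being
compared. The function names are made up for illustration, and "volatile" is only
there to force the accumulator to stay in memory. Whether the compiler emits a
single r-m-w instruction or a separate load/add/store sequence is up to the
compiler, and this sketch makes no claim about actual timings on any given uarch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: the accumulator is kept in memory. "volatile"
 * forces a load, an add and a store on every iteration, so each load
 * reads a location that was stored to one iteration earlier. This is
 * the load-after-store pattern described above. */
uint64_t sum_in_memory(const uint32_t *v, size_t n)
{
    volatile uint64_t acc = 0;

    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Same computation, but the accumulator is an ordinary local, so the
 * compiler is free to keep it in a register and there is no
 * store-then-load dependency through memory inside the loop. */
uint64_t sum_in_register(const uint32_t *v, size_t n)
{
    uint64_t acc = 0;

    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}
```

Both functions return the same sum; the only difference is where the running
total lives between iterations, which is exactly the thing people were worried
about.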
Linus