By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), April 21, 2012 1:11 pm
Room: Moderated Discussions
Megol (golem960@gmail.com) on 4/21/12 wrote:
>
>Intel processors are generally more sensitive to
>optimizations than AMD ones*.
I do not believe that is true any more.
It was certainly true in the old Opteron days, where
P4 was obviously very fragile indeed, and even the P6-like
cores tended to have rather clear decode limitations.
But those days are long long gone.
Intel decoders are still less symmetric than the AMD ones,
but since the uops have become more powerful, that is much
less noticeable. And Intel uarchs appear to be much
less fragile when it comes to pretty much everything else,
particularly the memory pipeline.
What particular sensitivity did you have in mind? Because
I'm very impressed with the current Intel crop of chips,
with the obvious exception of Atom, which was fragile as
hell (I say "was", because I haven't ever seen the new
Atoms, and they reportedly fix some of the stupidest forms
of it, but obviously are still in-order etc).
I used to absolutely detest the P4, because you could see
the horrible micro-faults and nasty pipeline serialization
so incredibly clearly. You could see it in benchmarks, but
it was really obvious in profiles. Doing the same
thing on Core 2 and later, profiles usually
"make sense" (i.e. "Oh, cache miss" or "Oops, badly predicted
branch" or "Damn, I really wish they fixed the string
instructions").
I do agree that compilers obviously tend to favor the
setup most common among developers, and Intel does get
preferential treatment for that reason. But the compiler
rules for modern Intel CPUs aren't crazy.
Linus