Article: AMD's Mobile Strategy
By: David Kanter (dkanter.delete@this.realworldtech.com), December 16, 2011 4:19 am
Room: Moderated Discussions
Wilco (Wilco.Dijkstra@ntlworld.com) on 12/16/11 wrote:
---------------------------
>Linus Torvalds (torvalds@linux-foundation.org) on 12/15/11 wrote:
>---------------------------
>>Wilco (Wilco.Dijkstra@ntlworld.com) on 12/15/11 wrote:
>>>
>>>I disagree. The impact on the high-end is smaller nowadays, even though it remains
>>>non-trivial. Nobody would claim that 4-way x86 decode is easy! It has taken a very
>>>long time for x86 to get there, when 3rd generation OoO ARM is already going to be 4-way.
>>
>>That's a total red herring.
>>
>>x86 instructions do more.
>
>No, on average they do less.
Do you have any evidence for this?
I'm extremely skeptical of any claims that ARM is a semantically richer ISA than x86. The load-op aspect alone nearly guarantees that x86 will require fewer dynamic instructions.
>Nobody cares what the most complex instructions
>can do if they never get used by the compiler. And x86 instructions are just not
>a good fit for the code that most people write.
REP MOVS and most load-op instructions are used by the compiler.
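To make the load-op point concrete, here is a minimal sketch in C. The function name and the register choices in the comments are hypothetical, and real compiler output will differ with flags, unrolling and vectorization; the only claim is that x86 can fold the load into the add, while 32-bit ARM needs a separate load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum n 32-bit words (hypothetical example function). */
int32_t sum_words(const int32_t *a, size_t n)
{
    assert(a != NULL || n == 0);
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* x86-64 can express the loop body as one load-op, e.g.
         *     add eax, dword ptr [rdi + rcx*4]
         * 32-bit ARM has no ALU operation with a memory source,
         * so the same work takes a load followed by an add, e.g.
         *     ldr r3, [r0, r2, lsl #2]
         *     add r1, r1, r3
         * Register names here are illustrative only. */
        acc += a[i];
    }
    return acc;
}
```

Counting dynamic instructions for loops like this across a real benchmark suite is the kind of data that would settle the question either way.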
>>Doing a two-way x86 decode is not rocket science, and has
>>been done for a long time. And it's likely not all that
>>different from four-way ARM that is not even done yet.
>>
>>Or look at the old-style Intel three-way instruction decoder
>>(3-1-1) that could decode three simple instructions.
>>
>>No, generic 4-way x86 decode isn't simple, but it's a hell
>>of a lot more than 4 ARM instructions. So your comparison
>>simply makes no sense!
>
>Correction: a lot more complex.
>>Those x86 addressing modes are powerful and used. And they
>>regularly replace two or more ARM instructions. Just
>>do the math: ARM code isn't actually all that much denser
>>even in Thumb, yet x86 instructions are rather longer on
>>average.
>
>Show an example where an x86 addressing mode replaces 2 or more ARM instructions.
>>You can think of it this way: all those embedded constants
>>and addressing modes are all just "simple instructions".
>>On ARM, they are explicit instructions, on x86 they are
>>"microinstructions" embedded in a "macroinstruction".
>>
>>So don't compare one ARM instruction to one x86 instruction.
>>They are very different. An ARM instruction is closer to
>>the old-style uops (and by "old-style" I mean the ones that
>>Intel used to produce that didn't have read-modify-write
>>versions: the uops in Core 2+ are rather closer to the
>>real x86 instructions).
>
>That's not true at all. How many x86 micro ops do you need to execute this single ARM instruction:
>
>addeq r0,r1,r2,lsl #2
>
>>And that's not even taking things like constants into
>>account. Something that will only get worse for
>>ARM as it starts going 64-bit.
>
>Really? Immediates are not really a problem on ARM, and 64-bit doesn't make it
>any worse. In fact, without even having seen the ISA or the ABI, I expect the way
>global variables and constants are dealt with is much improved over how it works today.
ADDEQ won't exist in 64-bit ARM AFAICT.
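For readers who don't read ARM assembly, the ADDEQ instruction quoted above packs a condition, a shift and an add into one instruction. A rough model of its semantics in C, where the function names are made up for illustration and the Z flag is passed in explicitly as a parameter:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of: addeq r0, r1, r2, lsl #2
 * z stands for the Z (zero) condition flag; the EQ condition means
 * the instruction only takes effect when Z is set. Returns the new
 * value of r0. */
uint32_t addeq_lsl2(bool z, uint32_t r0, uint32_t r1, uint32_t r2)
{
    if (!z)
        return r0;            /* condition failed: r0 is unchanged */
    return r1 + (r2 << 2);    /* r0 = r1 + r2 * 4 */
}

/* Small usage check for the model above. */
void addeq_demo(void)
{
    assert(addeq_lsl2(true, 7, 100, 3) == 112);  /* 100 + 3*4 */
    assert(addeq_lsl2(false, 7, 100, 3) == 7);   /* not executed */
}
```

This only models the architectural result; it says nothing about how many micro-ops any particular implementation would use.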
>>So comparing 4-way x86 to 4-way ARM is ridiculous.
>
>Indeed, ARM instructions are so much more powerful that you need 64-way x86 decode to keep up with 4-way ARM...
>
>Seriously, I'd expected better from you.
I am incredibly skeptical of such claims without real data to back them up.
David