Article: AMD's Mobile Strategy
By: Wilco (Wilco.Dijkstra.delete@this.ntlworld.com), December 16, 2011 12:27 am
Room: Moderated Discussions
anon (anon@anon.com) on 12/15/11 wrote:
---------------------------
>No. A 4-wide decode of a "nice" ISA is going to be far easier than a 4-wide decode
>of x86, regardless of the exact expressiveness of the instructions being decoded.
>I think that was the point of Wilco's remark about decoding, rather than somehow
>being a statement that x86 instruction is exactly identical to ARM instruction that people seemed to have taken it as.
>
>(He did go on to talk about efficiency of ARM etc, but that did not seem to be
>a conclusion he attempted to draw from exactly the fact that 4 wide decode is equivalent,
>just that x86 has high decode overhead, which it does).
Well, I didn't mention anything about the expressiveness of the instructions, as it hardly matters in reality. You can execute simple instructions in a single cycle or complex ones over multiple cycles; overall it's a wash, and more a code size issue than anything else. But now that the cat is out of the bag: the instructions actually executed happen to match fairly well in terms of the amount of work done on average. Obviously this is partly due to compilers using mostly simple x86 instructions and partly due to ARM having quite powerful instructions. The fact that the semantics on a per-instruction basis are wildly different is irrelevant; it's the average that is interesting.
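As a rough illustration of the decode point quoted above, here is a toy sketch of why finding four instruction boundaries is easier with fixed-length encodings. The length rule in variable_starts (low two bits of the first byte, plus one) is invented for the example and is not real x86 or ARM encoding; both function names are hypothetical as well.

```python
def fixed_starts(code_len, width=4, insn_size=4):
    """Start offsets for a fixed-length ISA.

    Each offset is just i * insn_size, so all `width` decoders can be
    pointed at their instruction without inspecting any bytes.
    """
    return [i * insn_size for i in range(width) if i * insn_size < code_len]


def variable_starts(code, width=4):
    """Start offsets for a toy variable-length ISA.

    Assumed (made-up) encoding: the low 2 bits of an instruction's first
    byte, plus one, give its length in bytes (1..4). The start of
    instruction n depends on the lengths of instructions 0..n-1, so the
    boundaries have to be found serially (or speculatively in hardware).
    """
    starts = []
    pos = 0
    while len(starts) < width and pos < len(code):
        starts.append(pos)
        pos += (code[pos] & 0x3) + 1  # length is only known after looking at the byte
    return starts


if __name__ == "__main__":
    print(fixed_starts(16))
    print(variable_starts(bytes([0x02, 0, 0, 0x00, 0x01, 0, 0x03, 0, 0, 0])))
```

The sketch only models boundary finding, not what each instruction does once decoded, which is the separate question of work done per instruction.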
Wilco