Article: AMD's Mobile Strategy
By: Exophase (exophase.delete@this.gmail.com), December 21, 2011 9:28 am
Room: Moderated Discussions
Seni (seniike@hotmail.com) on 12/21/11 wrote:
---------------------------
>The most interesting addressing modes are a 12-bit displacement mode
>Load dest = [base + imm12]
>and "register offset" which is
>Load dest = [base + index << scale]
>
Those 12-bit offsets are scaled by the access size though, which makes a big difference since the majority of memory accesses are going to be 32-bit or 64-bit (at least in the profiling I've done in the past).
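To put a number on what the scaling buys, here is a rough sketch. It assumes the 12-bit field is unsigned and multiplied by the access size in bytes, as in the A64 unsigned-offset load/store form; the function names are made up for illustration.

```python
def scaled_imm12_reach(access_bytes):
    """Largest byte offset reachable by an unsigned 12-bit field scaled by the access size."""
    return 4095 * access_bytes


def encodable_scaled_imm12(offset, access_bytes):
    """True if a byte offset fits the scaled, unsigned 12-bit form (assumed encoding)."""
    if offset < 0 or offset % access_bytes != 0:
        return False  # negative or misaligned offsets need a different form
    return offset // access_bytes <= 4095


for size in (1, 2, 4, 8):
    print(size, scaled_imm12_reach(size))
```

Under those assumptions a 64-bit access reaches 8x further than a byte access, at the cost of only being able to encode offsets that are multiples of the access size.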
>Compare to x86's SIB mode
>Load dest = [base + index << scale + imm64]
>
>The ARM64 version has more choices of scale.
>
Generally only imm32 (or imm8) is available. The only x86-64 instructions that take a 64-bit displacement are absolute loads to, or stores from, al/ax/eax/rax.
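As a quick sketch of that limit: the helper below picks the smallest displacement field a ModRM/SIB memory operand could use, assuming the displacement is treated as a sign-extended 8-bit or 32-bit value. The function names are hypothetical, and it ignores the accumulator-only absolute forms mentioned above.

```python
def fits_signed(value, bits):
    """True if value is representable as a two's-complement integer of the given width."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return lo <= value <= hi


def x86_disp_size(disp):
    """Smallest displacement width (8 or 32) that holds disp, or None if neither does."""
    if fits_signed(disp, 8):
        return 8
    if fits_signed(disp, 32):
        return 32
    return None  # would have to be materialized in a register first


print(x86_disp_size(127), x86_disp_size(128), x86_disp_size(1 << 31))
```

Anything that comes back as None cannot be folded into the addressing mode, so a full 64-bit constant costs at least one extra instruction on x86-64 as well.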
>The x86 version combines not only the AGU op and Load, but also up to 1 ALU op,
>and the loading and adding in of a full-length immediate.
I actually think that store immediate is one of the more useful instructions that x86 has over ARM.
>
>So for example, the x86-64 instruction
>ADD RAX, [RBX + RSI + imm64]
>
>would expand to a six instruction ARM64 counterpart something like this:
>MOVZ X1, imm16
>MOVK X1, imm16, 16
>MOVK X1, imm16, 32
>MOVK X1, imm16, 48
>LDR X1, [X1, X3]
>ADD X2, X1, X2
Well yeah, if this x86 instruction existed.
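For reference, the MOVZ/MOVK part of that sequence is just assembling a 64-bit constant 16 bits at a time. A small emulation, with helper names that are mine rather than anything from the ARM documentation:

```python
MASK64 = (1 << 64) - 1


def split_imm64(value):
    """Split a 64-bit value into four 16-bit chunks, lowest chunk first."""
    return [(value >> shift) & 0xFFFF for shift in (0, 16, 32, 48)]


def movz(imm16, shift=0):
    """Emulate MOVZ: place imm16 at the given shift and zero every other bit."""
    return (imm16 << shift) & MASK64


def movk(reg, imm16, shift):
    """Emulate MOVK: replace one 16-bit field of reg and keep the rest."""
    cleared = reg & ~(0xFFFF << shift) & MASK64
    return cleared | (imm16 << shift)


def build_imm64(value):
    """Build a 64-bit constant with one MOVZ followed by three MOVKs."""
    chunks = split_imm64(value & MASK64)
    reg = movz(chunks[0], 0)
    for i, chunk in enumerate(chunks[1:], start=1):
        reg = movk(reg, chunk, 16 * i)
    return reg


print(hex(build_imm64(0x123456789ABCDEF0)))
```

This always emits four moves. A real assembler or compiler would likely skip chunks that are already zero, so the four-instruction case is the worst case rather than the typical one.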
>Also there's this:
>"The LDM, STM, PUSH and POP instructions do not exist in A64"
>
>So, it looks like ARM64 is RISCier than ARM32, and doesn't have much in the way of big multi-op instructions.
LDM/STM were the only big multi-op instructions. ARM64 removes them but adds load/store pair instead, which is a decent compromise for saving instructions on register save/restore. This is also consistent with ARM's last few uarch decisions, where ldm/stm had 2x the peak bandwidth to L1 compared to ldr/str. They probably still want to provide for this sort of direct utilization.
Please note that the push and pop instructions here only refer to those accessing multiple registers. Pre/post-increment with writeback is still offered with loads/stores, which is all single-register push/pop was ever an alias for (ARM64 allows SP to be used as the base for load/store). FWIW, my disassembler changes my str reg, [sp, #-4]! to "push."
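To spell out the alias, here is a toy model of a single-register push and pop built from nothing but a store with pre-decrement writeback and a load with post-increment writeback. It assumes the usual full-descending stack convention; the class and method names are invented for the example.

```python
class Stack:
    """Toy model of a descending stack addressed through SP."""

    def __init__(self, sp=0x1000):
        self.sp = sp
        self.mem = {}  # address -> stored value

    def str_pre_dec(self, value, size=4):
        """Like str value, [sp, #-size]!  (update sp first, then store)."""
        self.sp -= size
        self.mem[self.sp] = value

    def ldr_post_inc(self, size=4):
        """Like ldr value, [sp], #size  (load first, then update sp)."""
        value = self.mem[self.sp]
        self.sp += size
        return value


s = Stack()
s.str_pre_dec(11)  # "push"
s.str_pre_dec(22)  # "push"
print(s.ldr_post_inc(), s.ldr_post_inc(), hex(s.sp))  # two "pops"
```

Nothing here needs a dedicated push or pop opcode, which is the point: the writeback addressing modes already cover the single-register case.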