Article: AMD's Mobile Strategy
By: Exophase (exophase.delete@this.gmail.com), December 22, 2011 8:56 am
Room: Moderated Discussions
Wilco (Wilco.Dijkstra@ntlworld.com) on 12/22/11 wrote:
---------------------------
>No I'm talking about instruction counts. Did you take cbz, tbb, ldrd/strd or the
>improved addressing of Thumb-2 into account?
>
>Conditional execution doesn't help as much on the latest micro architectures due
>to improved branch prediction. So it is used more sparingly nowadays (it is switched
>off in the compiler I'm currently working on). It does still help a lot, on the
>A9 a single conditional move speeds up a Spec benchmark by a few percent.
>
>Wilco
I find the claim that Thumb-2 binaries use far fewer instructions than ARM ones very difficult to believe. ARM itself has published numbers showing Thumb-2 compiled code to be very slightly slower than ARM compiled code. I know instruction count and speed don't correlate exactly, but if the instruction count really were much lower I'd expect at least some of that to show up. Do you have any data on this?
cbz/cbnz are difficult to utilize because they have such a tiny displacement (forward only, at most 126 bytes), plus they're restricted to the lower 8 registers. tbb/tbh do the same thing as ldr pc from a table if you can use absolute addresses; they just use a smaller jump table. ldrd/strd are available in ARM as well.
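To put rough numbers on the two points above, here is a minimal Python sketch. It assumes the usual Thumb encoding rules: cbz/cbnz take a low register (r0-r7) and an even forward offset of 0 to 126 bytes measured from the instruction address plus 4, and a jump table costs 1 byte per case with tbb, 2 with tbh, and 4 with a table of absolute addresses for ldr pc. The function names `cbz_can_reach` and `jump_table_bytes` are made up for illustration and are not taken from any toolchain.

```python
def cbz_can_reach(reg, insn_addr, target_addr):
    """Return True if 'cbz r<reg>, target' is encodable (illustrative helper)."""
    # In Thumb state the PC reads as the instruction address + 4.
    offset = target_addr - (insn_addr + 4)
    low_register = 0 <= reg <= 7
    # 6-bit immediate scaled by 2: forward only, even, 0..126 bytes.
    in_range = 0 <= offset <= 126 and offset % 2 == 0
    return low_register and in_range


def jump_table_bytes(num_cases, kind):
    """Size of the table only, not the dispatch code (illustrative helper)."""
    entry_size = {"tbb": 1, "tbh": 2, "ldr_pc": 4}[kind]
    return num_cases * entry_size


if __name__ == "__main__":
    print(cbz_can_reach(0, 0x1000, 0x1082))   # 126 bytes forward
    print(cbz_can_reach(0, 0x1000, 0x1084))   # 128 bytes forward
    print(cbz_can_reach(0, 0x1000, 0x0FF0))   # backward branch
    print(cbz_can_reach(8, 0x1000, 0x1010))   # high register
    for kind in ("tbb", "tbh", "ldr_pc"):
        print(kind, jump_table_bytes(10, kind))
```

The table saving is real but only affects data size; the number of instructions executed for the dispatch is not what changes here.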
By improved addressing I take it you mean the flat 12-bit immediate for add/sub and the different, more mask-based immediate format for everything else. I do think this is an improvement overall, but it's hard to say how much of one, and how it weighs against what you lose.
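As a rough way to see what each immediate format gains and loses, the sketch below checks whether a 32-bit constant is encodable as a data-processing immediate in each instruction set. It assumes the standard rules: ARM takes an 8-bit value rotated right by an even amount, while Thumb-2 takes either a replicated byte pattern (0x000000XY, 0x00XY00XY, 0xXY00XY00, 0xXYXYXYXY) or an 8-bit value with its top bit set rotated right by 8 to 31. The function names are hypothetical, and the separate 12-bit add/sub form (0 to 4095) is not modelled.

```python
MASK32 = 0xFFFFFFFF


def rol32(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32


def arm_imm_encodable(value):
    """ARM: 8-bit value rotated right by an even amount (0, 2, ..., 30)."""
    value &= MASK32
    return any(rol32(value, rot) <= 0xFF for rot in range(0, 32, 2))


def thumb2_imm_encodable(value):
    """Thumb-2 modified immediate: byte patterns or a rotated 8-bit value."""
    value &= MASK32
    if value <= 0xFF:                          # 0x000000XY
        return True
    low = value & 0xFF
    high = (value >> 8) & 0xFF
    if value == low * 0x00010001:              # 0x00XY00XY
        return True
    if value == high * 0x01000100:             # 0xXY00XY00
        return True
    if value == low * 0x01010101:              # 0xXYXYXYXY
        return True
    # 8-bit value with the top bit set, rotated right by 8..31 (no wraparound).
    return any(0x80 <= rol32(value, rot) <= 0xFF for rot in range(8, 32))


if __name__ == "__main__":
    for constant in (0xFF000000, 0x00FF00FF, 0x000001FE, 0xF000000F):
        print(hex(constant),
              "ARM" if arm_imm_encodable(constant) else "-",
              "Thumb-2" if thumb2_imm_encodable(constant) else "-")
```

Under these rules Thumb-2 picks up the replicated patterns and odd shift positions, and loses constants that wrap around the word such as 0xF000000F, which is the kind of trade-off that makes the net effect hard to call.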
And you can't reassemble ARM into Thumb-2 unless you're using UAL, which is deliberately a subset of both. The assembler has no transparent way to deal with inline shifts by register, or with immediate forms that some instructions don't support. Believe me, I was bitten by this when I tried retargeting my code.