By: Patrick Chase (patrickjchase.delete@this.gmail.com), February 2, 2013 1:13 pm
Room: Moderated Discussions
none (none.delete@this.none.com) on February 2, 2013 10:43 am wrote:
> Patrick Chase (patrickjchase.delete@this.gmail.com) on February 2, 2013 10:11 am wrote:
> > The A9s ROB is 24 entries, which means that its speculation/reorder window is *very* limited. The
> > ROB_size/issue_rate ratio (basically a metric of how many clocks of instructions the ROB can hold
> > at full issue rate) is only 12, which isn't even enough to "cover" an L1 miss that hits L2.
>
> That 24 entries figure is wrong and anyway A9 has no ROB as found in other OoO CPU :)
ARM describes the A9 as doing "out of order issue". The TRM further describes it as using register renaming to resolve WAW/WAR hazards without stalling, which implies a Tomasulo machine or similar. The ARM ISA requires precise exceptions. I've developed OS code for the A9, including fault handlers and context switching, so I *know* it implements precise exceptions.
I'm not aware of a means of implementing out-of-order issue with non-stalling resolution of WAR/WAW hazards and precise exceptions without some structure equivalent to an ROB in function, if not in name. I'm aware of the ARM slideset that says the A9 does OoO "without a power-hungry ROB", but I suspect that's misworded. It probably just means that they used a physical register file to avoid Tomasulo's reservation stations and common data bus (just like Sandy/Ivy Bridge and many other recent OoO microarchitectures). The ROB itself isn't particularly power-hungry.
Can you explain how A9 achieves out-of-order issue with renaming and precise exceptions without an ROB or equivalent?
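To make "equivalent to an ROB in function" concrete, here is a minimal Python sketch of the bookkeeping I have in mind. It is a toy model under my own assumptions, not a description of how the A9 actually does it: the `ReorderBuffer` class, its method names, and the 4-entry size are all made up for illustration. The point is only that results may complete in any order, but they commit to architectural state strictly in program order, so a fault leaves exactly the older instructions committed.

```python
from collections import deque


class ReorderBuffer:
    """Toy in-order-retire buffer (hypothetical model, not the A9's design)."""

    def __init__(self, size):
        self.size = size
        self.entries = deque()  # oldest instruction at the left
        self.arch_regs = {}     # committed (architectural) register state

    def allocate(self, tag, dest):
        """Reserve an entry in program order; False means the window is full."""
        if len(self.entries) >= self.size:
            return False
        self.entries.append(
            {"tag": tag, "dest": dest, "value": None, "done": False, "fault": False}
        )
        return True

    def complete(self, tag, value=None, fault=False):
        """Record a result (or a fault) for an instruction, in any order."""
        for entry in self.entries:
            if entry["tag"] == tag:
                entry["value"] = value
                entry["done"] = True
                entry["fault"] = fault
                return
        raise KeyError(tag)

    def retire(self):
        """Commit finished entries from the head, stopping at the first
        unfinished one. On a fault, squash everything younger and return
        the faulting tag; otherwise return None."""
        while self.entries and self.entries[0]["done"]:
            head = self.entries.popleft()
            if head["fault"]:
                self.entries.clear()  # younger results never become visible
                return head["tag"]
            self.arch_regs[head["dest"]] = head["value"]
        return None


rob = ReorderBuffer(size=4)  # size is an arbitrary illustrative value
rob.allocate("i0", "r1")
rob.allocate("i1", "r2")
rob.allocate("i2", "r3")

rob.complete("i2", value=30)    # youngest finishes first
rob.complete("i1", fault=True)  # middle instruction faults
first = rob.retire()            # i0 still pending, so nothing commits yet

rob.complete("i0", value=10)
faulting = rob.retire()         # i0 commits, i1 faults, i2 is squashed
```

After the last call, `faulting` is `"i1"` and `rob.arch_regs` holds only `r1`. The result of `i2` was computed early but never reached architectural state, which is what a fault handler needs to see. Whatever the A9 calls its structure, it has to track at least this much per in-flight instruction.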
> Anyway what I think matters is that A9 was well balanced both on the core and
> the data side. And the mistake of a non-pipelined FPU was not remade...
Agreed, that was one of the things I was thinking of when I remarked that the A9 had more going for it relative to the A8 than just OoO. Most customers who needed FP put NEON units on their A8s, though that obviously only works if you recompile. VFP in the A8 was indeed quite lame.