By: Patrick Chase (patrickjchase.delete@this.gmail.com), February 2, 2013 2:01 pm
Room: Moderated Discussions
Patrick Chase (patrickjchase.delete@this.gmail.com) on February 2, 2013 1:13 pm wrote:
> none (none.delete@this.none.com) on February 2, 2013 10:43 am wrote:
> > That 24 entries figure is wrong and anyway A9 has no ROB as found in other OoO CPU :)
>
> ARM describes the A9 as doing "out of order issue". The TRM further describes it as using register
> renaming to resolve WAW/WAR hazards without stalling, which implies a Tomasulo machine or similar.
> The ARM ISA requires precise exceptions. I've developed OS code including fault handlers and
> context-switching for A9, so I *know* it implements precise exceptions.
>
> I'm not aware of a means of implementing out-of-order issue with non-stalling resolution of WAR/WAW
> and precise exceptions without some structure equivalent to an ROB in function if not in name.
> I'm aware of the ARM slideset that says that A9 does OoO "without a power-hungry ROB" but I suspect
> that's misworded and simply means that they used a physical register file to avoid Tomasulo's
> reservation stations and common results bus (just like Sandy/Ivy Bridge and many other recent
> OoO microarchitectures). The ROB itself isn't particularly power-hungry.
>
> Can you explain how A9 achieves out-of-order issue with renaming
> and precise exceptions without an ROB or equivalent?
Sorry to follow up my own post again, but...
I did some digging and found multiple sources that refer to a 24-entry "data-less ROB" in A9. Unfortunately, ARM has pulled down the original documents those sources cite (particularly the DevCon 2007 A9 architecture slides).
That tends to reinforce what I hypothesized above: it uses a PRF instead of reservation stations and a common results bus, so the ROB only needs to track instruction order and state (speculative or not, written back to the PRF or not) rather than instruction results. Hence "data-less", just like Sandy Bridge, Bobcat, and a whole lot of other modern OoO microarchitectures :-). It's still an ROB, though.
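To make the "data-less" idea concrete, here is a toy Python sketch of a PRF-based machine whose ROB holds only ordering and status. This is purely illustrative and not ARM's actual design: the class and field names, the 16 architectural registers, the free-list sizing, and the rename-map rollback on a fault are all my assumptions. Only the 24-entry figure comes from the sources mentioned above.

```python
from collections import deque
from dataclasses import dataclass

ROB_ENTRIES = 24                      # figure quoted above; the rest is assumed
NUM_ARCH = 16                         # assumed architectural register count
NUM_PHYS = NUM_ARCH + ROB_ENTRIES     # assumed PRF size

@dataclass
class RobEntry:
    arch_dst: int        # architectural destination register
    new_phys: int        # physical register allocated at rename
    old_phys: int        # previous mapping: freed at commit, restored on flush
    done: bool = False   # has the result been written back to the PRF?
    fault: bool = False  # did the instruction raise an exception?
    # Note: no result value is stored here. That is what "data-less" means.

class Core:
    def __init__(self):
        self.prf = [0] * NUM_PHYS                 # results live only here
        self.rename = list(range(NUM_ARCH))       # arch -> phys map
        self.free = deque(range(NUM_ARCH, NUM_PHYS))
        self.rob = deque()                        # program order, oldest first

    def dispatch(self, arch_dst):
        """Rename the destination and allocate a ROB entry (None = stall)."""
        if len(self.rob) == ROB_ENTRIES or not self.free:
            return None
        new = self.free.popleft()
        entry = RobEntry(arch_dst, new, self.rename[arch_dst])
        self.rename[arch_dst] = new               # WAW/WAR resolved by renaming
        self.rob.append(entry)
        return entry

    def writeback(self, entry, value, fault=False):
        """Complete in any order: the value goes to the PRF, not the ROB."""
        self.prf[entry.new_phys] = value
        entry.done = True
        entry.fault = fault

    def commit(self):
        """Retire completed instructions in order; return how many retired."""
        retired = 0
        while self.rob and self.rob[0].done:
            if self.rob[0].fault:
                self.flush()                      # precise exception point
                break
            entry = self.rob.popleft()
            self.free.append(entry.old_phys)      # old value is now dead
            retired += 1
        return retired

    def flush(self):
        """Undo all uncommitted renames, youngest first."""
        while self.rob:
            entry = self.rob.pop()
            self.rename[entry.arch_dst] = entry.old_phys
            self.free.append(entry.new_phys)

    def read(self, arch):
        return self.prf[self.rename[arch]]
```

If you dispatch three instructions (write r1, a faulting write to r2, another write to r1) and complete the third one first, nothing retires until the oldest finishes. When the faulting instruction reaches the head, the flush rolls the rename map back so r1 shows only the first write. That is the precise-exception behavior, and the ROB never held a result value to get there.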
-- Patrick