By: Wilco (Wilco.Dijkstra.delete@this.ntlworld.com), October 1, 2009 5:09 pm
Room: Moderated Discussions
Mat (m@t.com) on 10/1/09 wrote:
---------------------------
>>>>arm did come up with a few tricks to lower the power
>>>>consumption of oooe, e.g. use a checkpointing mechanism to
>>>>reduce the power required to keep track of which renamed
>>>>registered can be freed.
>>>>
>>>Where does it say that checkpointing is used in the A9?
>>
>>read the patents.
>
>Good point, doing a quick search on renaming yields interesting results, many have
>pipeline diagrams looking like the A9 whitepaper:
>
>http://v3.espacenet.com/searchResults?bookmarkedResults=true&submitted=true&DB=EPODOC&locale=en_gb&sf=a&FIRST=1&CY=gb&LG=en&&TI=&AB=renam*&PN=&AP=&PR=&PD=&PA=ARM&IN=&EC=&IC=&=&=&=&=&=
Interesting, the patents contain quite a lot of info. Some of the key details of the OOOE:
* 56 physical registers (32 architectural, 24 rename, i.e. an instruction window of up to 24 instructions or ~12 cycles, enough to hide an L1 miss)
* integer renaming only, VFP/Neon are in-order
* separate rename and restore tables, allowing single-cycle restore after exceptions, mispredicts or interrupts
* loads/stores/branches and conditionally executed instructions are speculated
* cache/TLB misses, ECC failures, data aborts etc. are only checked after the critical data has been returned from the L1 cache, so load data may be invalid
* instructions are grouped with a previous speculated instruction. The rename mappings for each group are placed in a FIFO
* the restore table is updated with the rename mappings of the oldest group when its speculated instruction completes. This means the restore table contains the mappings of non-speculated instructions (rolling back to this state gives the correct architectural state on an exception/mispredict etc.). Any instructions in this group no longer need to be tracked and are considered completed (even if they haven't produced their result yet).
* the FIFO is also used to free unused rename registers as groups complete (this means that rename registers are often freed before the instruction computing the new value for the same architectural register completes)
* a register is reused once it has been written, no reads of it are waiting to be issued, and it is no longer speculated or in the restore table
* instructions needing many rename registers (ldm/stm) initially get a small number of rename registers, and receive more while they execute, to avoid blocking the renaming of later instructions or using up most rename registers before actually needing them
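
To make the rename/restore table and FIFO interaction above concrete, here is a rough Python sketch. It is only a toy model of my reading of the list, not what the patents or the A9 actually implement: the class and method names are made up, a flush simply discards every group still in the FIFO, and the "written, no reads waiting to be issued" part of the reuse condition is not modelled at all.

```python
from collections import deque

NUM_ARCH = 32   # architectural registers (figure from the list above)
NUM_PHYS = 56   # physical registers, so 24 are available for renaming


class RenameUnit:
    """Toy model: rename table + restore table + FIFO of speculation groups."""

    def __init__(self):
        # Speculative mapping, used when renaming newly decoded instructions.
        self.rename_table = list(range(NUM_ARCH))
        # Mapping of the non-speculative state, used for rollback.
        self.restore_table = list(range(NUM_ARCH))
        self.free_list = deque(range(NUM_ARCH, NUM_PHYS))
        # Each group is a list of (arch, phys) mappings created in that group.
        self.groups = deque()

    def start_group(self):
        """Open a new group at a speculated instruction (load/store/branch...)."""
        self.groups.append([])

    def rename_dest(self, arch):
        """Allocate a physical register for a write to architectural reg `arch`."""
        if not self.groups:
            self.start_group()
        if not self.free_list:
            raise RuntimeError("no free rename register: renaming would stall")
        phys = self.free_list.popleft()
        self.groups[-1].append((arch, phys))
        self.rename_table[arch] = phys
        return phys

    def commit_oldest(self):
        """The oldest group's speculated instruction completed without problems."""
        for arch, phys in self.groups.popleft():
            # The previous non-speculative mapping is dead and can be freed
            # (assumption: real hardware also waits for pending reads).
            self.free_list.append(self.restore_table[arch])
            self.restore_table[arch] = phys

    def flush(self):
        """Mispredict/exception: roll back to the restore table in one step."""
        for group in self.groups:
            for _, phys in group:
                self.free_list.append(phys)
        self.groups.clear()
        self.rename_table = list(self.restore_table)


ru = RenameUnit()
ru.start_group()
ru.rename_dest(0)      # first write to r0, in the oldest group
ru.start_group()
ru.rename_dest(0)      # second write to r0, in a younger group
ru.commit_oldest()     # oldest group becomes non-speculative
ru.flush()             # younger group is squashed
print(ru.rename_table[0] == ru.restore_table[0])
```

Note that nothing here tracks individual instructions in program order: committing a group is just a table update plus a few frees, which is presumably where the power saving over a conventional reorder buffer comes from.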
>And noticing the inventors are French it seems that A9 was developed by the Sophia team of only 30 engineers. http://www.newelectronics.co.uk/article/11633/ARM-expands-Cortex-range.aspx
>>
>
>I find that quite astonishing given the complexity of building an OOOE machine. Even more curious then that simpler in order A8 was developed by 70 odd engineers - http://www2.electronicproducts.com/Product_of_the_Year_Story_Behind_the_Story_The_ARM_Cortex-A8_Processor_IP-article-sbsjh-arm-mar2006-html.aspx
It's even more impressive given that Cortex-A8 took longer.
>Any idea on the team size for Atom, can't believe it's the whole Austin centre?
Wilco