By: David Kanter (dkanter.delete@this.realworldtech.com), April 26, 2012 10:53 pm
Room: Moderated Discussions
Exophase (exophase@gmail.com) on 4/26/12 wrote:
---------------------------
>David Kanter (dkanter@realworldtech.com) on 4/26/12 wrote:
>---------------------------
>>I think the first question to ask is whether it matters. In many respects, this
>>arrangement seems to mirror the design of the K8. Each 'lane' has an AGU and ALU,
>>to handle load+op. Forwarding flows out of the lane from the ALU.
>>
>>Looking from a SW perspective, what is the case that AGU0-->AGU1 or AGU0-->ALU1 handles?
>>
>>If you fire off a load from AGU0, you get the result in a register. If some other
>>ALU/AGU needs that, it can probably get the result from the L1D forwarding or from the register file directly.
>>
>
>Yes, I was just looking at load to load or multiple ALU accesses off of the same
>load.
Maybe just do LD-->LD in the same pipe?
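To make the two cases concrete from the SW side, here is a minimal C sketch. The node type and function names are made up for illustration, and how a compiler or any particular core actually schedules these is an assumption, not something taken from AMD documentation.

```c
#include <stddef.h>

/* Hypothetical list node, for illustration only. */
struct node {
    struct node *next;
    long value;
};

/* Load-to-load: each p->next load produces the address used by the
   following load, so every iteration depends on the previous load. */
static size_t chain_length(const struct node *p)
{
    size_t n = 0;
    while (p != NULL) {
        p = p->next;    /* loaded value becomes the next load address */
        n++;
    }
    return n;
}

/* One load with several ALU consumers: v is loaded once and then
   read by two independent ALU operations. */
static long fan_out(const long *src)
{
    long v = *src;      /* single load */
    long a = v + 1;     /* first ALU consumer */
    long b = v << 2;    /* second ALU consumer */
    return a ^ b;
}
```

The first function is the pointer-chasing pattern where LD-->LD forwarding latency would show up; the second is the case where one load result has to reach more than one ALU.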
>I took "bypass" here to mean register forwarding (i.e., bypassing the register
>file) but still ending up in a register. Should I have taken it to refer to intermediate
>results only, like those from load + op?
I meant both.
>>Hrmmm, why? Intel's AGUs do nothing but address generation IIRC.
>>
>>DK
>
>Would appear so. Somehow I was under the impression that ports 2 through 4 could
>handle something outside of loads and stores, but I was mistaken. I suppose there's
>less need when they have three other ports. It does appear that AMD is at least
>trying to use the AGLUs for something outside of address generation. Using inc/dec
>in the post-adjust sense threw a lot of people off but makes much more sense. BTW,
>does this mean that the pop instruction only issues to the AGUs?
Not sure about PUSH/POP.
>I think I've been thrown off by some misinformation here and have only been gradually
>getting it straight. Months before BD was released JF-AMD made the claim that K10
>had 3 execution ports in which you can do either AGU or ALU, while BD can do 2x
>ALU + 2x AGU simultaneously and therefore had higher peak throughput. I was pretty
>sure this was nonsense (that K8/K10 could do all 6 + 3 FPU ops) but I figured that
>the AGUs would at least be somewhat less coupled to the ALUs.
JF is a nice guy, but relying on him for architectural details is a bit silly.
>The other misleading part is the rhetoric about how "that third ALU" was almost
>never used. In reality, the EX ports on BD are issued to for everything but
>loads. I'm sure that ALU frequency analysis involves actual ALU operations only,
>while the EX units will face additional contention from branches and stores which
>comprise a large number of instructions.
>May be the same execution unit pair arrangement as K8, but missing that third part
>is pretty costly (minus the useless AGU)
Actually the AGU is quite useful for a design where you can't re-order.
David