By: David Kanter (dkanter.delete@this.realworldtech.com), May 16, 2007 12:38 pm
Room: Moderated Discussions
Dean M (dean.m8@gmail.com) on 5/16/07 wrote:
---------------------------
>Great read. Thanks.
>
>Just a few things:
>
>"...16 XMM registers, which actually occupy 2 slots, since they are 128 bits wide."
>
>This is true for the K8, but not for Barcelona.
>From the AMD 10h optimization guide:
>
>"A floating-point XMM register is viewed as one 128-bit register internal to the processor.
>Current generation processors cannot write a 64-bit half of a 128-bit XMM register
>without having a merge dependency on the other 64-bit half."
Interesting, thanks for pointing that out. I'll update the article accordingly.
>Also, the K8/Barcelona diagrams would seem to suggest that data is stored from
>LSU1 and loaded to LSU2, but that's not the case as you yourself have written in the article.
>Stores are written (after retirement) from LSU2 exclusively. While loads can be
>initialized from either LSU1 or LSU2, the actual data is passed directly to the
>IFFRF (and can be caught by a reservation station or ALU on the way there).
So my diagrams did take a little liberty. I am extremely constrained by the available space: my diagrams must be no more than 750 pixels wide, and the Barcelona one was already pushing against that limit, to the point where I ended up clipping 16 pixels out of the L2 and L3 caches just to get it to fit.
The diagram is meant to indicate that the 64-bit addresses go from LSU1 to the cache tags, 128-bit data comes back from the cache, and 64-bit data goes in. Unfortunately, I didn't have enough room for separate address, read-data, and write-data lines without things getting ugly. I'd call it an unfortunate artistic compromise.
>From the same diagram: FPMISC should present its data directly to LSU1, instead of going through a reservation station.
Isn't FPMISC also used for integer<-->FP conversions? I was trying to make the point that the FPMISC pipeline is used to get data between the two functional unit clusters, rather than FPSTORE.
>And a typo: 44 Entry Integer Future File – should be 40.
You're right!
DK