By: David Kanter (dkanter.delete@this.realworldtech.com), May 22, 2007 12:11 am
Room: Moderated Discussions
dess (dess@nospam.com) on 5/21/07 wrote:
---------------------------
>David Kanter (dkanter@realworldtech.com) on 5/21/07 wrote:
>---------------------------
>>In talking with architects, some use the terminology micro-op to refer to what
>>AMD used to call RISCops (Rops), but are most accurately described as internal operations.
>>Some of their documentation refers to macro-ops as the internal operations. I
>>don't really think it matters. I choose the term that makes it easiest for the
>>folks at AMD to edit/revise/comment on my article and questions.
>Well, it matters if you use the same term for both AMD's and Intel's chips, in
>the same context, while it refers to different things. ;)
Unfortunately, I couldn't quite get macro-op or micro-op to fit well in the diagrams. We have a 750px width constraint, and anything larger will cause clipping problems. Hence I prefer uop to micro-op, despite the potential for confusion. Perhaps I should update the article.
>>However, Intel is a different story. First of all, the term is uop, not micro-op,
>
>You used "uop" also on Barcelona's diagrams.
See above comments.
>>which refers to the internal operations. In Intel terminology a macro-op is an
>>x86 instruction. So macro-op fusion is the fusing of two x86 instructions (CMP+JMP).
>>
>>>Now I'm confused whether this happened at some places for the Core 2 as well, or
>>>those are really 4 micro-ops, in which case the Barcelona is the wider one.
>>
>>Decoding always maps the contents of the Icache (x86 instructions) to the internal
>>operations and operands of the processor in question.
>>
>>AMD has 3 decoders, Intel has 4 decoders. AMD can rename 3 operations/cycle, Intel can
>>rename 4. Counting execution units is a little tricky since AMD likes to count
>>their separate AGUs as functional units, while Intel considers them part of the
>>memory pipelines. Intel can retire 4 operations, AMD can retire 3.
>>
>>At the end of the day, the Core2 is theoretically wider. However, it's unclear
>>to what extent this will be true in practice.
>>
>>DK
>
>Taking into account that AMD's 3 decoders are responsible for decoding 95% of the
>instruction set/modes
According to whom, and on what workload?
>(the uCode Engine works with the remaining 5%,
Again, which workload?
>as one more
>decoder, but it won't work in parallel with the other 3), and emit up to 2 micro-ops
>each, while 3 of Intel's 4 decoders are for simpler instructions, and so emit
>one uop, Intel's is wider only when simpler instructions are being used.
It seems to me that you probably don't have data on the percentage of instructions that decode to 1, 2, 3, or 4+ uops.
>Moreover, AFAIK, AMD's Pack Buffer emits 3 _macro-ops_, which can contain up to
>6 micro-ops, while Intel's path is 4 _uops_ wide. Or is this "uop" a macro-op in AMD's terms?
Intel's uops and AMD's micro-ops, macro-ops, etc. are not really comparable directly. IMHO, the best way to examine the two is statistical analysis...
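To illustrate what I mean by statistical analysis: if you did have a uops-per-instruction distribution for a workload, the comparison falls out of simple arithmetic. The mix, widths, and function names below are purely hypothetical illustrations of the method, not measured data for either chip.

```python
# Sketch: how a (hypothetical) uops-per-instruction distribution bounds
# front-end throughput. None of these numbers are measured vendor data.

def avg_uops(mix):
    """mix maps uops-per-x86-instruction -> fraction of the dynamic stream."""
    return sum(uops * frac for uops, frac in mix.items())

def instr_per_cycle_bound(mix, internal_ops_per_cycle):
    """Upper bound on x86 instructions/cycle for a machine whose narrowest
    pipeline stage handles internal_ops_per_cycle internal operations."""
    return internal_ops_per_cycle / avg_uops(mix)

# Hypothetical dynamic instruction mix: 70% decode to 1 uop, 20% to 2, etc.
mix = {1: 0.70, 2: 0.20, 3: 0.07, 4: 0.03}

print(avg_uops(mix))                  # ~1.43 internal ops per instruction
print(instr_per_cycle_bound(mix, 4))  # ~2.80 for a 4-wide internal path
print(instr_per_cycle_bound(mix, 3))  # ~2.10 for a 3-wide internal path
```

The point is that the answer depends entirely on the mix: shift the distribution toward multi-uop instructions and the effective instruction throughput of the "wider" machine shrinks accordingly.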
DK