Article: AMD's Mobile Strategy
By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), December 17, 2011 12:06 pm
Room: Moderated Discussions
mpx (mpx@nomail.pl) on 12/17/11 wrote:
---------------------------
>Linus Torvalds (torvalds@linux-foundation.org) on 12/16/11 wrote:
>---------------------------
>>mpx (mpx@nomail.pl) on 12/16/11 wrote:
>>>
>>>Bulldozer has the greatest x86 decoder ever built. So what?
>>
>>Actually, bulldozer has one of the worst ones.
>
>? Can you find any other x86 decoder that can decode 4-operand instructions @ 4.2GHz
>while taking advantage of 32B prefetch?
Did you read my statement?
The problem isn't that it's powerful - the problem is that
it is shared. So *effectively* it has much lower throughput.
Of course, that is also true of Intel SMT, but at least
in that case it was never sold as multi-core.
So Sandybridge is relatively more powerful - and not just
because of the uop cache.
Sandybridge can also decode up to 4 instructions per cycle,
and while the exact rules for "up to" may differ between
the two, it's not all that obvious that they differ in
favor of Bulldozer.
But a dual-core Sandybridge will decode up to 8 insns
per clock, while what AMD calls a dual-core Bulldozer will
still be decoding just four.
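
To put numbers on that, here is a back-of-the-envelope sketch using only the peak figures above. The helper name and its parameters are made up for illustration, and the even split of a shared decoder between two busy cores is an assumption. It ignores the uop cache, macro fusion and the actual decode rules, so treat it as arithmetic, not a performance model.

```python
def peak_decode(decoders, width, cores):
    """Return (total peak insns/clock, per-core share with all cores busy).

    decoders: number of independent decode front-ends on the chip
    width:    peak instructions per clock for one front-end
    cores:    number of cores, using the vendor's own definition of "core"
    """
    total = decoders * width
    # Assumption: a shared front-end is split evenly between busy cores.
    return total, total / cores


# Dual-core Sandybridge: one 4-wide decoder per core.
sb_total, sb_per_core = peak_decode(decoders=2, width=4, cores=2)

# What AMD calls a dual-core Bulldozer: one module, one shared 4-wide decoder.
bd_total, bd_per_core = peak_decode(decoders=1, width=4, cores=2)

print("SB:", sb_total, "insns/clock total,", sb_per_core, "per core")
print("BD:", bd_total, "insns/clock total,", bd_per_core, "per core")
```

With only one BD core busy, that core can in principle have the whole front-end to itself, so the halving only shows up when both cores are running.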
There is no way anybody sane calls Bulldozer more capable,
unless they are trying to call the BD module a "core", and
the dual cores "SMT". And even if you do start
playing those kinds of games, it's not at all obvious that
you'd call Bulldozer stronger.
But AMD really calls it a "module" and the independent cores
are "cores". So per-core, the BD decoder is clearly weaker,
just going by AMD's own semantics.
Only if you end up comparing the shared front-end against
a single Sandybridge front-end does AMD win something (for
large instructions), and even then it loses a lot of other
cases (the whole macro fusion, uop cache etc that SB does).
IOW, you can possibly argue for BD being the more
capable front-end, but you really have to twist things and
limit the specifics to one particular way of looking at it
to do so - and use terms that even AMD doesn't use.
Most other ways, SB wins.
Linus