By: Michael S (already5chosen.delete@this.yahoo.com), August 10, 2014 11:30 pm
Room: Moderated Discussions
anon (anon.delete@this.anon.com) on August 10, 2014 5:27 pm wrote:
> Michael S (already5chosen.delete@this.yahoo.com) on August 10, 2014 3:11 am wrote:
> > anon (anon.delete@this.anon.com) on August 9, 2014 12:29 am wrote:
> > >
> > > The big Intel cores use significant complexity to tackle the problem and they're stuck
> > > at 4. POWER has reached 8 without problems (with almost certainly better throughput/watt
> > > on its target workloads).
> >
> > "almost certainly" is way too strong a statement. It's possible, yes. But so far we have zero evidence.
>
> We have non-zero evidence. Not complete, but there is evidence.
>
If I am not mistaken, all we have now are very impressive 4x6-core Power8 SAP SD 2-tier scores that still lose in absolute numbers to 4x15-core Intel and approximately match, die-for-die, 16x16-core Fujitsu.
We don't know which of the three systems consumes less power under load, not even approximately.
> >
> > > Not that this is attributable to decoder alone or x86 tax
> > > at all necessarily, but just to head off any claim of it being a furnace.
> > >
> > > I don't know what you mean by "tracking dependencies++", but there is
> > > no indication that POWER8 uses a uop cache, so you're simply wrong.
> > >
> >
> > Tracking dependencies within a group of instructions that
> > are renamed in parallel. Conventional wisdom says that
> > it has complexity of O(width^2). Maybe there was an algorithmic breakthrough in this area, I don't know...
>
> That has nothing to do with decoding stage, however.
>
The context was the practical limits on the width of the in-order front end of OoO cores.
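
To make the O(width^2) point concrete, here is a toy sketch of renaming one group in parallel. It is only an illustration under my own assumptions (two sources and one destination per instruction, made-up names like rename_group and comparator_count); it does not describe how POWER8 or any Intel core actually does it. Every source of every instruction has to be compared against the destinations of all older instructions in the same group, so the comparator count is srcs * W * (W - 1) / 2.

```python
def comparator_count(width, srcs_per_insn=2):
    # Each source of instruction i is compared with the i older destinations.
    return srcs_per_insn * width * (width - 1) // 2


def rename_group(group, rat, free_regs):
    """Rename one group 'in parallel' (hypothetical toy model).

    group:     list of (dest, src1, src2) architectural register names
    rat:       dict arch reg -> physical reg, state before the group
    free_regs: physical registers to allocate, one per instruction
    Returns (renamed instructions, number of comparisons performed).
    """
    new_dests = []
    renamed = []
    comparisons = 0
    for i, (dest, *srcs) in enumerate(group):
        phys_srcs = []
        for s in srcs:
            phys = rat[s]  # default: mapping from before the group
            for j in range(i):  # check every older instruction in the group
                comparisons += 1
                if group[j][0] == s:
                    phys = new_dests[j]  # youngest older writer wins
            phys_srcs.append(phys)
        new_dests.append(free_regs[i])
        renamed.append((free_regs[i], *phys_srcs))
    # The map table is updated only after the whole group is renamed.
    for (dest, *_), phys in zip(group, new_dests):
        rat[dest] = phys
    return renamed, comparisons


if __name__ == "__main__":
    for w in (4, 8):
        print(w, comparator_count(w))
```

In this model, going from width 4 to width 8 takes the comparator count from 12 to 56, which is the quadratic growth I was referring to. Whether that is what limits real designs is a separate question.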