By: Heikki kultala (heikki.kultala.delete@this.gmail.com), May 2, 2017 1:30 pm
Room: Moderated Discussions
Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on May 2, 2017 1:29 am wrote:
> Heikki kultala (heikki.kultala.delete@this.gmail.com) on May 1, 2017 8:27 pm wrote:
> > Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on May 1, 2017 4:42 am wrote:
> >
> > > To improve performance you always need to spend more transistors and power.
> > > However the little core is much smaller and efficient to start with, so there
> > > is plenty of scope to improve performance while keeping it efficient.
> >
> > Like adding OoOE, which is typically much more power-efficient than trying to go wider
> > or much longer pipeline, or having very fast and big caches and wide memory buses.
>
> That would mean 3x the power and 3x the area for ~65% extra performance. That does not
> seem like a good tradeoff if your goal is to remain small and power efficient...
I was talking about ONLY adding OoOE, NOT:
* Increasing pipeline length by over 1.5 times
* Increasing fetch width by 2x
* Increasing decode width by 1.5x
* Doubling the number of load/store units
* Doubling the SIMD unit width
* Doubling the SIMD unit count
* Doing a zillion other changes
You might (almost) get your 3x when you do all of the things I listed here in addition to the OoOE,
and even that 3x comes from marketing propaganda.
But I was talking only about adding OoOE, NOT adding all those things.
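For reference, the arithmetic behind the quoted "3x the power for ~65% extra performance" figure can be sketched as below. This is only an illustration: the function name is hypothetical, and the only inputs are the figures quoted above (which are disputed in this post). No numbers for an OoOE-only design are given in this thread, so none are assumed here.

```python
def relative_efficiency(perf_ratio: float, cost_ratio: float) -> float:
    """Performance per unit of cost (power or area), relative to the baseline core.

    perf_ratio: new performance / baseline performance
    cost_ratio: new power (or area) / baseline power (or area)
    A result below 1.0 means the new core is less efficient than the baseline.
    """
    if cost_ratio <= 0:
        raise ValueError("cost_ratio must be positive")
    return perf_ratio / cost_ratio


# Figures quoted above: ~65% extra performance for 3x the power.
quoted = relative_efficiency(perf_ratio=1.65, cost_ratio=3.0)
print(f"{quoted:.2f}")  # prints 0.55
```

With the quoted figures, performance per watt would drop to roughly 0.55 of the baseline. The same function applies to any other pair of ratios, for example a design that changes only one of the items listed above.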