By: anon (spam.delete.delete@this.this.spam.com), August 25, 2018 4:06 am
Room: Moderated Discussions
Travis (travis.downs.delete@this.gmail.com) on August 24, 2018 10:54 pm wrote:
> David Hess (davidwhess.delete@this.gmail.com) on August 24, 2018 10:08 pm wrote:
> > Ricardo B (ricardo.b.delete@this.xxxxx.xx) on August 24, 2018 5:22 am wrote:
> > >
> > > Naively, I would say that the best approach for dealing with power hungry instructions
> > > would be to run at normal clock speed and restrict the instruction issue rate until
> > > some criteria are met and then down clock the core and un-restrict the issue rate.
> >
> > Running at a slower clock would allow the extra complex instructions
> > to take advantage of more gate delays per stage.
>
> Yes, but only if they only ran at those slower speeds, right? Here it seems you
> can run the most complex instructions (full width FMAs) at a higher speed ("middle
> tier"), but just not at a high rate (but they have the expected latency).
>
> So the ALU must be designed to accommodate the gate delay associated with that higher
> frequency, unless it can somehow reconfigure itself when running at a lower freq?
>
>
Some AVX-512-only instructions could rely on the slower clock.
The ALU might also be asymmetric, with the lowest 128b having the fewest gate delays, while the higher 128b and highest 256b only need to meet more relaxed timing requirements.
Since the high bits can be power gated, they're on a separate supply anyway, so in theory different voltages would be an option as well.
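To make the asymmetric-timing idea concrete, here is a rough sketch of how the usable clock would fall as wider lane groups become active. All of the delay numbers and the names (LANE_GROUP_DELAY_PS, ACTIVE_GROUPS, max_clock_ghz) are made up for illustration; nothing here reflects measured data or any real design.

```python
# Hypothetical critical-path delay per lane group, in picoseconds.
# These values are invented purely for illustration.
LANE_GROUP_DELAY_PS = {
    "bits 0-127": 250.0,    # tightest timing
    "bits 128-255": 300.0,  # more relaxed
    "bits 256-511": 350.0,  # most relaxed
}

# Lane groups that have to meet timing for each vector width.
ACTIVE_GROUPS = {
    128: ["bits 0-127"],
    256: ["bits 0-127", "bits 128-255"],
    512: ["bits 0-127", "bits 128-255", "bits 256-511"],
}


def max_clock_ghz(width_bits: int) -> float:
    """Highest clock at which every active lane group still meets timing."""
    slowest_ps = max(LANE_GROUP_DELAY_PS[g] for g in ACTIVE_GROUPS[width_bits])
    return 1000.0 / slowest_ps  # a 1000 ps period corresponds to 1 GHz


if __name__ == "__main__":
    for width in (128, 256, 512):
        print(f"{width}-bit ops: up to {max_clock_ghz(width):.2f} GHz")
```

With these assumed delays the slowest active lane group sets the clock, so 128-bit work could run fastest and full-width work slowest, which is the tiering being discussed.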