By: Ricardo B (ricardo.b.delete@this.xxxxx.xx), May 28, 2013 2:58 pm
Room: Moderated Discussions
Sebastian Soeiro (sebastian_2896.delete@this.hotmail.com) on May 28, 2013 12:45 pm wrote:
> David Kanter (dkanter.delete@this.realworldtech.com) on May 28, 2013 12:14 pm wrote:
> > Sebastian Soeiro (sebastian_2896.delete@this.hotmail.com) on May 28, 2013 9:00 am wrote:
> > >
> > > If Silvermont uses RSV banks for every execution unit function, and the execution units
> > > simply "sucks" in the next instruction to execute as the previous one is finished, what
> > > is there to prevent dependancy errors? Sorry, I'm quite a bit new to CPU architecture
> > > and want to learn lots, so I apologize if this is a very newbish question.
> >
> > That's a good question. Reservation stations perform dependency checking:
> >
> > "...each distributed scheduler will dispatch the oldest, ready to execute µop to the appropriate port."
> >
> > http://www.realworldtech.com/silvermont/5/
> >
> > So if the oldest instruction is still waiting on a register, then the next oldest
> > will be chosen. If all 8 entries are waiting, then nothing is sent.
> >
> > The memory RSV is a little different though.
> >
> > David
>
> Ah, so I guess the scheduler only dispatches instructions that it knows it can complete to the RSV. Good to
> know, this seems like itd be very good for parallelism to be able to keep as many ALUs functions busy as possible
No, instructions are sent to the reservation stations in program order, as soon as the target station has a free slot.
It's the reservation station that checks for dependencies and, when it notices that one of the instructions it is holding has had its dependencies resolved, sends it for execution.
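To make the split concrete, here is a rough Python sketch of a single station. It is a toy model, not how the hardware is built: the `Uop` and `ReservationStation` names, the register names, and the set of "ready" registers are all made up for illustration. The only things taken from the discussion are the 8-entry capacity, the in-order insertion, and the oldest-ready-first pick.

```python
from dataclasses import dataclass


@dataclass
class Uop:
    age: int            # program-order sequence number (lower = older); hypothetical field
    dest: str           # register written by this uop
    srcs: tuple = ()    # registers this uop reads


class ReservationStation:
    """Toy model: accepts uops in program order, issues the oldest ready one."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.entries = []

    def insert(self, uop):
        """Accept a uop if a slot is free; no readiness check happens here."""
        if len(self.entries) >= self.capacity:
            return False          # station full: the front end has to stall
        self.entries.append(uop)
        return True

    def issue(self, ready_regs):
        """Return the oldest uop whose sources are all ready, or None."""
        for uop in sorted(self.entries, key=lambda u: u.age):
            if all(src in ready_regs for src in uop.srcs):
                self.entries.remove(uop)
                return uop
        return None               # every entry is still waiting on something


rs = ReservationStation(capacity=8)
rs.insert(Uop(age=0, dest="r3", srcs=("r1", "r9")))   # r9 is not available yet
rs.insert(Uop(age=1, dest="r4", srcs=("r1", "r2")))   # all sources ready
ready = {"r1", "r2"}

first = rs.issue(ready)    # skips the older uop, which still waits on r9
ready.add("r9")            # e.g. the producer of r9 completes
second = rs.issue(ready)   # now the older uop can go
print(first.age, second.age)
```

Note that `insert` never looks at dependencies; only `issue` does. That is the distinction being made above.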
> instead of relying on one unified scheduler to release instructions through a limited number of ports... Why
> does this design seem better than the design in Haswell? Maybe its me not thinking straight.
Each Silvermont reservation station can only hold certain types of instructions, and only 8 of them (6 for load/store).
Unless the instruction type and dependency mix is just right (which it mostly won't be), Silvermont will easily reach situations where:
a) some reservation stations go mostly unused (e.g., FP stations in a mostly-integer program)
b) one of the stations fills up and generates back-pressure into the common path (rename buffer, re-order buffer), stopping instructions from flowing into the other, not yet full, reservation stations
c) instructions without dependencies which could be executed exist, but sit just outside the small 8-instruction look-ahead window.
Haswell's unified scheduler provides a 60-instruction look-ahead window and does so for almost any instruction mix (short of pathological/artificial cases).
It's far more robust, performance-wise.
Of course, its physical implementation also has to be far more complex, which is why Intel avoided it for Silvermont.
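Cases a) and b) can be shown with a very crude Python model. Treat it as a back-of-the-envelope sketch under loud assumptions: nothing issues while we allocate (say, everything is waiting on a cache miss), the µop stream is invented, and the "unified" pool is given the same 22 entries in total (8 + 8 + 6) rather than Haswell's 60, just to keep the comparison about sharing rather than size. The function name `allocated_before_stall` is hypothetical.

```python
def allocated_before_stall(stream, capacities):
    """Allocate uops in program order; stop at the first one whose station is full.

    stream: list of station names, one per uop, in program order.
    capacities: dict mapping station name -> number of entries.
    Assumes nothing issues in the meantime, so entries are never freed.
    """
    used = {name: 0 for name in capacities}
    for count, station in enumerate(stream):
        if used[station] >= capacities[station]:
            return count, used    # in-order allocation: everything behind stalls too
        used[station] += 1
    return len(stream), used


# Invented program-order stream: a burst of integer work, then memory and FP uops.
stream = ["int"] * 10 + ["mem"] * 4 + ["fp"] * 4

# Distributed: one station per type, using the 8/8/6 entry counts quoted above.
distributed_count, distributed_used = allocated_before_stall(
    stream, {"int": 8, "fp": 8, "mem": 6}
)

# Unified: a single pool with the same total number of entries (assumption: 22).
unified_count, unified_used = allocated_before_stall(
    ["any"] * len(stream), {"any": 22}
)

print(distributed_count, distributed_used)
print(unified_count, unified_used)
```

With this stream the distributed setup stalls after 8 µops, with the FP and memory stations still empty, while the shared pool takes all 18. A different stream would give different numbers; the point is only that a per-type limit can block the in-order path while other entries sit idle.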
>
> If you dont mind, I have another, more general question about
> CPU architecture, I hope you dont mind all my questions.
>
> I know that a CPU will be fed instructions as it gets copied to RAM, to L3, to L2, to L1, and finally
> into the registers as it performs the instruction on the data, but my question is; how does the
> data make its way from RAM to register? Surely it isnt pumped through the scheduler, right?
Through the load unit, of course.