By: Michael S (already5chosen.delete@this.yahoo.com), May 7, 2013 4:49 am
Room: Moderated Discussions
Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on May 7, 2013 5:37 am wrote:
> Michael S (already5chosen.delete@this.yahoo.com) on May 7, 2013 12:49 am wrote:
> > David Kanter (dkanter.delete@this.realworldtech.com) on May 6, 2013 5:48 pm wrote:
> > > Michael S (already5chosen.delete@this.yahoo.com) on May 6, 2013 3:59 pm wrote:
> > >
> > > > Thank you, David, good article.
> > > >
> > > > Two questions:
> > > > 1. How Silvermont handles load-op and load-op-store x86 instructions. Are they cracked
> > > > before ROB, consuming multiple ROB entries, or after ROBs, consuming just one entry?
> > >
> > > Instructions only take a single ROB entry, there is no cracking.
> >
> > First, a short rant.
> > [rant on]
> > You say there is no cracking, so, may be, they decided to call it fracking. But the
> > procedure, equivalent to cracking has to be here, we see the need for it in the structure
> > of pipeline and in fact that there is no buddy-ALU hiding near AGU.
> > I think, there are no reasons why we should not use established term, i.e. cracking.
> > [/rant off]
>
> Indeed you may need cracking for complex operations, and most definitely for microcoded instructions.
>
> > With a single ROB entry serving a whole x86 instruction, I wonder how further co-ordination
> > between memory part(s) and ALU part of complex instruction is going on. For example, we have
> > "add EAX, [EBX]". EBX value is available as well as AGU resources, but EAX is still unknown.
> > Will load uOP be issued for execution or will it wait for availability of EAX?
>
> If it isn't cracked in decode, it will receive a temporary register during renaming and cracked
> into two uops, likely when dispatching to reservation stations. So the LS unit sees a mov
> TMP, [EBX] and the integer unit gets add EAX, TMP. The OoO machinery does the rest.
>
That's how machines with cracking-before-ROB operate.
But would it still work for cracking-after-ROB?
It looks like you'd need a bigger ROB: up to 3 register inputs, plus a temporary that starts life as an output and then becomes an input, plus one genuine output. Unless they found some simplifying trick, it looks ugly.
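
To make the bookkeeping concrete, here is a toy sketch of the counting argument. It is not a model of Silvermont; the `Uop` class, the `TMP` name, and all of the helper functions are made up for illustration, and it assumes a load-op whose address uses a base and an index register.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uop:
    """Hypothetical micro-op: one destination register, a few source registers."""
    op: str
    dst: str
    srcs: tuple


def crack_load_op(dst, addr_regs):
    """Split 'add dst, [addr_regs]' into a load and an ALU uop linked by TMP."""
    load = Uop("load", "TMP", tuple(addr_regs))
    alu = Uop("add", dst, (dst, "TMP"))
    return [load, alu]


def can_issue(uop, ready_regs):
    """A uop may issue once every one of its sources is available."""
    return all(src in ready_regs for src in uop.srcs)


def tags_cracked_before_rob(uops):
    """Each uop owns an entry: widest entry = its sources + one destination."""
    return max(len(u.srcs) + 1 for u in uops)


def tags_single_entry(uops, temps=("TMP",)):
    """One entry for the whole instruction: external inputs + temporaries + output."""
    inputs = []
    for u in uops:
        for src in u.srcs:
            if src not in temps and src not in inputs:
                inputs.append(src)
    return len(inputs) + len(temps) + 1


# "add EAX, [EBX]" with EBX ready and EAX still unknown:
simple = crack_load_op("EAX", ["EBX"])
issuable = [u.op for u in simple if can_issue(u, {"EBX"})]

# "add EAX, [EBX+ECX*4]": the worst case for a single entry.
wide = crack_load_op("EAX", ["EBX", "ECX"])
print(issuable, tags_cracked_before_rob(wide), tags_single_entry(wide))
```

With these assumptions only the load is issuable while EAX is pending, and the single-entry scheme has to track five register tags where the cracked-before-ROB scheme needs at most three per entry.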