By: anon (anon.delete@this.anon.com), August 11, 2014 9:32 pm
Room: Moderated Discussions
Gabriele Svelto (gabriele.svelto.delete@this.gmail.com) on August 11, 2014 12:08 pm wrote:
> anon (anon.delete@this.anon.com) on August 11, 2014 5:13 am wrote:
> > I'm talking specifically about instruction decoding. Those restrictions in group formation
> > and dispatch are due to limitations in other parts of the pipeline, and are in no
> > way analogous to instruction type restrictions in Intel's x86 decoders.
>
> On the contrary, group formation happens in the decoding stages, between early instruction decoding
> (ED stage) and the main decoding stage (Dcd stage). Check Figure 2-2 of the POWER8 user manual:
>
> POWER8 Processor User’s Manual for the Single-Chip Module
>
> Decoding of certain instructions (+4/+8 branches, move to special registers, load & store with update) which
> result in specific µop sequences have their own set of decoding limitations precisely because of this.
Okay, point taken on cracked and ucode instructions; my above comment is wrong.
I guess they aren't such common instructions, but nowadays neither are things like stores, which are limited to the first decoder on x86.
But for the most part, ucode and cracking are not such significant problems that you would say 8 wide decoding is an understatement to say the least.
Unless you are also going to say that Intel cannot decode 4 instructions per cycle without significant problems either. Or even 3.