By: anon (anon.delete@this.anon.com), April 23, 2017 10:12 pm
Room: Moderated Discussions
anon (spam.delete.delete@this.this.spam.com) on April 23, 2017 4:44 pm wrote:
> Travis (travis.downs.delete@this.gmail.com) on April 23, 2017 3:42 pm wrote:
> > anon (spam.delete.delete@this.this.spam.com) on April 23, 2017 12:47 pm wrote:
> >
> > > POWER8 was built for SMT. The issue queue is split in two halves, both can issue one
> > > group of 3 instructions + 1 branch per cycle. Both have their own register files.
> > >
> > > In ST mode the content of the PRFs is identical so you can effectively
> > > issue 6+2, but that doesn't really change the rename width.
> >
> > Isn't that just semantics, or implementation details though? In ST mode, it can rename 6 non-control ops,
> > so it is "up to" 6-wide from a software point of view with some caveats relating to the grouping (just like
> > many of the other archs have caveats related to the instruction mix and how it interacts with renaming).
>
> My point was that it's not meant to be 6 (or 8) wide. It's 3 (4) wide for 1-4 threads.
Factually false. It's 8-wide for 1 thread, and two pseudo-4-wide halves for 2-8 threads. "Pseudo" because some instructions use or block both halves of the pipeline.
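To put numbers on that, here is a rough toy model in Python of the widths being argued about. It is only a sketch: the function and constant names are made up for illustration, the 3 + 1 per-half figures come from the quoted post above, and it assumes that in SMT modes each thread is confined to one half. It ignores the instructions that use or block both halves, which is exactly why the per-half figure is "pseudo".

```python
# Toy model of the peak issue widths claimed in this thread.
# Names and structure are hypothetical, not taken from IBM documentation.

OPS_PER_HALF = 3       # non-branch instructions per half per cycle (from the quoted post)
BRANCHES_PER_HALF = 1  # branches per half per cycle (from the quoted post)
HALVES = 2             # the issue queue is split in two halves


def peak_issue_width(active_threads: int) -> int:
    """Return the peak per-thread issue width for a given thread count.

    Assumption: with one thread both halves are usable; with 2-8 threads
    each thread is limited to a single half. Instructions that occupy or
    block both halves are not modelled.
    """
    if not 1 <= active_threads <= 8:
        raise ValueError("expected 1 to 8 active threads")
    per_half = OPS_PER_HALF + BRANCHES_PER_HALF
    if active_threads == 1:
        return HALVES * per_half  # ST mode: both halves feed one thread
    return per_half               # SMT modes: one half per thread


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, "thread(s): peak", peak_issue_width(n), "per thread")
```

With 2-8 threads the figure is an upper bound per thread, not a sustained rate, since several threads may share one half.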
> Having more execution resources available in ST mode is nice,
> but not important for anything except marketing/licensing.
Also wrong. IBM has made no secret of working to improve single-thread performance. Even on parallel workloads it often remains the gating factor for scalability and for minimum SLA response times.
>
> If you think using duplicated PRFs is a viable way to implement 6/8 wide then I've got news for you.
>
> Split PRFs are viable, but straight up duplicating and forwarding everything is utterly insane.