By: Maynard Handley (name99.delete@this.name99.org), November 1, 2015 6:39 pm
Room: Moderated Discussions
Heikki Kultala (hkultala.delete@this.iki.fi) on November 1, 2015 9:57 am wrote:
>
> > As said above maybe it is here for "symmetry reasons". Zen design is highly symmetric with each
> > execution unit coming in pairs.
>
> Not so highly symmetric. There is only one integer multiplier, and only one integer divider.
>
> And there is quite a lot of asymmetricity on the FP pipes, there are not two similar add pipes and two
> similar mul pipes, almost all other fp/vector instructions go quite differently to the pipelines.
>
> > I suspect that AMD is not enabling full SMT on Zen, but a partial
> > CSMT implementation. If this is true Zen is a simple clustered design and the fourth ALU is here
> > because both clusters have to be the identical for simplifying thread selection and scheduling.
>
> :D
>
> Does not work that way due only one multiplier and one divider.
>
Cyclone appears to be a clustered design, with some asymmetries (in particular integer divide and multiply). The follow-up designs removed the multiply asymmetry but not, as far as I know, the divide. (And I suspect that, even apart from those visible cases, only one of the clusters handles the rare "hassle" instructions that the OS uses, like privilege changes or TLB management.)
The point is that clustering does not require IDENTICAL clusters; the steering algorithm can obviously force certain instructions to one cluster and, after a possible one-time stutter (probably a single cycle), dependent instructions should once again mostly flow to the right cluster.
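To make the steering argument concrete, here is a toy sketch in Python. It is not how Cyclone or Zen actually steer; the capability map (divider only in cluster 0), the `Op` type, the `steer` function, and the "follow your producers, else pick the less loaded cluster" policy are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical capability map: both clusters have ALUs and a multiplier,
# but only cluster 0 has the divider (the asymmetry under discussion).
CLUSTER_UNITS = {0: {"alu", "mul", "div"}, 1: {"alu", "mul"}}


@dataclass
class Op:
    kind: str          # "alu", "mul" or "div"
    dst: str           # destination register name
    srcs: tuple = ()   # source register names


def steer(ops):
    """Assign each op to a cluster; return (placement, cross_cluster_forwards)."""
    producer = {}                       # register -> cluster that last wrote it
    load = {c: 0 for c in CLUSTER_UNITS}
    placement = []
    crossings = 0
    for op in ops:
        # Clusters that can execute this kind of op at all.
        capable = [c for c, units in CLUSTER_UNITS.items() if op.kind in units]
        # Clusters that produced this op's source operands.
        votes = [producer[s] for s in op.srcs if s in producer]
        preferred = [c for c in capable if c in votes]
        if preferred:
            # Follow the producers: stay where most of the inputs live.
            cluster = max(preferred, key=votes.count)
        else:
            # Forced placement, or no dependencies: take the less loaded cluster.
            cluster = min(capable, key=lambda c: load[c])
        # Every source produced in the other cluster costs one forward.
        crossings += sum(1 for v in votes if v != cluster)
        producer[op.dst] = cluster
        load[cluster] += 1
        placement.append(cluster)
    return placement, crossings
```

Feed it a dependency chain that starts in cluster 1 and then hits a divide: the divide is forced over to cluster 0 (one cross-cluster forward, the "stutter"), and everything that depends on it then stays in cluster 0 with no further crossings.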