By: Adrian (a.delete@this.acm.org), November 5, 2022 3:50 am
Room: Moderated Discussions
Anon (no.delete@this.spam.com) on November 5, 2022 3:27 am wrote:
> Adrian (a.delete@this.acm.org) on November 5, 2022 3:00 am wrote:
> > Zen 4 could also use the method described by Intel, by coupling the four 256-bit pipelines into
> > two 512-bit pipelines, onto which the 512-bit operations are scheduled, with the restrictions
> > that a few operations, e.g. FMA and MUL, can be executed in only one of the pipelines.
>
> How they would do that with non-unified scheduller?
>
Exactly as they did in their previous Zen 3 scheduler, where some operations can be executed in any of the four 256-bit pipelines, while others can be executed in only two of them.
If the four pipelines are coupled into two pipeline pairs for scheduling 512-bit operations, the scheduling is similar to the scheduling of the 256-bit operations, but simpler.
It can be done by the same circuits, e.g. by forcing the inputs that show the busy state of the 2nd pipeline in each pair to the busy value. When the splitting is done after scheduling, scheduling a 512-bit operation is like always choosing the 1st pipeline of a pair of 256-bit pipelines.
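To make the idea concrete, here is a toy model of the pipeline picker. It is only a sketch of the reasoning above, not AMD's actual logic: the pipe numbering, the pairing (0,1) and (2,3), the lowest-free-pipe priority and the function names are all my assumptions.

```python
# Toy model of the idea above; NOT AMD's real scheduler logic.
# Assumptions: 4 pipes numbered 0..3, paired as (0,1) and (2,3),
# and the picker simply takes the lowest-numbered free pipe.
NUM_PIPES = 4
ALL_PIPES = (1 << NUM_PIPES) - 1
SECOND_IN_PAIR = 0b1010  # pipes 1 and 3 are the 2nd pipe of each pair


def pick_pipe(busy, allowed=ALL_PIPES):
    """Picker for 256-bit ops: lowest pipe that is allowed and not busy.

    busy and allowed are bit masks, bit i standing for pipe i.
    Returns the pipe number, or None when nothing can issue.
    """
    free = allowed & ~busy & ALL_PIPES
    for pipe in range(NUM_PIPES):
        if free & (1 << pipe):
            return pipe
    return None


def pick_pipe_512(busy, allowed=ALL_PIPES):
    """Picker for 512-bit ops, reusing the same 256-bit picker.

    The busy inputs of the 2nd pipe in each pair are forced to 'busy',
    so the result is always the 1st pipe of a pair (0 or 2). The op is
    split across that pair only after scheduling.
    """
    return pick_pipe(busy | SECOND_IN_PAIR, allowed)
```

With pipe 0 busy, `pick_pipe(0b0001)` gives pipe 1, while `pick_pipe_512(0b0001)` gives pipe 2, i.e. the second pair. An operation restricted to fewer pipes is expressed through the `allowed` mask in both cases; which pipes those are for a given operation is not modeled here.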