By: Jeffrey Bosboom (firstinitiallastname.delete@this.firstnamelastname.com), November 4, 2022 11:05 pm
Room: Moderated Discussions
Anon (no.delete@this.spam.com) on November 4, 2022 10:53 pm wrote:
> Jeffrey Bosboom (firstinitiallastname.delete@this.firstnamelastname.com) on November 4, 2022 10:37 pm wrote:
> > - Why would a designer choose 2 over 1?
>
> Did any designer have done that yet?
From a Chips and Cheese article on Zen 4:
"While Intel’s client architectures have comparable vector throughput to Zen 4, 512-bit operations through 256-bit pipes are handled differently. Intel fuses two 256-bit units across ports 0 and 1 to handle a 512-bit operation. This leads to some interesting characteristics when mixing 256-bit FMA instructions with 512-bit ones. Intel is stuck at one vector operation per cycle, likely because 256-bit FMA units on ports 0 and 1 have to be set to 1×512-bit or 2×256-bit mode, but cannot be in both modes at once."
That passage appears below a table showing 0.94 IPC for Tiger Lake on "2:1 Mixed 256-bit and 512-bit FMA".
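To put the 0.94 figure in context, here is a rough back-of-the-envelope sketch of steady-state IPC for a 2:1 mix under two hypothetical issue policies. Both policies and the function names are my own illustrative assumptions, not anything taken from the article or from Intel documentation: the first assumes the ports can switch between 2×256-bit and 1×512-bit mode every cycle at no cost, the second assumes the core issues only one vector operation per cycle, as the article describes.

```python
from fractions import Fraction

def ipc_free_mode_switch(n256: int, n512: int) -> Fraction:
    """Hypothetical policy: two 256-bit FMAs dual-issue on ports 0 and 1,
    a 512-bit FMA takes both ports for one cycle, and switching between
    the two modes costs nothing."""
    cycles = Fraction(n256, 2) + n512
    return Fraction(n256 + n512) / cycles

def ipc_one_vector_op_per_cycle(n256: int, n512: int) -> Fraction:
    """Hypothetical policy: every vector FMA, regardless of width,
    takes its own issue cycle."""
    cycles = n256 + n512
    return Fraction(n256 + n512, cycles)

# 2:1 mix of 256-bit and 512-bit FMAs, as in the article's table
print(float(ipc_free_mode_switch(2, 1)))         # 1.5
print(float(ipc_one_vector_op_per_cycle(2, 1)))  # 1.0
```

The measured 0.94 IPC sits much closer to the one-operation-per-cycle estimate than to the 1.5 that free mode switching would allow, which is consistent with the article's explanation.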