By: Michael S (already5chosen.delete@this.yahoo.com), September 21, 2021 10:23 am
Room: Moderated Discussions
-.- (blarg.delete@this.mailinator.com) on September 20, 2021 4:44 pm wrote:
> dmcq (dmcq.delete@this.fano.co.uk) on September 20, 2021 2:19 am wrote:
> > It'd be interesting to see how Supercomputer Fugaku fares
> > that way as they said they were doing special work
> > accessing two cache lines at once to deal with their 512 bit reads and writes being split in two like that.
>
> Side note, but worth pointing out that SVE allows vectors to have non power-of-2 widths.
> I'm not sure what ARM's intentions are with alignment on a machine with say, 384-bit vectors,
> but it would seem like the best you could hope for is that a portion of accesses align.
>
> For Fugaku, coders probably specifically target the uArch, so not likely an issue,
> but if you're following SVE's "write once, run on any vector width" mantra, it
> seems like attaining alignment might be more trouble than it's worth.
I would think that the mantra is for coders who are satisfied with 40-60% utilization of the HW.
Those who want 70-80% utilization would have to target the actual width anyway.
Or use higher-level frameworks and let their code generators target the actual width.
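
To put a rough number on the "a portion of accesses align" point, here is a minimal sketch that counts how many back-to-back vector accesses straddle a cache-line boundary. The function name `split_fraction`, the 64-byte line size and the streaming-from-one-base access pattern are all my assumptions for illustration, not anything taken from ARM or Fujitsu documentation.

```python
from fractions import Fraction
from math import gcd


def split_fraction(vector_bytes, line_bytes=64, base_offset=0):
    """Fraction of contiguous vector accesses that cross a cache-line boundary.

    Hypothetical model: accesses are issued back to back, each one
    vector_bytes wide, starting base_offset bytes into a cache line.
    line_bytes=64 is an assumed line size, not a claim about any real CPU.
    """
    # The pattern of start offsets repeats after lcm(vector, line) bytes.
    period = (vector_bytes * line_bytes // gcd(vector_bytes, line_bytes)) // vector_bytes
    splits = 0
    for i in range(period):
        start = (base_offset + i * vector_bytes) % line_bytes
        # The access covers [start, start + vector_bytes); it splits if it
        # runs past the end of the line it starts in.
        if start + vector_bytes > line_bytes:
            splits += 1
    return Fraction(splits, period)


if __name__ == "__main__":
    for bits in (128, 256, 384, 512):
        print(bits, "bit:", split_fraction(bits // 8))
```

Under these assumptions the power-of-2 widths up to the line size never split when the base is line-aligned, while 384-bit vectors split on half of the accesses whatever the base alignment is. A 512-bit access from a misaligned base (for example `split_fraction(64, base_offset=16)`) splits every time.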