Zen 4 LD/ST enhancements that contribute to the IPC improvement have nothing to do with AVX-512

By: Ivan (xxx.delete@this.xxx.xxx), September 1, 2022 2:21 am
Room: Moderated Discussions
Adrian (a.delete@this.acm.org) on August 31, 2022 10:30 am wrote:
> If in Zen 4 they have doubled the width of the connection to the L1 data cache, in order to match
> the AVX-512 LD/ST throughput of all Intel CPUs, then that automatically enables also the increase
> of the AVX throughput to three 256-bit LD/ST per cycle, and up to 2 of them can be stores.
>
> The associated increased AVX throughput explains the IPC increase in the legacy benchmarks.
>
>
> Until AMD presents the Zen 4 microarchitecture, we cannot know for sure, but designing Zen
> 4 to be inferior to the competition is not something that I can believe to have happened.
>
> The improved IPC from the presentation explained by improvement of loads and stores cannot mean
> anything else but a wider connection to the L1 data cache, which was a bottleneck in Zen 3.
> It cannot mean more LD/ST ports, as those already existing in Zen 3 cannot be fully used.
>
> To achieve an improvement in AVX over Zen 3, the cache link for loads must be
> increased from 512 bit per cycle to 768 bit per cycle, while the cache link
> for stores must be increased from 256 bit per cycle to 512 bit per cycle.
>
> Once the cache link is widened that much, it would be extremely stupid to not widen
> a little more the link for loads, up to 1024 bit per cycle, to match the performance
> of the Intel CPUs and to provide balanced LD/ST bandwidth for AVX-512.
>

https://twitter.com/yuuki_ans/status/1549256374936170497

If this AMD Genoa benchmark leak is legit, then Zen 4's L1D port width has been doubled.
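
For anyone who wants to check the arithmetic in the quoted post, here is a rough sketch in Python. The link widths are the ones Adrian quotes (512/256 bit per cycle for Zen 3 loads/stores, then the two speculated Zen 4 options). The Zen 4 rows are speculation from the quote, not confirmed specifications, and the names `SCENARIOS`, `accesses_per_cycle` and `summarize` are made up for this illustration.

```python
# Link widths in bits per cycle, taken from the quoted post.
# The Zen 4 entries are the poster's speculation, NOT confirmed specs.
SCENARIOS = {
    "Zen 3 (as quoted)": {"load_bits": 512, "store_bits": 256},
    "Zen 4, AVX-only widening (speculative)": {"load_bits": 768, "store_bits": 512},
    "Zen 4, AVX-512 balanced (speculative)": {"load_bits": 1024, "store_bits": 512},
}


def accesses_per_cycle(link_bits, access_bits):
    """Full-width accesses of access_bits that fit in link_bits each cycle."""
    return link_bits // access_bits


def summarize(scenarios):
    """Return one row per scenario with bytes/cycle and access counts."""
    rows = []
    for name, s in scenarios.items():
        rows.append({
            "name": name,
            "load_bytes": s["load_bits"] // 8,
            "store_bytes": s["store_bits"] // 8,
            "loads_256": accesses_per_cycle(s["load_bits"], 256),
            "stores_256": accesses_per_cycle(s["store_bits"], 256),
            "loads_512": accesses_per_cycle(s["load_bits"], 512),
            "stores_512": accesses_per_cycle(s["store_bits"], 512),
        })
    return rows


if __name__ == "__main__":
    for r in summarize(SCENARIOS):
        print(f'{r["name"]}:')
        print(f'  loads  {r["load_bytes"]:4d} B/cycle'
              f' = {r["loads_256"]} x 256-bit or {r["loads_512"]} x 512-bit')
        print(f'  stores {r["store_bytes"]:4d} B/cycle'
              f' = {r["stores_256"]} x 256-bit or {r["stores_512"]} x 512-bit')
```

With these inputs, the 768-bit load link carries three 256-bit loads but only one full 512-bit load per cycle, which is the point of the quoted argument for going to 1024 bit: that is the first width that fits two 512-bit loads alongside one 512-bit store.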
