What happens when DRAM has more bandwidth than Layer 3 cache?

By: blaine (myname.delete@this.acm.org), December 8, 2022 5:07 pm
Room: Moderated Discussions
--- (---.delete@this.redheron.com) on December 8, 2022 10:44 am wrote:
> Simon Farnsworth (simon.delete@this.farnz.org.uk) on December 8, 2022 7:22 am wrote:
> > Etienne (etienne_lorrain.delete@this.yahoo.fr) on December 8, 2022 6:20 am wrote:
> > > Looks like my AMD Ryzen 9 7950x has a L3 cache bandwidth of 63.9 GB/s, my current DRAM DDR5
> > > has either 49.6 GB/s (Jedec) or 52.5 GB/s (AMD Expo) measured by memtest86 UEFI.
> > > It seems some companies are increasing DRAM bandwidth: 8Gbps DDR5.
> > >
> > > I assume latency to L3 cache is still probably better than latency to
> > > DRAM, but in simple terms, do we still need L3 cache in processors?
> >
> > A critical difference between peak DRAM throughput, and L3 throughput, is that L3 throughput is independent
> > of access pattern (as long as you never leave L3, of course) - you get the same throughput from L3 whether
> > you read cachelines sequentially, or whether you read cachelines in a random order, and you get the same
> > throughput when writing whether you write sequentially, or whether you write in a random order. There's
> > also no penalty for mixing writes and reads - the timings are the same for read then read another line,
> > as for read then write and for write then read and for write then write another line.
> >
> > DDR5 doesn't offer that - your throughput is lower if you read or write 64-byte chunks at
> > random throughout the chip than if you arrange to stay in the same bank group as much as
> > possible. There's also a small penalty for mixing reads and writes, so you benefit from L3
> > if it lets you do more writes in sequence before switching back to reads or vice-versa.
>
> Not really. This may be the case with lousy designs but
> - L3 will be banked so (in theory) a worst-case pattern will hammer a single bank...
> BUT in reality this is not an issue because the addresses will be hashed before being distributed over banks
> - then, a GOOD memory controller will do the same thing, hashing addresses so that, as much
> as possible, it's hard to construct a realistic access pattern that hammers a single bank
> of DRAM rather than spreading maximally over all the available banks and ranks.
>
> Mixing reads and writes, likewise, is design dependent. The traditional algorithms like open-page first
> and FR-FCFS are (like their cache counterparts) naive tools from the days when transistors were expensive;
> if you're willing to burn transistors you can do far far better. My M1 PDFs (I think it's volume 3) describe
> the evolution of the Apple Memory controller, which uses a very clever three-level scheme (plus the willingness
> to have fairly large queues in the controller) to sort requests to optimize for both hitting open pages
> and read/write turnaround WHILE STILL maintaining QoS for various clients.
> In my testing I did not see notable DRAM M1 bandwidth falloff for any patterns
> I tried, from full read to various mixed read+write to full write. (Of course
> having the SLC as a huge memory-side data buffer also helps...)
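
Simon's point above, that L3 bandwidth is largely insensitive to access pattern while DRAM is not, is easy to see with a microbenchmark. Below is a minimal C sketch, assuming a 1 GiB buffer (well past any L3) and 64-byte lines; the sizes and the shuffle are illustrative choices, not anything from the thread. It walks the buffer once in sequential cache-line order and once in a shuffled order and reports the effective bandwidth of each pass.

```c
/* Minimal sketch: compare sequential vs. random 64-byte (cache-line) reads
 * over a buffer far larger than L3, so most accesses go to DRAM.
 * Sizes and the shuffle scheme are illustrative assumptions, not from the
 * thread.  Needs roughly 1.1 GiB of memory; build with optimization on. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define BUF_BYTES  (1ull << 30)              /* 1 GiB: well past any L3 */
#define LINE_BYTES 64ull
#define NLINES     (BUF_BYTES / LINE_BYTES)

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

/* Touch one 64-bit word per cache line, in the order given by idx[]. */
static uint64_t sweep(const uint64_t *buf, const uint64_t *idx)
{
    uint64_t sum = 0;
    for (uint64_t i = 0; i < NLINES; i++)
        sum += buf[idx[i] * (LINE_BYTES / sizeof(uint64_t))];
    return sum;
}

int main(void)
{
    uint64_t *buf = malloc(BUF_BYTES);
    uint64_t *idx = malloc(NLINES * sizeof *idx);
    if (!buf || !idx) return 1;

    for (uint64_t i = 0; i < BUF_BYTES / sizeof *buf; i++) buf[i] = i;  /* fault pages in */
    for (uint64_t i = 0; i < NLINES; i++) idx[i] = i;                   /* sequential order */

    double t0 = now_sec();
    uint64_t s1 = sweep(buf, idx);
    double t_seq = now_sec() - t0;

    srand(1);                                  /* Fisher-Yates: same lines, random order */
    for (uint64_t i = NLINES - 1; i > 0; i--) {
        uint64_t j = (uint64_t)rand() % (i + 1);
        uint64_t tmp = idx[i]; idx[i] = idx[j]; idx[j] = tmp;
    }

    t0 = now_sec();
    uint64_t s2 = sweep(buf, idx);
    double t_rnd = now_sec() - t0;

    printf("sequential: %.1f GB/s   random: %.1f GB/s   (checksum %llu)\n",
           BUF_BYTES / t_seq / 1e9, BUF_BYTES / t_rnd / 1e9,
           (unsigned long long)(s1 + s2));
    free(buf);
    free(idx);
    return 0;
}
```

Shrinking BUF_BYTES to something that fits comfortably in L3 (and warming the buffer with an extra pass) should make the two numbers converge, which is the pattern-insensitivity Simon describes.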
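The point about hashing addresses before distributing them over banks can be illustrated with a toy bank-index function. The XOR-folding below is a generic sketch, assuming 16 banks and arbitrary bit positions; it is not any particular cache's or controller's hash. The idea is only that a power-of-two stride which pins every access to one bank under plain low-order indexing gets spread across banks once higher address bits are folded in.

```c
/* Toy bank-index hash: fold higher address bits into the bank index with
 * XOR, so simple power-of-two strides no longer map every access to the
 * same bank.  Bank count and bit positions are assumptions chosen for
 * illustration, not a real controller's layout. */
#include <stdint.h>
#include <stdio.h>

#define NBANKS     16u                  /* assume 16 DRAM banks */
#define LINE_SHIFT 6u                   /* 64-byte lines */

/* Naive mapping: bank = line-address bits just above the line offset. */
static unsigned bank_naive(uint64_t addr)
{
    return (addr >> LINE_SHIFT) & (NBANKS - 1);
}

/* Hashed mapping: XOR-fold two higher bit fields into the bank index. */
static unsigned bank_hashed(uint64_t addr)
{
    uint64_t line = addr >> LINE_SHIFT;
    return (line ^ (line >> 7) ^ (line >> 13)) & (NBANKS - 1);
}

int main(void)
{
    /* A 64 KiB stride keeps hitting bank 0 under the naive mapping,
     * but spreads across banks once the address is hashed. */
    unsigned hits_naive[NBANKS] = {0}, hits_hashed[NBANKS] = {0};
    for (uint64_t i = 0; i < 1024; i++) {
        uint64_t addr = i * 65536;      /* 64 KiB stride */
        hits_naive[bank_naive(addr)]++;
        hits_hashed[bank_hashed(addr)]++;
    }
    for (unsigned b = 0; b < NBANKS; b++)
        printf("bank %2u: naive %4u  hashed %4u\n",
               b, hits_naive[b], hits_hashed[b]);
    return 0;
}
```

Running it shows the 64 KiB stride landing entirely on bank 0 under the naive mapping, but spread evenly over all 16 banks once hashed.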
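The read/write turnaround point comes down to batching: the bus penalty is paid per direction switch, so a controller that queues requests and drains them in same-direction batches pays it far less often than one that alternates per request. The sketch below is only a generic illustration of that trade-off, not Apple's three-level scheme (which isn't public beyond the description above); the batch limit and queue sizes are arbitrary assumptions.

```c
/* Generic sketch of read/write batching in a memory-controller scheduler:
 * drain up to BATCH_MAX same-direction requests before turning the bus
 * around, instead of alternating per request.  Queue sizes and the batch
 * limit are illustrative assumptions, not any real controller's policy. */
#include <stdbool.h>
#include <stdio.h>

#define BATCH_MAX 16              /* max same-direction requests per batch */

struct queue { int n; };          /* only occupancy matters for this sketch */

static void service(struct queue *rd, struct queue *wr)
{
    bool reading = true;          /* current bus direction */
    int issued = 0, turnarounds = 0;

    while (rd->n || wr->n) {
        struct queue *cur = reading ? rd : wr;
        struct queue *oth = reading ? wr : rd;

        /* Issue a batch in the current direction. */
        for (int batch = 0; cur->n && batch < BATCH_MAX; batch++) {
            cur->n--;
            issued++;
        }
        /* Turn the bus around only if the other side has work waiting. */
        if (oth->n) {
            reading = !reading;
            turnarounds++;
        }
    }
    printf("issued %d requests with %d read/write turnarounds\n",
           issued, turnarounds);
}

int main(void)
{
    struct queue rd = { .n = 40 }, wr = { .n = 40 };
    service(&rd, &wr);            /* batches instead of strict alternation */
    return 0;
}
```

With both queues at 40 requests, the batched policy turns the bus around 5 times where strict per-request alternation would do it roughly 80 times; the cost is extra queuing latency for whichever side is waiting, which is why a real controller also has to bound batch sizes and enforce QoS across clients.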

The HP MX-2 was a dual-CPU MCM with two Itanium CPU chips, each connected by its Itanium bus to a very large (32 MB) L3 cache chip, which in turn had a single Itanium bus connection to the rest of the system. This was HP's emergency response to the first dual-core IBM POWER chip. The L3 tags had enough bandwidth for any workload, but the SRAM chips only had enough bandwidth for commercial workloads. If the SRAM queue became too long, clean reads would be treated as misses until the queue length diminished.

More details are here: US8683139B2
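
That queue-length trick amounts to a load-shedding policy: when the off-chip L3 data SRAM is backed up, a clean read gains nothing by waiting for it, because memory still holds a valid copy of the line, whereas a dirty line has to come from the SRAM. The sketch below is only my reading of the described behavior, written as a minimal C routing function; the threshold and field names are hypothetical, and the patent above has the actual mechanism.

```c
/* Sketch of the described policy: when the L3 data-SRAM request queue is
 * too long, treat *clean* hits as misses and fetch from DRAM instead of
 * waiting on the SRAM.  Dirty lines still have to come from the SRAM,
 * since memory holds stale data for them.  Threshold and field names are
 * hypothetical; see US8683139B2 for the real mechanism. */
#include <stdbool.h>
#include <stdio.h>

#define SRAM_QUEUE_LIMIT 8            /* hypothetical back-pressure threshold */

struct l3_tag_result {
    bool hit;
    bool dirty;                       /* line modified relative to memory */
};

enum source { FROM_L3_SRAM, FROM_DRAM };

/* Decide where to service a read, given the tag lookup (which has plenty
 * of bandwidth) and the current depth of the SRAM data queue. */
static enum source route_read(struct l3_tag_result tag, int sram_queue_depth)
{
    if (!tag.hit)
        return FROM_DRAM;             /* ordinary miss */
    if (tag.dirty)
        return FROM_L3_SRAM;          /* only valid copy is in the L3 SRAM */
    if (sram_queue_depth > SRAM_QUEUE_LIMIT)
        return FROM_DRAM;             /* clean hit, SRAM congested: treat as miss */
    return FROM_L3_SRAM;              /* normal clean hit */
}

int main(void)
{
    struct l3_tag_result clean_hit = { .hit = true, .dirty = false };
    printf("clean hit, idle SRAM -> %s\n",
           route_read(clean_hit, 2)  == FROM_L3_SRAM ? "L3 SRAM" : "DRAM");
    printf("clean hit, busy SRAM -> %s\n",
           route_read(clean_hit, 20) == FROM_L3_SRAM ? "L3 SRAM" : "DRAM");
    return 0;
}
```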