Data integrity of L1 caches

By: anon.1 (abc.delete@this.def.com), September 16, 2022 5:51 am
Room: Moderated Discussions
anon2 (anon.delete@this.anon.com) on September 15, 2022 7:04 pm wrote:
> Everybody knows the data integrity problems with parity-protected write-back arrays. ECC also
> seems to be a difficult problem for the L1 data cache, one that nobody seems to have solved very well.
>
> The options seem to be:
> - A write-back L1D with parity and accept lack of correction.
> - A write-through L1D with ECC L2.
> - Expensive L1 ECC scheme.
>
> A very long time ago, I recall, some CPUs had a BIOS selection between write-back and write-through
> L1, possibly for integrity reasons. More recently Intel offered a "DCU 16kB mode" option in
> its Xeons. This changed the data cache unit from 32kB 8-way associative to mirrored 16kB 4-way
> halves, with correction achieved by using parity to find the correct copy. This seems to have gone away in favor
> of an allegedly more robust L1D SRAM cell, and they have no ECC on the write-back L1.
>
> I have no issue with this. Reliability is limited by the chance of more bitflips than can be corrected;
> if a single bitflip has a very small chance, then reliability can be fine. I'm no array designer, but it
> does seem like at some point, at the very high end of reliability, having ECC would be better
> than increasing bit reliability. But perhaps for the Xeon reliability goal that is enough.
>
> If it's good enough for Xeon, it seems likely all other "normal" CPU designs have gone this way too.
> Exceptions would be certain highly reliable or rad-hard embedded designs, and mainframes and the like.
>
> What is expensive about L1 ECC which is less costly in L2? Keep in mind you need write-through, so L2 has to
> receive all the stores. Stores could be buffered and merged along the way to the L2, but surely they could also
> be buffered and merged along the way to L1 in a write-back design. L1 may have a lot more misses / refills than
> L2, but if ECC calculation is the expensive part, then ECC bits should be shipped to L1. I wonder what is the
> really costly part? Or is the answer that the benefit of a write-back L1 is just not very large? (But that would prompt
> the question of why others do not use a write-through design if it does not hurt performance much.)
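
For anyone following along, the "mirrored halves plus parity" idea from the quote can be modeled in a few lines. This is only a rough software sketch of the concept as described above, not Intel's actual implementation: the class name, the per-word parity granularity, and the repair-on-read behavior are all my assumptions.

```python
def parity(word: int) -> int:
    """Return 1 if the word has an odd number of set bits, else 0."""
    return bin(word).count("1") & 1


class MirroredParityArray:
    """Hypothetical model: two identical copies, each word guarded by one parity bit."""

    def __init__(self, n_words: int):
        self.copies = [[0] * n_words, [0] * n_words]
        self.par = [[0] * n_words, [0] * n_words]

    def write(self, idx: int, value: int) -> None:
        # Every store updates both halves, which is why capacity is halved.
        for c in (0, 1):
            self.copies[c][idx] = value
            self.par[c][idx] = parity(value)

    def flip_bit(self, copy: int, idx: int, bit: int) -> None:
        # Test hook: inject a soft error into one copy without touching parity.
        self.copies[copy][idx] ^= 1 << bit

    def read(self, idx: int) -> int:
        # Use the first copy whose parity checks out, and repair the other one.
        for c in (0, 1):
            value = self.copies[c][idx]
            if parity(value) == self.par[c][idx]:
                other = 1 - c
                self.copies[other][idx] = value
                self.par[other][idx] = self.par[c][idx]
                return value
        raise RuntimeError("both copies failed parity: uncorrectable")
```

Note that parity only detects an odd number of flipped bits per word, so an even number of flips in one copy would slip through in this model.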

AMD claims to have ECC on its L1D cache. I didn't look into Xeon, assuming you checked Intel's documentation before stating what you did.

"The AMD Family 17h processor contains a 32-Kbyte, 8-way set associative L1 data cache. This is a write-back cache that supports two 128-bit loads and one 128-bit store per cycle. In addition, the L1 cache is protected from bit errors through the use of ECC. There is a hardware prefetcher that brings data into the L1 data cache to avoid misses."

https://developer.amd.com/wordpress/media/2013/12/55723_SOG_Fam_17h_Processors_3.00.pdf
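
The AMD guide only says "ECC" and gives no details about the code used. As a point of reference for what single-error-correct, double-error-detect (SECDED) protection involves, here is a minimal sketch of an extended Hamming code; with 64 data bits it yields the familiar 72-bit word. The function names and bit layout are my own choices for illustration, and nothing here is claimed to match AMD's actual scheme.

```python
def encode(data_bits):
    """Extended Hamming encode. Index 0 holds overall parity, powers of two hold check bits."""
    code = [0]  # position 0: overall parity, filled in last
    it = iter(data_bits)
    placed, pos = 0, 1
    while placed < len(data_bits):
        if pos & (pos - 1) == 0:      # power of two: reserve a check bit
            code.append(0)
        else:                         # any other position: next data bit
            code.append(next(it))
            placed += 1
        pos += 1
    p = 1
    while p < len(code):              # check bit p covers positions with bit p set
        for i in range(1, len(code)):
            if i & p and i != p:
                code[p] ^= code[i]
        p <<= 1
    code[0] = sum(code[1:]) & 1       # makes total parity of the word even
    return code


def decode(code):
    """Return (status, data_bits); status is 'ok', 'corrected', or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for i in range(1, len(code)):
        if code[i]:
            syndrome ^= i             # XOR of set positions is 0 for a valid word
    overall = sum(code) & 1           # 1 means an odd number of bits flipped
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1 and syndrome < len(code):
        code[syndrome] ^= 1           # single error; syndrome 0 is the parity bit itself
        status = "corrected"
    else:
        status = "uncorrectable"      # even number of flips, or out-of-range syndrome
    data = [code[i] for i in range(1, len(code)) if i & (i - 1)]
    return status, data


def to_bits(value, width=64):
    """Little-endian list of bits for an integer."""
    return [(value >> i) & 1 for i in range(width)]
```

The encode and decode steps each need an XOR tree across the whole protected word, and a store narrower than that word has to read, merge, and re-encode, which is presumably part of what makes ECC awkward for an L1.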