Data integrity of L1 caches

By: anon2 (anon.delete@this.anon.com), September 15, 2022 7:04 pm
Room: Moderated Discussions
Everybody knows the data integrity problems with parity-protected write-back arrays. ECC also seems to be a difficult problem for the L1 data cache, one that nobody seems to have solved very well.

The options seem to be:
- A write-back L1D with parity, accepting the lack of correction.
- A write-through L1D backed by an ECC-protected L2.
- An expensive L1 ECC scheme.

A very long time ago, as I recall, some CPUs had a BIOS option to select between write-back and write-through L1, possibly for integrity reasons. More recently Intel offered a "DCU 16kB mode" option in its Xeons. This changed the data cache unit from 32kB 8-way associative to two mirrored 16kB 4-way halves, with correction achieved by using parity to find the good copy. This seems to have gone away in favor of an allegedly more robust L1D SRAM cell, and they have no ECC on the write-back L1.
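To make the mirrored-halves idea concrete, here is a minimal sketch of how parity plus a duplicate copy can give correction. It is a toy model under my own assumptions (one parity bit per stored word, a made-up `MirroredLine` class), not a description of how Intel actually implemented the mode.

```python
def parity(word: int) -> int:
    """Return 1 if the word has an odd number of set bits, else 0."""
    return bin(word).count("1") & 1


class MirroredLine:
    """Hypothetical model: one data word stored twice, each copy with its own parity bit."""

    def __init__(self, word: int):
        self.copies = [[word, parity(word)], [word, parity(word)]]

    def flip_bit(self, copy: int, bit: int) -> None:
        """Inject a single-bit upset into the data of one copy."""
        self.copies[copy][0] ^= 1 << bit

    def read(self) -> int:
        """Return the first copy that passes parity and repair the other copy from it."""
        for i, (data, p) in enumerate(self.copies):
            if parity(data) == p:
                self.copies[1 - i] = [data, p]
                return data
        raise RuntimeError("both copies fail parity: uncorrectable")


line = MirroredLine(0xDEADBEEF)
line.flip_bit(copy=0, bit=7)   # corrupt one half
recovered = line.read()        # parity rejects copy 0, copy 1 supplies the data
```

Parity alone only detects the flip; the second copy is what turns detection into correction, at the cost of half the capacity. An even number of flips in the same copy would still slip past parity in this model.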

I have no issue with this. Reliability is limited by the chance of more bitflips than can be corrected; if even a single bitflip is very unlikely, then reliability can be fine. I'm no array designer, but it does seem that at the very high end of reliability, having ECC would be better than increasing per-bit reliability. But perhaps that is enough for the Xeon reliability goal.
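A rough way to put numbers on that trade-off is a binomial model. This is only a sketch: it assumes independent bitflips, and the per-bit flip probability and the word sizes (64 data bits plus 1 parity bit, or plus 8 SECDED check bits) are values I picked for illustration, not measured ones.

```python
import math


def prob_at_least(k_min: int, n_bits: int, p: float) -> float:
    """Probability of k_min or more flips among n_bits independent bits."""
    return sum(
        math.comb(n_bits, k) * p**k * (1 - p) ** (n_bits - k)
        for k in range(k_min, n_bits + 1)
    )


# Assumed per-bit flip probability over some exposure window (illustrative only).
P_FLIP = 1e-9

# Parity on dirty write-back data: any single flip is detected but not correctable.
p_parity_loss = prob_at_least(1, 64 + 1, P_FLIP)

# SECDED: one flip is corrected, so data is lost only on two or more flips.
p_secded_loss = prob_at_least(2, 64 + 8, P_FLIP)

print(f"parity-only loss per word: {p_parity_loss:.3e}")
print(f"SECDED loss per word:      {p_secded_loss:.3e}")
```

In this model the parity-only loss scales roughly with p while the SECDED loss scales with p squared, so making the cell more robust helps both, but ECC pulls ahead as the reliability target tightens.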

If it's good enough for Xeon, it seems likely that all other "normal" CPU designs have gone this way too. The exceptions would be certain high-reliability or rad-hard embedded parts, mainframes, and the like.

What is expensive about ECC in L1 that is less costly in L2? Keep in mind that you need write-through, so the L2 has to receive all the stores. Stores could be buffered and merged on the way to the L2, but surely they could also be buffered and merged on the way to the L1 in a write-back design. L1 may have a lot more misses and refills than L2, but if the ECC calculation is the expensive part, then the ECC bits could be shipped to L1 along with the line. I wonder what the really costly part is? Or is the answer that the benefit of a write-back L1 is just not very large? (But that would prompt the question of why others do not use a write-through design, if it does not hurt performance much.)