CPU & Memory bit flips

By: Ganon (anon.delete@this.gmail.com), March 3, 2021 10:05 am
Room: Moderated Discussions
A recent pair of papers from facebook emphasized the importance of checksum protection even
within a single process:

Facebook’s Tectonic Filesystem:Efficiency from Exascale
https://www.usenix.org/system/files/fast21-pan.pdf

"
At Tectonic’s scale, with thousands of machines reading and writing a large amount of data every day, in-memory data corruption is a regular occurrence, a phenomenon observed in other large-scale systems [12,27]. We address this by enforcing checksum checks within and between process boundaries.
"

and

Evolution of Development Priorities in Key-value Stores Serving Large-scale Applications:The RocksDB Experience
https://www.usenix.org/system/files/fast21-dong.pdf

"
11. CPU/memory corruption does happen, though very rarely, and sometimes cannot be handled by data replication. (§5)

12.Integrity protection must cover the entire system in order to prevent corrupted data (e.g., caused by bitflips in CPU/memory) from being exposed to clients or other replicas; detecting corruption only when the data is at rest or being sent over the wire is insufficient. (§5)
"


-----
Checksums & similar protect the data but what about the code (instructions)? The total data footprint of instructions is smaller so the bitflips are less likely in practice there. Does hw take special measures to protect instructions from corruption (more than it does for data)? What sw measures make sense to protect instructions (assuming we need to care about this as well)?
 Next Post in Thread >
TopicPosted ByDate
CPU & Memory bit flipsGanon2021/03/03 10:05 AM
  Also "Silent Data Corruption"Adrian2021/03/03 11:42 AM
    Thanks for the referenceGanon2021/03/03 12:47 PM
  Implications for linux page cacheanon2021/03/03 12:54 PM
    Implications for linux page cacheLinus Torvalds2021/03/03 02:54 PM
      memory errorsblaine2021/03/03 03:53 PM
        memory errorsanon22021/03/03 06:30 PM
          memory errorsdmcq2021/03/04 06:16 AM
            memory errorsEtienne Lorrain2021/03/04 07:26 AM
              memory errorsdmcq2021/03/04 07:40 AM
                memory errorsEtienne Lorrain2021/03/04 07:58 AM
                  memory errorsdmcq2021/03/04 08:12 AM
                  memory errorsCarson2021/03/05 03:31 AM
                    memory errorsEtienne Lorrain2021/03/05 07:23 AM
                      memory errorsrwessel2021/03/05 08:48 AM
                      memory errorsdmcq2021/03/05 01:01 PM
                        memory errorsrwessel2021/03/05 01:23 PM
                          memory errorsdmcq2021/03/05 01:51 PM
                      memory errorsBrendan2021/03/06 12:38 AM
                      memory errorsCarson2021/03/06 02:35 AM
                        memory errorsCarson2021/03/06 07:24 AM
                memory errorsDavid Hess2021/03/04 02:44 PM
                  memory errorsrwessel2021/03/04 06:14 PM
                  memory errorsLinus Torvalds2021/03/04 09:21 PM
                    memory errorsanon22021/03/04 10:46 PM
                      memory errorsCarson2021/03/05 03:43 AM
                        memory errorsanon22021/03/05 08:55 AM
                    memory errorsgallier22021/03/05 03:22 AM
                  memory errorsdmcq2021/03/05 01:59 PM
                    memory errorsDavid Hess2021/03/06 05:27 AM
                      memory errorsCarson2021/03/06 07:44 AM
                      memory errorsGabriele Svelto2021/03/06 11:11 AM
                        memory errorsDavid Hess2021/03/06 11:28 AM
                          memory errorsMichael S2021/03/06 03:45 PM
              memory errorsDoug S2021/03/04 11:48 AM
                memory errorsMichael S2021/03/04 12:36 PM
              memory errorsJörn Engel2021/03/04 04:32 PM
                memory errorsLinus Torvalds2021/03/04 09:47 PM
                  memory errorsEtienne Lorrain2021/03/05 02:09 AM
                  memory errorsMichael S2021/03/05 05:06 AM
                    memory errorsLinus Torvalds2021/03/05 12:59 PM
                      memory errorsrwessel2021/03/05 01:32 PM
                        memory errorsrwessel2021/03/05 01:37 PM
                        memory errorszArchJon2021/03/06 09:39 PM
                      memory errorsGabriele Svelto2021/03/06 01:58 PM
                  memory errorsJörn Engel2021/03/05 11:12 AM
                Amiga recoverable RAM disk?Carson2021/03/05 04:03 AM
                  Thanks - TIL a cool Amiga feature (nt) (NT)John2021/03/05 01:51 PM
                    Another cool Amiga feature, datatypesCharles2021/03/06 01:01 AM
                      Another cool Amiga feature, datatypesJukka Larja2021/03/06 02:23 AM
                      Another cool Amiga feature, datatypesAnon2021/03/06 01:40 PM
                      Another cool Amiga feature, filesystemsMarcus2021/03/07 01:28 AM
  CPU & Memory bit flipszArchJon2021/03/04 07:39 AM
    CPU & Memory bit flipsdmcq2021/03/04 07:59 AM
      CPU & Memory bit flipsrwessel2021/03/04 01:27 PM
  speak of the devilRobert Williams2021/03/05 08:53 AM
    speak of the devildmcq2021/03/05 12:26 PM
      speak of the devilRobert Williams2021/03/05 04:15 PM
Reply to this Topic
Name:
Email:
Topic:
Body: No Text
How do you spell tangerine? 🍊