Also "Silent Data Corruption"

By: Adrian (a.delete@this.acm.org), March 3, 2021 10:42 am
Room: Moderated Discussions
Ganon (anon.delete@this.gmail.com) on March 3, 2021 9:05 am wrote:
> A recent pair of papers from Facebook emphasized the importance of checksum protection even
> within a single process:
>
> Facebook’s Tectonic Filesystem: Efficiency from Exascale
> https://www.usenix.org/system/files/fast21-pan.pdf
>
> "
> At Tectonic’s scale, with thousands of machines reading and writing a large amount of data every day,
> in-memory data corruption is a regular occurrence, a phenomenon observed in other large-scale systems
> [12,27]. We address this by enforcing checksum checks within and between process boundaries.
> "
>
> and
>
> Evolution of Development Priorities in Key-value Stores
> Serving Large-scale Applications: The RocksDB Experience
> https://www.usenix.org/system/files/fast21-dong.pdf
>
> "
> 11. CPU/memory corruption does happen, though very rarely,
> and sometimes cannot be handled by data replication. (§5)
>
> 12. Integrity protection must cover the entire system in order to prevent corrupted data (e.g.,
> caused by bitflips in CPU/memory) from being exposed to clients or other replicas; detecting
> corruption only when the data is at rest or being sent over the wire is insufficient. (§5)
> "
>
>
> -----
> Checksums & similar protect the data but what about the code (instructions)? The total data footprint
> of instructions is smaller so the bitflips are less likely in practice there. Does hw take special
> measures to protect instructions from corruption (more than it does for data)? What sw measures
> make sense to protect instructions (assuming we need to care about this as well)?


Another interesting paper just published by Facebook:

https://arxiv.org/abs/2102.11245

"Silent Data Corruptions at Scale".
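
For anyone wondering what "checksum checks within process boundaries" from the Tectonic quote might look like in practice, here is a minimal Python sketch. It is only my illustration, not code from any of the papers: the names SealedBuffer, seal, verify and flip_bit are hypothetical, and CRC32 from the standard zlib module is an assumed choice of checksum. The idea is that the checksum is computed once when the data enters the process and travels with the buffer, so any later stage can re-check it before using or forwarding the data.

```python
import zlib
from dataclasses import dataclass


class ChecksumMismatch(Exception):
    """Raised when a buffer no longer matches the checksum it was sealed with."""


@dataclass(frozen=True)
class SealedBuffer:
    # Hypothetical container: the payload plus the CRC32 computed at ingestion.
    payload: bytes
    crc: int


def seal(payload: bytes) -> SealedBuffer:
    # Compute the checksum once, as early as possible after the data arrives.
    return SealedBuffer(payload, zlib.crc32(payload))


def verify(buf: SealedBuffer) -> bytes:
    # Recompute and compare before the data is consumed or handed onward.
    if zlib.crc32(buf.payload) != buf.crc:
        raise ChecksumMismatch("payload changed after it was sealed")
    return buf.payload


def flip_bit(data: bytes, bit_index: int) -> bytes:
    # Test helper that simulates a single in-memory bit flip.
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


if __name__ == "__main__":
    buf = seal(b"block contents")
    verify(buf)  # passes: nothing has changed

    # Simulate corruption between two stages of the same process.
    damaged = SealedBuffer(flip_bit(buf.payload, 5), buf.crc)
    try:
        verify(damaged)
    except ChecksumMismatch as err:
        print("detected:", err)
```

Note that this only protects data that sits in memory between the seal and the verify. It does nothing if the CPU computes a wrong result before the checksum is taken, which is the case the "Silent Data Corruptions at Scale" paper is concerned with.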





