By: Maxwell (max.delete@this.a.com), January 3, 2021 12:28 pm
Room: Moderated Discussions
Ian Cutress (ian.delete@this.anandtech.com) on January 3, 2021 11:09 am wrote:
> The analogy I always like to bring up for ECC in regular use is to imagine a theoretical
> system that is affected by one bit error per year for every gigabyte of memory you have: 1 E/GB/yr.
>
> For a system with 128 GB, that means 128 E/yr, or one soft error slightly more often than
> every three days. You have to hope that error falls in memory you're not using. As systems
> get more memory, steps need to be taken to protect against soft errors.
>
> Memory error rates are well below 1 E/GB/yr, and even 1 E/GB/yr is a crazily low error
> rate if you think about it. In non-standard environments (high thermals, etc.), the error rates
> could be that high. I take it as a rule of thumb at this point for any system build.
It's a bimodal distribution - you either have many errors (due to a defect somewhere) or basically zero. If you're on the good side of the distribution, with only extremely rare errors, then you probably don't need ECC. But without ECC, you don't know whether you need ECC!
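The arithmetic in the quoted example is easy to check with a quick sketch. This assumes the purely illustrative rate of 1 E/GB/yr from the quote (not a measured figure) and treats errors as spread evenly over a 365-day year; the function name is hypothetical.

```python
# Rough check of the quoted rule of thumb.
# Assumption: the 1 error/GB/year rate is the quote's illustrative figure,
# not a measurement, and errors are spread evenly over a 365-day year.
DAYS_PER_YEAR = 365


def soft_error_interval_days(memory_gb, errors_per_gb_per_year=1.0):
    """Average number of days between soft errors for a given memory size."""
    errors_per_year = memory_gb * errors_per_gb_per_year
    return DAYS_PER_YEAR / errors_per_year


if __name__ == "__main__":
    for gb in (16, 128, 1024):
        days = soft_error_interval_days(gb)
        print(f"{gb} GB: one soft error every {days:.2f} days on average")
```

At 128 GB this works out to 128 errors per year, or an average gap of a little under three days. It says nothing about which side of the bimodal distribution a given machine sits on.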
Max