By: Michael S (already5chosen.delete@this.yahoo.com), January 8, 2021 2:49 am
Room: Moderated Discussions
Björn Ragnar Björnsson (bjorn.ragnar.delete@this.gmail.com) on January 7, 2021 8:26 pm wrote:
> Chester (lamchester.delete@this.gmail.com) on January 7, 2021 8:14 pm wrote:
> > > How is that going to help? Refreshing the memory will just
> > > "refresh" the flipped bits and you won't know a thing.
> >
> > No, not software refresh. I'm talking about refreshing the charge in DRAM cell capacitors.
> > Rowhammer can happen because accessing adjacent cells can cause a cell's charge to leak faster.
> > If enough of it leaks before a charge refresh, you get a
> > bit flip. Read https://arxiv.org/pdf/1904.09724.pdf
> > for more details, in the 'Rowhammer Mechanisms' section.
>
> I wasn't talking about software refresh, but well put and point taken which I presume to be that it takes less
> to flip a bit when the capacitor supporting it is nearly exhausted. So, I stand corrected at least in part.
>
> > > Currently some RAM manufacturers are offering LPDDR4 modules with on-board ECC and from
> > > what I've seen it is mandatory for DDR5/LPDDR5. The intent of ECC in these instances
> > > is not to increase overall memory reliability/resiliency but to enable reduced power
> > > consumption partly through lessened refresh rates. Back to square one. Sheesh!
> >
> > I think that's asking for trouble.
> >
> > By the way, DRAM manufacturers have also tried targeted
> > row refresh in hopes of mitigating rowhammer without
> > tightening global refresh timings. But researchers have figured out ways past that too. Depending on the
> > specific implementation of targeted row refresh, you can sometimes cause bit flips by hammering more rows.
>
> Asking for trouble indeed. I'm incensed how brazenly the big picture is being lost
> at the expense of customers, users and society as a whole. Using error correction
> to reduce fundamental resiliency rather than building reliability on top of it.
>
In principle, there is nothing wrong with that.
The whole wireless (and increasingly wired) communication industry is built on similar ideas: symbol-level transmission uses very low power (and thus a low signal-to-noise ratio), which gives a link that is not error-free, and then the next layer applies FEC on top of it. This is the proven best way to approach the Shannon limit, especially if one uses turbo codes or LDPC.
Pretty much the same applies to NAND flash, except that there the optimization target is density rather than low power.
The difference with DRAM is that, because of its low-latency requirements, the manufacturer can't use the very advanced correction codes that approach theoretical limits (I don't know the name for the storage equivalent of the Shannon limit). And with the relatively primitive error correction methods that are compatible with small blocks and low latency, it's less obvious that there is really a gain.
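To put rough numbers on that last point, here is a back-of-the-envelope sketch in Python. It is my own illustration, not anything from a datasheet: it assumes bit flips are independent with a raw per-bit error probability p (which is not true for rowhammer, where flips are correlated), and it assumes a single-error-correcting code over a 136-bit word (128 data + 8 check bits), which is the block size I believe DDR5 on-die ECC uses. The function name and parameters are made up for the example.

```python
from math import comb


def block_failure_prob(n_bits: int, p: float, t: int = 1) -> float:
    """Probability that more than t of n_bits are flipped.

    Assumes independent flips with per-bit probability p (a simplifying
    assumption). A code correcting up to t errors fails on such blocks.
    The tail is summed directly to avoid cancellation for very small p.
    """
    return sum(
        comb(n_bits, k) * p**k * (1 - p) ** (n_bits - k)
        for k in range(t + 1, n_bits + 1)
    )


if __name__ == "__main__":
    # Hypothetical raw bit error rates, e.g. as refresh is relaxed.
    for p in (1e-9, 1e-6, 1e-4):
        raw = block_failure_prob(128, p, t=0)  # 128 data bits, no ECC
        sec = block_failure_prob(136, p, t=1)  # 136-bit word, corrects 1 error
        print(f"p={p:.0e}  no ECC: {raw:.3e}  SEC(136,128): {sec:.3e}")
```

With a single-error-correcting code the failure probability per block goes roughly as C(136,2)*p^2, so the protection shrinks quickly as the raw error rate is allowed to rise. That is the trade-off in question: how much of the margin bought by the code is spent on relaxed refresh, and how much is left over.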