By: hobold (hobold.delete@this.vectorizer.org), September 18, 2022 6:32 pm
Room: Moderated Discussions
Anon (no.delete@this.spam.com) on September 18, 2022 12:54 pm wrote:
> David Kanter (dkanter.delete@this.realworldtech.com) on September 18, 2022 12:29 pm wrote:
> > That's true, but sub-line ECC has a much higher area overhead.
> > So maybe I'd state it as 'power or area, take your pick'...
>
> Does it really matter? I mean, if you only modify 8 bytes out of the 64 bytes of the cache line then
> old_value XOR new_value should give you enough information to update the ECC bits of the entire cache
> line, only parity would actually appear per 8 bytes or so to not increase read latency too much.
If I am not mistaken, there used to be specific ECC codes that have a geometric interpretation. With such a code, I think, oldVal XOR newVal would effectively yield a mirror plane (well, hyperplane ... something one dimension smaller than the entire code space). And then oldCode mirrored on that plane should result in the correct newCode.
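To make the quoted old_value XOR new_value idea concrete, here is a minimal sketch. It assumes a plain Hamming-style linear code over a 512-bit line, and this is not necessarily the geometric construction I am half-remembering. The names `data_positions`, `check_bits` and `update_check` are made up for illustration. Because the check bits are a linear function of the data, the check bits of the XOR delta are exactly the correction to apply to the old check bits.

```python
LINE_BITS = 512   # one 64-byte cache line
WORD_BITS = 64    # one 8-byte store


def data_positions(width):
    """1-based Hamming positions that carry data: every index that is not a power of two."""
    positions = []
    p = 1
    while len(positions) < width:
        if p & (p - 1):          # nonzero means p is not a power of two
            positions.append(p)
        p += 1
    return positions


POS = data_positions(LINE_BITS)


def check_bits(data):
    """Hamming check bits: XOR of the position labels of all set data bits (linear over GF(2))."""
    acc = 0
    for i, pos in enumerate(POS):
        if (data >> i) & 1:
            acc ^= pos
    return acc


def update_check(old_check, old_word, new_word, word_index):
    """Update the line's check bits from the 8-byte delta alone, without rereading the other 56 bytes."""
    delta = (old_word ^ new_word) << (WORD_BITS * word_index)
    return old_check ^ check_bits(delta)


# Usage: overwrite word 3 of a line and compare against a full recomputation.
line = int.from_bytes(bytes(range(64)), "little")
old_word = (line >> (WORD_BITS * 3)) & ((1 << WORD_BITS) - 1)
new_word = 0x0123456789ABCDEF
new_line = line ^ ((old_word ^ new_word) << (WORD_BITS * 3))
assert update_check(check_bits(line), old_word, new_word, 3) == check_bits(new_line)
```

The catch is that the store still has to know old_word, so this saves reading the rest of the line but not the read-modify-write of the 8 bytes being changed.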
I don't remember the name of these specific codes, or whether they are still in active use today. But my nebulous memory says it was a construction based on hypercubes, such that any corner making up a valid code is surrounded by an edge subgraph of invalid codes, such that ... well, one can think of the neighbourhood as three rings. The innermost ring is all the single bit errors, so those are correctable. The 2nd ring is all two bit errors, so those are detectable. But the outermost 3rd ring is already made up of direct neighbours of other valid codes, so those get "corrected" to the wrong code and lead to an unrecoverable data loss.
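The three rings can be checked by brute force on a tiny example. The sketch below assumes an extended Hamming (8,4) code, i.e. SECDED with minimum distance 4. That choice is mine, since I do not remember the code's actual name, but it does reproduce the ring structure described above. The helpers `is_codeword`, `nearest` and `ring` are illustrative names only.

```python
from itertools import combinations

N = 8  # bit 0 is the overall parity bit, bits 1..7 form a Hamming(7,4) word


def popcount(x):
    return bin(x).count("1")


def is_codeword(w):
    """Valid iff the Hamming syndrome (XOR of set positions 1..7) is zero and overall parity is even."""
    syndrome = 0
    for pos in range(1, N):
        if (w >> pos) & 1:
            syndrome ^= pos
    return syndrome == 0 and popcount(w) % 2 == 0


CODEWORDS = [w for w in range(1 << N) if is_codeword(w)]


def nearest(w):
    """Return (distance, list of valid codes at that distance) for an arbitrary 8-bit word."""
    d = min(popcount(w ^ c) for c in CODEWORDS)
    return d, [c for c in CODEWORDS if popcount(w ^ c) == d]


def ring(code, k):
    """All words reached from `code` by flipping exactly k bits."""
    for bits in combinations(range(N), k):
        w = code
        for b in bits:
            w ^= 1 << b
        yield w


c = CODEWORDS[0]  # the all-zero word
# Ring 1: the original code is the unique nearest valid code, so the error is correctable.
assert all(nearest(w) == (1, [c]) for w in ring(c, 1))
# Ring 2: no valid code is closer than distance 2, so the error is detectable but not correctable.
assert all(nearest(w)[0] == 2 for w in ring(c, 2))
# Ring 3: some *other* valid code is a direct neighbour, so a decoder would miscorrect.
assert all(nearest(w)[0] == 1 and c not in nearest(w)[1] for w in ring(c, 3))
```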