By: Chester (lamchester.delete@this.gmail.com), January 8, 2021 2:14 pm
Room: Moderated Discussions
David Kanter (dkanter.delete@this.realworldtech.com) on January 8, 2021 9:13 am wrote:
> > Most L1I were protected with parity, unlike L1D which were protected with ECC.
> >
> > It is true that looking right now in the datasheet of Cascade Lake, I could
> > no longer see any information about how the caches are protected.
> >
> > Nevertheless, it is hard for me to believe that they have stopped using
> > any kind of EDC/ECC in L1, like they were using in the past.
> >
> > It seems more likely that they are no longer documenting what they are using.
>
> They don't have ECC on L1. The cache is implemented in RF cells (so 8T bit cells),
> which are much less susceptible to bit flips because they store more charge and isolate
> R/W. Additionally, FinFETs dramatically reduce the susceptibility to SERs.
>
> David
Yeah, looks like I'm wrong about modern CPUs having ECC on L1D. I took a look around and:
- No clear info on Zen's L1D
- Bulldozer's write-coalescing cache (WCC) is ECC protected. L1D/tags are only parity protected, I guess because the L1D is write-through to the WCC.
- Bobcat/Jaguar's L1D/tags are parity protected (Family 16h BKDG, page 505, and Family 14h BKDG, page 403)
- K10's L1D is ECC protected. Tags are parity protected (Family 10h BKDG)
So it looks like newer L1D caches are resilient enough to not need ECC.
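To illustrate the guess about write-through above: if every store also goes to the next level, a parity hit in L1D can be handled by dropping the line and refetching a clean copy. Below is a rough Python sketch of that idea. The class and method names are made up for illustration, and this is a toy model, not how AMD's hardware actually implements it.

```python
def parity(word: int) -> int:
    """Even-parity bit: 1 if the word has an odd number of set bits."""
    return bin(word).count("1") & 1


class ParityWriteThroughCache:
    """Toy write-through cache (hypothetical model).

    `backing` stands in for the next level, assumed to hold a good copy
    (e.g. an ECC-protected WCC or L2).
    """

    def __init__(self, backing: dict):
        self.backing = backing
        self.lines = {}      # addr -> (data, stored parity bit)
        self.refetches = 0   # how many parity errors were recovered

    def write(self, addr: int, data: int) -> None:
        # Write-through: the next level always gets the data too.
        self.backing[addr] = data
        self.lines[addr] = (data, parity(data))

    def read(self, addr: int) -> int:
        if addr in self.lines:
            data, stored = self.lines[addr]
            if parity(data) == stored:
                return data
            # Parity only says "something flipped", not which bit,
            # so invalidate the line and fall through to a refetch.
            del self.lines[addr]
            self.refetches += 1
        data = self.backing[addr]
        self.lines[addr] = (data, parity(data))
        return data

    def flip_bit(self, addr: int, bit: int) -> None:
        """Simulate a soft error in the cached copy only."""
        data, stored = self.lines[addr]
        self.lines[addr] = (data ^ (1 << bit), stored)
```

Writing a value, calling `flip_bit` on it, and reading it back returns the original value with `refetches` bumped by one. Note that two flipped bits in the same word would slip past a single parity bit, and this recovery trick doesn't work for a write-back cache holding the only copy of dirty data.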
Out of curiosity, I also went through the Zen 2 PPR's machine check section. Most things are parity protected instead of using ECC.
Parity protection:
- Branch prediction queues
- L1i/tags
- Decoupling queue (?)
- Fetch address FIFO
- Instruction buffer queue
- Patch RAM sequencer (microcode) and data
- Op cache microtags, tags, and data
- Op queue
- Instruction dispatch queue
- iTLBs and dTLBs, all levels. PSP TLB is also parity protected
- Miss address buffer payload
- Page directory cache
- OOO queues: Load and store queues, scheduling queue, branch buffer, retire dispatch queue (ROB?), checkpoint queue, retire status queue
- INT cluster: Physical register file, flags register file, immediate displacement register file, EX payload, address generator payload
- FPU: physical register file, status register file, retire queue (why's this listed under FP?), non-scheduling queue, scheduling queue, freelist
- L3 victim queue, SDP (scalable data plane interface?)
- Memory controller address/command
- PSP dirty data RAM
- PSP data cache/instruction cache tags
ECC protection:
- L2 data array and tags. State array too
- L3 data array and tags
- L3 shadow tags
- Probe filter
- Memory controller AES SRAM (for hw mem encryption?), DCQ SRAM (decoupling queue?)
- Parameter block (?)
Ambiguous (PPR says "ECC or parity error"):
- PSP system hub read buffer, data cache, instruction cache, low SRAM, and high SRAM
- All listed SMU structures: system hub read buffer, instruction cache/tags, data cache/tags, high/low SRAM
- MP5 (yet another embedded microcontroller?) structures: instruction and data cache and tags, high/low SRAM
No clue: Data cache/tags - errors are just listed as type 1,2,3,4,5,6
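For anyone wondering what the parity vs ECC split in the lists above buys you in practice, here is a small Python sketch of a SECDED code (extended Hamming(8,4)) on a 4-bit value. Plain parity can only detect a single flipped bit. SECDED can correct one flip and detect two. The function names and the tiny word size are my own choices for illustration. Real caches use wider codes, and the PPR doesn't say which codes the hardware uses.

```python
DATA_POSITIONS = (3, 5, 6, 7)   # Hamming(7,4) data bit positions
PARITY_POSITIONS = (1, 2, 4)    # Hamming(7,4) check bit positions


def encode(nibble: int) -> list:
    """Return 8 bits: index 0 is overall parity, 1..7 is Hamming(7,4)."""
    code = [0] * 8
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (nibble >> i) & 1
    for p in PARITY_POSITIONS:
        # Each check bit covers the positions whose index has bit p set.
        code[p] = sum(code[i] for i in range(1, 8) if i & p) & 1
    code[0] = sum(code) & 1      # makes the total number of 1s even
    return code


def decode(code: list):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i        # XOR of set positions is 0 for a valid word
    overall = sum(code) & 1
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips, assumed to be one. The syndrome is the
        # flipped position (0 means the overall parity bit itself).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome with even overall parity: two bits flipped.
        return "uncorrectable", None
    nibble = sum(code[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return status, nibble
```

Flipping any one bit of `encode(n)` and decoding gives `("corrected", n)`, while flipping any two bits gives `("uncorrectable", None)`. The cost is 4 check bits for 4 data bits here. Wider words amortize that much better (8 check bits per 64 data bits is the usual example), but it's still more area and latency than a single parity bit, which fits with the small, latency-sensitive structures in the list sticking to parity.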