By: --- (, December 26, 2021 10:07 pm
Up through, I think, Ivy Bridge, Intel used a traditional cache banking model, with banks 8B wide, so 8 banks covering a cache line. The claim is that this changed with Haswell and its successors, and that bank conflicts are reduced or eliminated.

(a) Is this known/understood in theory?

(b) Has this been probed experimentally by the usual suspects (like Travis)? With what results?

(c) Consider the following model for "virtual" banking. Does it seem plausible or impractical/insane?

- Each line is broken up into segments (16B or 8B say)
- These are stacked vertically above a sense amp, as in a traditional bank, with 64 lines or so stacked per sense amp.

- BUT the address mapping from a particular line ID to a particular bank is swizzled in some fashion (e.g. physically permute the address bits and/or XOR them with a random constant) so that it's more or less random which cache line segment is mapped to which physical sense amp.

- Different sense amps are connected to different "in/out" paths from the cache to the core.

End result is that you have 4 or 8 "virtual" banks in the cache, a virtual bank being a set of cache line segments that share a sense amp and ultimately a particular set of wires out of the cache. But there's no obvious mapping from a particular address to a particular virtual bank.
Such a scheme will not be quite as good as traditional banking for purely sequential access, but will be a lot better for the sort of power-of-two-stride accesses (64B, 128B, ...) that gave pre-Haswell caches such trouble.
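
To make the model in (c) concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: 8 banks of 8B segments, the particular XOR-fold of the line ID, and the constant 5. It is not a claim about what Haswell actually does, just one possible swizzle of the kind described above.

```python
# Toy model: 64B lines split into eight 8B segments, 8 banks.
# All parameters below are illustrative assumptions, not Intel's design.
SEG_SHIFT = 3      # 8B segments
LINE_SHIFT = 6     # 64B lines
NUM_BANKS = 8
SWIZZLE_CONST = 5  # arbitrary "random" constant (hypothetical)

def conventional_bank(addr: int) -> int:
    """Traditional banking: the bank is just the segment index within the line."""
    return (addr >> SEG_SHIFT) & (NUM_BANKS - 1)

def swizzled_bank(addr: int) -> int:
    """'Virtual' banking: XOR the segment index with folded line-ID bits."""
    seg = (addr >> SEG_SHIFT) & (NUM_BANKS - 1)
    line = addr >> LINE_SHIFT
    fold = line ^ (line >> 3) ^ SWIZZLE_CONST
    return (seg ^ fold) & (NUM_BANKS - 1)

def banks_touched(addrs, bank_fn) -> int:
    """Number of distinct banks hit by a set of addresses."""
    return len({bank_fn(a) for a in addrs})

if __name__ == "__main__":
    for stride in (8, 64, 128):
        addrs = range(0, 8 * stride, stride)  # eight accesses at this stride
        print(stride,
              banks_touched(addrs, conventional_bank),
              banks_touched(addrs, swizzled_bank))
```

With these assumed parameters, eight accesses at a 64B stride all land in one bank under conventional banking but spread over all eight banks with the swizzle. A plain XOR like this keeps the segments of any single line in distinct banks; a truly random per-segment mapping would not guarantee that, which is where the sequential-access penalty would come from.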
