By: Maynard Handley, September 21, 2020 12:05 pm
This weekend I read
which led me down the rabbit hole of state-of-the-art practical I-prefetching, and damn, this stuff is fascinating!

I recommend following the citations backwards to start at the Confluence paper,
read the Boomerang paper (the biggest single idea IMHO),
the Shotgun paper (refining Boomerang),
then the above paper, which fixes an obvious flaw in Shotgun. The flaw is so obvious I have to wonder if there's a communications mismatch: maybe the Shotgun guys left that detail out because, duh, OBVIOUSLY you'd also do that. The paper's other contribution is adapting the idea to variable-length ISAs, which is of little interest to me but probably of interest to most of you lot.

There is also an arXiv follow-up to the Shotgun paper describing some post-publication tweaks that substantially shrink the (already tiny) area footprint.

This stuff is interesting because
- it seems very practical to me. (As in, I could see it being in an Apple or ARM core, maybe as soon as next year.)
- you can see the evolution in real time: the initial good idea, then the subsequent refinements, each improving one aspect or another
