Modern cores

By: Maynard Handley, July 23, 2020 9:06 am
Room: Moderated Discussions
Chester on July 22, 2020 11:59 pm wrote:
> > > > Presumably pretty much everything Samsung describes, Apple does
> Maynard - what's the basis for this assumption?

You're taking what I said way too literally.
The point is that what Samsung describes must set essentially a floor to the quality of implementation of a modern CPU, given that Samsung's performance was essentially the lowest out there.
The reason I drew attention to this is that most discussion of CPUs is still stuck in the early 90s, talking about things like the size of the ROB, or the mere existence of prefetching, as though they're what determine performance. Look at what Samsung is discussing, sometimes as the main concern, sometimes as a throwaway.

So they take the existence of high quality directional branch prediction as given; their concerns are with performant indirect branch prediction. They take multi-strided prefetching as a given and augment it with more sophisticated mechanisms that try to prefetch pointer-based structures. They try (albeit not wonderfully) to ensure that the L1 and L2 prefetchers are working together rather than at cross purposes. They're using an exclusive L3, but in a manner that tries to track some degree of line history. They are worrying about dead lines and line placement. All the issues I've been talking about for years (and mostly had dismissed as academic nuttiness).
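To make the stride vs pointer-based distinction concrete, here is a toy sketch of a per-PC stride detector. It is purely illustrative: the class names, the two-in-a-row confidence rule, and the addresses are all my assumptions, not anything taken from Samsung's paper or any real design. It just shows why a stride prefetcher covers an array walk and is useless on a linked-list walk, which is the gap the pointer-oriented mechanisms are trying to fill.

```python
from dataclasses import dataclass


@dataclass
class StrideEntry:
    last_addr: int
    stride: int = 0
    confident: bool = False


class StridePrefetcher:
    """Toy per-PC stride detector (hypothetical, not any vendor's design)."""

    def __init__(self):
        self.table = {}  # keyed by the PC of the load instruction

    def access(self, pc, addr):
        """Record a load; return a predicted next address, or None."""
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = StrideEntry(last_addr=addr)
            return None
        stride = addr - entry.last_addr
        # Assumed rule: predict only after the same non-zero stride
        # has been observed twice in a row.
        entry.confident = stride != 0 and stride == entry.stride
        entry.stride = stride
        entry.last_addr = addr
        return addr + stride if entry.confident else None


def coverage(addresses, pc=0x400):
    """Fraction of accesses that the previous access correctly predicted."""
    pf = StridePrefetcher()
    predicted = None
    hits = 0
    for addr in addresses:
        if addr == predicted:
            hits += 1
        predicted = pf.access(pc, addr)
    return hits / len(addresses)


# Sequential walk over 64-byte lines: constant stride, easy to predict.
array_walk = [0x1000 + 64 * i for i in range(16)]

# Made-up node addresses for a linked list scattered across the heap.
list_walk = [0x1000, 0x9340, 0x2280, 0x7F00, 0x31C0, 0x8840, 0x1500, 0x6A80]

print(coverage(array_walk))  # 0.8125 (everything after the 3-access warm-up)
print(coverage(list_walk))   # 0.0 (no repeating stride to lock onto)
```

The next address in the list walk is only knowable from the contents of the current node, so any prefetcher that wants to cover it has to look at loaded data (or remember past address sequences) rather than at address deltas.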

I like the paper because, like the classic RISC papers, it states in public (as opposed to something one can merely assume as common sense [hah!]) a new baseline for what a high performance industrial CPU has to implement.

The point, in other words, is not that Apple implements things the same way as Samsung, but that if Samsung considers, e.g., a prefetcher optimized for pointer-based structures to be not merely an academic curiosity but something worth implementing, then chances are that ARM and Apple likewise have a prefetcher that tracks pointer-based structures.
(And Intel and AMD? WTF knows? Their self-imposed compatibility burden is so large, and their turnaround times so slow, that I've lost interest in most of what they are doing.)