Cache associativity

By: Linus Torvalds (torvalds.delete@this.osdl.org), August 16, 2005 11:48 am
Room: Moderated Discussions
David Kanter (dkanter@realworldtech.com) on 8/16/05 wrote:
>
>It sounds like you really prefer highly associative caches.

I definitely do. With some nice FP loops you may be able
to work around direct-mapped caches, and argue that you
can make the cache sufficiently faster that it's worth
the hoops the simpler/faster hardware makes you jump
through. I suspect that embedded people might have the
same argument.

With general-purpose programming, the pain is just too
big. You get a lot of cache misses due to way contention.
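
To make "way contention" concrete, here is a minimal toy simulation, not a model of any real CPU. The 32 KiB capacity, 64-byte lines, LRU replacement, and the names `count_misses` and `alternating_trace` are all assumptions picked for illustration. It walks two arrays that sit exactly one cache size apart, so every a[i] and b[i] compete for the same set.

```python
from collections import OrderedDict

LINE_SIZE = 64          # bytes per cache line (assumed)
CACHE_SIZE = 32 * 1024  # total cache capacity in bytes (assumed)


def count_misses(addresses, ways):
    """Count misses in a toy LRU set-associative cache with `ways` ways."""
    num_sets = CACHE_SIZE // (LINE_SIZE * ways)
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        line = addr // LINE_SIZE
        s = sets[line % num_sets]
        if line in s:
            s.move_to_end(line)        # hit: mark as most recently used
        else:
            misses += 1
            if len(s) >= ways:
                s.popitem(last=False)  # evict the least recently used line
            s[line] = True
    return misses


def alternating_trace(base_a, base_b, count, elem_size=8):
    """Addresses touched by a loop that reads a[i] then b[i] for each i."""
    trace = []
    for i in range(count):
        trace.append(base_a + i * elem_size)
        trace.append(base_b + i * elem_size)
    return trace


if __name__ == "__main__":
    # Two arrays exactly one cache size apart: worst-case alignment.
    trace = alternating_trace(0, CACHE_SIZE, 1024)
    for ways in (1, 2, 4):
        print(ways, "way(s):", count_misses(trace, ways), "misses of", len(trace))
```

In this toy setup the direct-mapped case misses on every single access, because a[i] and b[i] keep evicting each other, while two ways are enough to leave only the cold misses. Real workloads are messier, but this is the kind of alignment accident that shows up in practice.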

There's tons of data on this. If you want spec D$ miss
rates, see for example

http://www.cs.wisc.edu/multifacet/misc/spec2000cache-data/new_tables/specint_64-amean.tab

which says that for D$ there's about 25% more misses
for a direct-mapped cache than for a two-way one in the
L1 cache size range (the difference is even bigger for I$,
but the miss numbers there are smaller, of course).

And that's ignoring the worst case - that's just average.

So I personally would want at least 4-way in the L1, and
as much as possible in the L2. And if full associativity
doesn't work out, then some mixing in of other bits to hash
the thing around to avoid common alignment-induced "hot
ways", that sounds like a good idea to me (people seem to
call it "pseudo-associative").
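
A rough sketch of what "mixing in other bits" could look like, assuming a simple XOR fold of the next-higher address bits into the set index. The function names, the 512-set geometry, and the choice of XOR are hypothetical, just one way to do it, and not a description of what any particular chip implements.

```python
LINE_SIZE = 64   # bytes per cache line (assumed)
INDEX_BITS = 9   # 512 sets (assumed)
NUM_SETS = 1 << INDEX_BITS


def plain_index(addr):
    """Conventional indexing: low bits of the line number pick the set."""
    return (addr // LINE_SIZE) % NUM_SETS


def hashed_index(addr):
    """Hypothetical hashed indexing: XOR the next INDEX_BITS into the index."""
    line = addr // LINE_SIZE
    low = line % NUM_SETS
    high = (line >> INDEX_BITS) % NUM_SETS
    return low ^ high


if __name__ == "__main__":
    # 16 addresses with a power-of-two stride of LINE_SIZE * NUM_SETS bytes,
    # the kind of alignment that big page-aligned buffers tend to produce.
    addrs = [i * LINE_SIZE * NUM_SETS for i in range(16)]
    print("plain: ", sorted({plain_index(a) for a in addrs}))
    print("hashed:", sorted({hashed_index(a) for a in addrs}))
```

With plain indexing all sixteen addresses land in one set, so anything less than 16-way thrashes. With the XOR fold they spread over sixteen different sets. The hash does not remove conflicts, it just makes them depend on something less regular than a power-of-two alignment.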

One of the things I personally like about highly associative
caches is the graceful degradation. I'd much rather
have a system that tends to slow down more gracefully than
fall off a steep cliff ("glass jaw") when something bad
happens. A direct-mapped cache basically is asking for
trouble - it may perform fine "on average", but then it has
nasty situations where it really sucks.

Me, I'll take "consistently good" over "really really good
if all the planets align correctly" any day. When it comes
to caches, that means that I'd much rather take a two-
cycle L1 that is big and has high associativity over a
single-cycle small one. Even if the single-cycle one then
runs like a bat out of hell when things go the right way.

I guess this is all the same argument that makes me prefer
a P-M over a P4. "Plodding and dependable workhorse" is
better than a sprinter that hits a brick wall every once in
a while.

Linus