Sandy Bridge L2 Cache

By: MS (ms.delete@this.lostcircuits.com), January 19, 2011 2:32 pm
Room: Moderated Discussions
MS (ms@lostcircuits.com) on 1/18/11 wrote:
---------------------------
>David Kanter (dkanter@realworldtech.com) on 1/18/11 wrote:
>---------------------------
>>MS (ms@lostcircuits.com) on 1/18/11 wrote:
>>---------------------------
>>>David Kanter (dkanter@realworldtech.com) on 1/18/11 wrote:
>>>---------------------------
>>>>MS (ms@lostcircuits.com) on 1/18/11 wrote:
>>>>---------------------------
>>>>>David Kanter (dkanter@realworldtech.com) on 1/18/11 wrote:
>>>>>---------------------------
>>>>>
>>>>>
>>>>>>
>>>>>>Look at the bandwidth for a 512KB or 1MB data set. That's large enough to spill
>>>>>>into the L2 cache (128KB or 256KB/core). The respective bandwidth numbers are ~300GB/s
>>>>>>and 250GB/s for a 3.4GHz part that can hit a peak of 3.8GHz with 4 cores active.
>>>>>>
>>>>>>300GB/s --> 75.0GB/s per core --> 19.7-22.1B per core*cycle
>>>>>>
>>>>>>250GB/s --> 62.5GB/s per core --> 16.4-18.4B per core*cycle
>>>>>>
>>>>>>Both of these numbers suggest that the theoretical maximum must be above 16B/cycle,
>>>>>>since the test will not hit 100% of peak.
>>>>>>
>>>>>
>>>>>>David
>>>>>
>>>>>You cannot fit a 512kB data set into a 256kB cache; the numbers you were looking
>>>>>at appear to be LLC rather than L2. L2 numbers are everything from 32kB to 256kB.
>>>>
>>>>My understanding is that it's a total of 512KB that is split between the 4 different
>>>>caches. I don't really place much faith in tools like Sandra in the first place,
>>>>and they certainly do little to explain the precise nature of the tests.
>>>>
>>>>>I ran the benchmark, though on a single thread, and I am getting 75GB/sec for both
>>>>>L1D and L2, which comes out to 21.8 bytes/cycle (at 3.7GHz), so you are probably right about the 32-byte path.
>>>>
>>>>Do you have any way to verify that the hits are occurring in the L2 and not L1D?
>>>>Or are you using a larger data set that is only fully resident in the L2?
>>>>
>>>>David
>>>
>>>Sandra uses an adaptation of Stream just like everybody else and I am actually
>>>talking quite often to Adrian regarding some of the benchmarks and for the most
>>>part they are just as good as any other esoteric bench.
>>>
>>>Each core has a 256 kB discrete L2 cache which gives a combined 1MB L2 but they
>>>are discrete and you cannot span data across them, which is the fundamental difference
>>>to a shared cache like the L3 or LLC.
>>
>>I'm aware : )
>>
>>>If it is a data set, then that is one "coherent data structure" which means that
>>>if there are discrete caches for the different cores, the data set cannot span across
>>>core boundaries.
>>>In other words, the max size that fits into the L2 cache is 256kB
>>>for each data set. Similarly, any data structure that is larger than 32kB will
>>>not fit into the L1D but has to go into the L2 cache.
>>
>>Yes, that's true, but you can fit 32KB of your large structure into the L1 cache.
>>If many of your accesses fall within the 32KB in the L1 cache, the bandwidth numbers
>>may be skewed upwards. IOW, you want to be sure that the accesses are missing in
>>the L1 cache and hitting in the L2 cache. Although the L1 read bandwidth is similar
>>to the L2 bandwidth, the latency will have an impact on achievable bandwidth as will the number of in-flight misses.
>>
>>Put another way - there are plenty of ways to access a 256KB data structure in
>>such a way that the majority of reads are serviced from the L1 cache. Especially with a clever prefetcher.
>>
>>I would think that Sandra is designed to avoid such things, but it's frankly very difficult to tell.
>>
>>>Does that answer your question?
>>
>>Not quite. It sounds like you were using a 256KB data set with a perfectly strided
>>access pattern for your test, is that correct?
>>
>>DK
>
>As I mentioned, Sandra uses STREAM, which is a linear access pattern, so yes.
>
I checked, and you were indeed correct about the size of the data blocks in Sandra. In the default configuration the reported size is the aggregate block size, i.e. test block x number of threads. When you disable multithreading and hyperthreading, the size refers to the individual data set.
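
To make the bookkeeping concrete, here is a rough Python sketch of the two calculations discussed in this thread: splitting an aggregate block size across threads to see which per-core cache it lands in, and converting a bandwidth figure into bytes per cycle. The 32kB L1D and 256kB L2 sizes are the ones quoted above; the function names, the assumption of one thread per core (4 threads), and the choice between decimal and binary gigabytes are illustrative assumptions, not anything documented by Sandra.

```python
# Sketch only: cache sizes come from the thread; the helper names, the
# thread count, and the GB convention are assumptions, not Sandra's
# documented behavior.

L1D_KB = 32    # per-core L1D size mentioned in the thread
L2_KB = 256    # per-core L2 size mentioned in the thread


def per_thread_block_kb(aggregate_kb, threads):
    """Split an aggregate block size evenly across the test threads."""
    return aggregate_kb / threads


def smallest_fitting_level(block_kb):
    """Name the smallest per-core cache level the block fits into entirely."""
    if block_kb <= L1D_KB:
        return "L1D"
    if block_kb <= L2_KB:
        return "L2"
    return "beyond L2"


def bytes_per_cycle(gb_per_sec, ghz, binary_gb=False):
    """Convert a bandwidth figure into bytes per clock cycle."""
    bytes_per_gb = 2**30 if binary_gb else 10**9
    return gb_per_sec * bytes_per_gb / (ghz * 10**9)


if __name__ == "__main__":
    # Assumed: 4 threads, one per core.
    for aggregate_kb in (512, 1024):
        block = per_thread_block_kb(aggregate_kb, threads=4)
        print(aggregate_kb, "KB aggregate ->", block, "KB per thread,",
              smallest_fitting_level(block))
    # 75GB/sec at 3.7GHz under both unit conventions.
    print(round(bytes_per_cycle(75, 3.7), 1))
    print(round(bytes_per_cycle(75, 3.7, binary_gb=True), 1))
```

Under these assumptions, the 512KB and 1MB aggregate sizes come out to 128KB and 256KB per thread, both of which still fit in a 256kB L2. The 21.8 bytes/cycle figure at 3.7GHz matches 75GB/sec only if the gigabytes are binary; with decimal gigabytes it works out to about 20.3. Either way the result is above 16 bytes/cycle.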