lmbench is horribly broken

By: Exophase (exophase.delete@this.gmail.com), March 17, 2017 6:21 pm
Room: Moderated Discussions
Linus Torvalds (torvalds.delete@this.linux-foundation.org) on March 17, 2017 5:56 pm wrote:
> No.
>
> The workload is actually entirely cached - there is no prefetching
> anywhere (except possibly the very first iteration).

Yes, you're right. Right after posting I realized that you had said you set up the pages to point to the same physical page. And as usual I wished the forum had the ability to edit posts.

But this is a superficial detail.

That, and you actually are engaging the prefetcher with your TLB access patterns.

>
> The point of the workload is to see how well the TLB walker interacts with caching.
>
> The thing is, caching works. We know that. Anybody who dismisses locality of
> references is so far out to lunch that it's not worth talking to that person.
>
> Caching works particularly well for dense data structures, which is exactly what a page table
> walker is walking. Again, anybody who dismisses that is just crazy and/or incompetent.
>
> When it comes to TLB walking, you really have two very different main cases:
>
> (a) "dense" in the TLB: traditional streaming loads that take a lot of cache misses.
>
> (b) "sparse" in the TLB: the workload might even fit in the D$ (at least at some level), but it's
> so spread out that you take a lot of TLB misses, and the TLB activity is really noticeable.
>
> The thing is, (a) isn't even a worry. If you have a streaming load, you'll take TLB misses, but you'll
> take a lot more actual data cache misses unless your CPU core caches are seriously unbalanced.
>
> So (a) just isn't all that interesting a load for the TLB. You have a high enough hit-rate that the
> TLB miss won't show up compared to normal misses, if your TLB is just reasonable enough (and yes,
> that generally does mean that you have an L2 TLB - and pretty much everybody does these days).
>
> For (a), you want your TLB to not be ridiculously small, and you want the
> TLB fill to not suck too badly. But you really don't need to be all that clever,
> because the D$ misses outnumber the TLB misses by a huge margin.
>
> But (b) is interesting. And it's not actually all that hard to trigger on some loads. If you do a lot
> of pointer-chasing, you may well be in the situation that the workload fits in the cache to a fairly
> large degree, but it's "fragmented" enough in the address space that you have a high TLB pressure.
>
> And (b) is when the TLB walker really matters. The TLB costs aren't hidden by the
> "normal" data access costs. You'll see potentially huge TLB walking costs despite
> the fact that page tables are actually data structures that cache really well.
>
> In fact, multi-level page tables (which is the common - and sanest - page table format) are
> really almost optimal for caching. They retain all the locality that the access pattern has,
> and improve it further by essentially compressing it by several bits. The top levels cache
> so well that caching even just a single entry at each level tends to capture almost all of
> the activity, and the last level is dense too, and works very well with caches.
>
> So a TLB walker that doesn't use the normal D$ for entry walking is basically useless crap.
>
> And if you do use the normal D$ for TLB lookup (perhaps limit it to just L2, to avoid L1 perturbations),
> you really can do very very well at TLB fills, and you basically have an almost infinitely-sized L3 TLB.
>
> And if you do TLB fills badly and don't take advantage of the nice caching behavior of
> a multi-level tree, your core is crap, and you shouldn't blame the benchmark for it.
>
> Linus
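
To put rough numbers on the locality argument in the quote, here is a minimal sketch. It assumes x86-64 style 4-level paging with 4 KiB pages (9 index bits per level, 12 offset bits); the function names and the base address are made up for illustration and are not from any benchmark in this thread.

```python
PAGE_SHIFT = 12      # 4 KiB pages (assumption)
BITS_PER_LEVEL = 9   # 512 entries per table (assumption, x86-64 style)
LEVELS = 4


def table_indices(vaddr):
    """Return the per-level table indices for a virtual address, top level first."""
    vpn = vaddr >> PAGE_SHIFT
    mask = (1 << BITS_PER_LEVEL) - 1
    return [(vpn >> (BITS_PER_LEVEL * lvl)) & mask
            for lvl in reversed(range(LEVELS))]


def distinct_entries_per_level(addrs):
    """Count the distinct page table entries touched at each level, top level first."""
    seen = [set() for _ in range(LEVELS)]
    for a in addrs:
        idx = table_indices(a)
        for lvl in range(LEVELS):
            # An entry is identified by the full index path down to that level.
            seen[lvl].add(tuple(idx[:lvl + 1]))
    return [len(s) for s in seen]


# One access per page across 4096 consecutive 4 KiB pages (16 MiB of address space).
BASE = 0x7f0000000000  # hypothetical, conveniently aligned base address
addrs = [BASE + i * 4096 for i in range(4096)]
counts = distinct_entries_per_level(addrs)
print(counts)  # [1, 1, 8, 4096]
```

Under these assumptions the two top levels each touch a single entry, the third level touches 8, and only the leaf level sees one entry per page. With 8-byte entries those 4096 leaf entries occupy 32 KiB, so they are dense enough to sit comfortably in an ordinary data cache.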

I don't disagree with you that TLB performance matters. Nobody here is defending a CPU that doesn't load the TLB through the caches. That has nothing to do with Wilco's criticism of lmbench. He wants a benchmark that measures memory latency without also measuring a dependent TLB miss that isn't in the cache. Do you not agree that there are workloads that will get a lot more cache misses than TLB miss + uncached TLB access? Or are you saying that lmbench's thrash initialize mode actually doesn't inflate the memory latency with uncached TLB accesses?
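
As back-of-the-envelope arithmetic, the distinction I'm asking about looks like the sketch below. Every latency here is a made-up placeholder, not a measurement of any CPU or of lmbench, and the model assumes the page walk has to finish before the data access can start, so the two costs simply add.

```python
def observed_latency_ns(dram_ns, tlb_miss_rate, walk_ns):
    """Average latency of a dependent load that always misses the data cache.

    Simplifying assumption: the page walk is fully serialized in front of
    the data access, so its cost adds to the DRAM latency.
    """
    return tlb_miss_rate * walk_ns + dram_ns


DRAM_NS = 100.0           # hypothetical DRAM access latency
WALK_CACHED_NS = 10.0     # hypothetical walk cost when the leaf entry hits in cache
WALK_UNCACHED_NS = 100.0  # hypothetical walk cost when the leaf entry comes from DRAM

# Every load takes a TLB miss in both cases; only the walk cost differs.
cached_walk = observed_latency_ns(DRAM_NS, 1.0, WALK_CACHED_NS)
uncached_walk = observed_latency_ns(DRAM_NS, 1.0, WALK_UNCACHED_NS)
print(cached_walk, uncached_walk)  # 110.0 200.0
```

With these placeholder numbers, a walk that hits in the cache adds about 10% to the reported figure, while a walk that goes to DRAM doubles it. The second case is the one that no longer reads as plain memory latency.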