Pre-populating anonymous pages

By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), June 8, 2019 11:43 am
Room: Moderated Discussions
Brendan (btrotter.delete@this.gmail.com) on June 8, 2019 2:55 am wrote:
>
> a) When a process calls "mmap()" the kernel looks at various things (primarily how much physical
> RAM is currently free, but also things like whether Meltdown mitigations are present, how fast/slow
> swap space is, etc) to try to estimate the optimum number of pages to pre-populate.

As Travis already pointed out: no, this is not how it works.

Linux never pre-populates mappings at all unless you ask for it (exception: special device mappings do tend to get created and fully populated at mmap time - think frame buffers etc where demand-paging just doesn't make a lot of sense).

There are various reasons for that, some of them historical rather than technical, but perhaps the most obvious one is that it's actually somewhat common to never touch the actual pages at all, and pre-populating would very much be the wrong thing to do.

"Why would you do such an insane thing?" I hear you say.

The thing about mmap is that it's not just about mapping pages. In fact, you should think of it as mapping virtual memory first, and accessing the pages you mapped is just the common default action. But not the only action.

Another use case is that you literally use mmap to first get a big chunk of virtual memory, and then - once you have reserved your VM area - you re-populate it (or unmap it) in chunks using MAP_FIXED within the area you carved out for yourself.

The reasons can vary: you may need particular alignment over and beyond the page size, or you may just need to map things consecutively in the virtual memory area and cannot allow holes or other mappings in between.
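As a rough illustration of the alignment case, here is a minimal sketch: over-reserve by the alignment, then unmap the slack at both ends. The helper name map_aligned is made up for this example, and it assumes that both len and align are multiples of the page size and that align is a power of two.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: return `len` bytes of anonymous memory whose start
 * address is a multiple of `align`, or NULL on failure.
 * Assumes `align` is a power of two and that `len` and `align` are both
 * multiples of the page size. */
void *map_aligned(size_t len, size_t align)
{
    long page = sysconf(_SC_PAGESIZE);

    assert(align != 0 && (align & (align - 1)) == 0);
    if (page <= 0 || len == 0 || len % (size_t)page || align % (size_t)page)
        return NULL;
    if (len > SIZE_MAX - align)
        return NULL;

    /* Carve out more address space than needed; nothing is populated yet. */
    size_t span = len + align;
    char *raw = mmap(NULL, span, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return NULL;

    /* Round the start up to the requested alignment. */
    uintptr_t start = ((uintptr_t)raw + align - 1) & ~((uintptr_t)align - 1);
    char *aligned = (char *)start;

    /* Give back the unused head and tail of the carved-out area. */
    size_t head = (size_t)(aligned - raw);
    size_t tail = span - head - len;
    if (head)
        munmap(raw, head);
    if (tail)
        munmap(aligned + len, tail);

    return aligned;
}
```

Something like map_aligned(4 * page_size, 2 << 20) would then hand back a 2MB-aligned area, and the caller releases it with a plain munmap of len bytes.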

So "carve out VM using mmap, then manage it by overmapping with MAP_FIXED" is the only way to set up VM mappings in user space if you have particular layout requirements (you could use MAP_FIXED _without_ carving out a VM area for yourself, but then you'd have to know exactly what the virtual memory layout is, which isn't realistic in the presence of dynamically linked libraries, threads doing their own mmap at the same time, etc etc).
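The carve-out pattern could look roughly like the sketch below. The function names reserve_region and commit_chunk are invented for illustration, and using PROT_NONE plus MAP_NORESERVE for the reservation is just one reasonable choice, not something the pattern requires.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: reserve `size` bytes of virtual address space.
 * PROT_NONE means nothing in the area can be touched (or populated) yet. */
void *reserve_region(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: overmap [base + offset, base + offset + len) with
 * usable anonymous memory. `offset` and `len` must be page multiples, and
 * the range must lie inside the area returned by reserve_region(), because
 * MAP_FIXED silently replaces whatever was mapped there before. */
void *commit_chunk(void *base, size_t offset, size_t len)
{
    void *want = (char *)base + offset;
    void *got = mmap(want, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (got == MAP_FAILED)
        return NULL;
    assert(got == want);  /* MAP_FIXED either maps at `want` or fails */
    return got;
}
```

Because the whole area already belongs to you, no library or other thread can have put a mapping in the middle of it, which is what makes MAP_FIXED safe here. A file mapping could be placed the same way by passing a real fd instead of MAP_ANONYMOUS.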

Now, arguably that (less common) use of mmap as just a way to carve out VM would have merited a flag ("MAP_DONTPOPULATE"), but it ends up being the other way around: if you actually want to populate the mapping, use the MAP_POPULATE flag.
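To see the difference for yourself, a small sketch along these lines maps some anonymous pages with and without MAP_POPULATE and asks mincore how many are resident right after the mmap call. The helper name resident_after_mmap is hypothetical, MAP_POPULATE is Linux-specific and best-effort, and what mincore reports can depend on the kernel and the environment, so treat the numbers as indicative only.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `npages` anonymous pages with `extra_flags`
 * (0 or MAP_POPULATE) and return how many of them mincore() reports as
 * resident immediately after mmap(), or -1 on error. */
long resident_after_mmap(int extra_flags, size_t npages)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || npages == 0)
        return -1;

    size_t len = npages * (size_t)page;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | extra_flags, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    long count = -1;
    unsigned char *vec = malloc(npages);  /* one status byte per page */
    if (vec && mincore(p, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < npages; i++)
            count += vec[i] & 1;          /* low bit set: page is resident */
    }

    free(vec);
    munmap(p, len);
    return count;
}
```

Comparing resident_after_mmap(0, 16) against resident_after_mmap(MAP_POPULATE, 16) shows whether the kernel did the population work up front or left it to the page fault path.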

Because the truly unusual case is that you care when the pages get populated at all. And no, "performance" is not really generally an issue in real life, because real life doesn't mmap something and then just touch the pages once, and in real life there are often advantages to delaying the page allocations (including latency, even if throughput drops).

Could we do random heuristics like "if it's size X, do Y"? Sure. But it's never really been a problem. We're good at populating memory. It's a very optimized path, and honestly, you aren't supposed to do mmap if you then just touch all the pages once. What would be the point?

It so happens that CPU security bugs kind of screwed with us lately, and made it much more expensive to do page faults, but it's also the case that the benchmark that Travis pointed at is exactly what you should not use mmap for - if you only touch things once, what are you doing?

Side note: populating dynamically has performance advantages too. One historical case is old Fortran code, which is written to not really have dynamic allocations. That was literally still a deal back when Linux started. So people would have these humongous (well, for the time - today they'd be considered tiny) static matrix allocations, and often use only a tiny tiny portion of it. So not pre-populating things was an enormous performance advantage, because you avoided doing the work for a large matrix that was never accessed.

Linus