Pre-populating anonymous pages

By: Brendan (btrotter.delete@this.gmail.com), June 9, 2019 2:29 am
Room: Moderated Discussions
Hi,

Linus Torvalds (torvalds.delete@this.linux-foundation.org) on June 8, 2019 11:43 am wrote:
> Brendan (btrotter.delete@this.gmail.com) on June 8, 2019 2:55 am wrote:
> >
> > a) When a process calls "mmap()" the kernel looks at various things (primarily how much physical
> > RAM is currently free, but also things like whether Meltdown mitigations are present, how fast/slow
> > swap space is, etc) to try to estimate the optimum number of pages to pre-populate.
>
> As Travis already pointed out, no this is not how it works.
>
> Linux never pre-populates mappings at all unless you ask for it (exception: special
> device mappings do tend to get created and fully populated at mmap time - think
> frame buffers etc where demand-paging just doesn't make a lot of sense).
>
> There are various reasons for that, some of them historical rather than technical, but
> perhaps the most obvious one is that it's actually somewhat common to never touch the
> actual pages at all, and pre-populating would very much be the wrong thing to do.
>
> "Why would you do such an insane thing?" I hear you say.
>
> The thing about mmap is that it's not just about mapping pages. In fact, you
> should think of it as mapping virtual memory first, and accessing the pages
> you mapped is just the common default action. But not the only action.
>
> Another use case is that you literally use mmap to first get a big chunk of virtual
> memory, and then - once you have reserved your VM area - you re-populate it (or unmap)
> it in chunks using MAP_FIXED within the area you carved out for yourself.
>
> The reasons can vary: you may need particular alignment over and beyond the page size, or you may just need
> to map things consecutively in the virtual memory area and cannot allow holes or other mappings in between.
>
> So then the "carve out VM using mmap, manage it using overmapping with MAP_FIXED" is the
> only way to set up VM mappings in user space if you have particular layout requirements (you
> could use MAP_FIXED _without_ carving out a VM area for yourself, but then you'd have to
> know exactly what the virtual memory layout is, which isn't realistic in the presence of
> dynamically linked libraries, threads doing their own mmap at the same time, etc etc).
>
> Now, arguably maybe that (less common) use of mmap as just a way to carve would have
> maybe merited a flag ("MAP_DONTPOPULATE"), but it ends up being the other way around:
> if you actually want to populate the mapping, use the MAP_POPULATE flag.
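
A minimal sketch of the "carve out, then overmap with MAP_FIXED" pattern described in the quote above, used here to get an alignment larger than the page size. The helper name map_aligned is hypothetical, and reserving with PROT_NONE and MAP_NORESERVE is an assumption (the quote doesn't say which protection the reservation uses). It assumes Linux, "align" being a power of two no smaller than the page size, and "len" being a multiple of the page size.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical helper: returns a read/write anonymous mapping of `len` bytes
 * whose start address is a multiple of `align`, or NULL on failure. */
void *map_aligned(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    /* Step 1: carve out address space only. PROT_NONE is an assumption;
     * nothing is populated and stray accesses to the reservation fault. */
    size_t span = len + align;
    char *area = mmap(NULL, span, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (area == MAP_FAILED)
        return NULL;

    /* Step 2: pick the first suitably aligned address inside the area. */
    uintptr_t start = ((uintptr_t)area + align - 1) & ~(uintptr_t)(align - 1);
    char *aligned = (char *)start;

    /* Step 3: overmap that range. MAP_FIXED can only replace our own
     * reservation here, never some other mapping in the process. */
    void *p = mmap(aligned, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED) {
        munmap(area, span);
        return NULL;
    }

    /* Step 4: hand back the unused head and tail of the reservation. */
    size_t head = (size_t)(aligned - area);
    size_t tail = span - head - len;
    if (head != 0)
        munmap(area, head);
    if (tail != 0)
        munmap(aligned + len, tail);
    return p;
}
```

For example, map_aligned(1 << 16, 1 << 21) would return a 64 KiB region starting on a 2 MiB boundary, released later with munmap(p, 1 << 16).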

Wait...

I assumed (incorrectly) that it isn't too horrible; Travis pointed out that it actually is a little horrible; and now you're pointing out that it's worse than Travis said, because "I don't intend to use this area at all (I'm planning to carve it up)" is conflated with "I might use this memory, but not soon"?

Should I assume that, because "mmap without MAP_POPULATE" is also used for "I don't intend to use this area at all until after it's carved up", buggy software that accidentally accesses an area it never intended to access won't get any indication of the bug (e.g. a SIGSEGV), and physical memory will be allocated instead?

> Because the truly unusual case is that you care when the pages get populated at all. And
> no, "performance" is not really generally an issue in real life, because real life doesn't
> mmap something and then just touch the pages once, and in real life there are often advantages
> to delaying the page allocations (including latency, even if throughput drops).
>
> Could we do random heuristics like "if it's size X, do Y"? Sure. But it's never really been
> a problem. We're good at populating memory. It's a very optimized path, and honestly, you aren't
> supposed to do mmap if you then just touch all the pages once. What would be the point?

A software developer writes some software that uses "mmap()" to get an area of memory, and many thousands of people use that software. The developer can't know how many of those users have 1 GiB of RAM and how many have 1 TiB; can't know how much memory other processes are using initially (or if/when that will change while the software runs, or if/when users will suddenly find themselves in "swap thrashing hell"); and typically doesn't even know which operating system the software will run on (heck, maybe someone ports it via DJGPP and it ends up running on MS-DOS). Faced with all of this impossible-to-know information (and relying primarily on gut instinct combined with "whatever works for me, screw everyone else" logic), the developer decides whether they feel like setting MAP_POPULATE or not; and then Linux treats this "guaranteed wrong some of the time" flag as an "all or nothing holy grail that must be upheld at all costs".

Hopefully you see how someone might get the impression that this isn't an ideal scenario (and is something that has the potential of being improved with some heuristics, possibly starting with "if there's already a pattern of consecutive writes to this area, and if there's lots of free physical RAM not being used to improve performance, maybe the risk of having to de-populate later is small enough to justify populating some extra pages while I'm already in the page fault handler").
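
To make the kind of heuristic I mean concrete, here's a rough sketch. Everything in it is hypothetical: the function name, the inputs, and the thresholds (4 consecutive faults, half of RAM free, a 64-page cap) are made up for illustration and none of it is taken from the Linux kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heuristic: how many EXTRA pages to populate while already
 * inside the page fault handler. All names and thresholds are illustrative. */
size_t extra_pages_to_populate(size_t consecutive_faults,
                               size_t free_pages,
                               size_t total_pages,
                               size_t pages_left_in_mapping)
{
    assert(free_pages <= total_pages);

    /* No established pattern of consecutive writes: do the minimum. */
    if (consecutive_faults < 4)
        return 0;

    /* Less than half of physical RAM free: extra pages are more likely to
     * need de-populating later, so don't take the risk. */
    if (total_pages == 0 || free_pages < total_pages / 2)
        return 0;

    /* Start at 1 extra page and double for each consecutive fault beyond
     * the fourth, capped at 64 pages. */
    size_t window = 1;
    for (size_t i = 4; i < consecutive_faults && window < 64; i++)
        window *= 2;

    /* Never populate past the end of the mapping. */
    if (window > pages_left_in_mapping)
        window = pages_left_in_mapping;
    return window;
}
```

With these made-up numbers, a process that has just taken its sixth consecutive fault on a machine with 90% of RAM free would get 4 extra pages populated, and the same process on a machine with 10% of RAM free would get none.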

> It so happens that CPU security bugs kind of screwed with us lately, and made it much more expensive
> to do page faults, but it's also the case that the benchmark that Travis pointed at is exactly
> what you should not use mmap for - if you only touch things once, what are you doing?

That benchmark shows the cost of accessing the memory the first time. If the memory is accessed millions of times it would still have the same "first access" costs; so that benchmark is probably (at least slightly) relevant for every process in user-space.
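
For anyone who wants to see the "first access" cost on their own machine, here's a minimal sketch (not the benchmark Travis pointed at; the helper names map_anon and touch_pages_ns are my own). It assumes Linux, and it only measures; the numbers will vary with kernel version, mitigations and hardware.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/mman.h>

/* Map `len` bytes of anonymous memory, optionally with MAP_POPULATE.
 * Returns NULL on failure. */
void *map_anon(size_t len, int populate)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS | (populate ? MAP_POPULATE : 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Write one byte to every page in [base, base + len) and return the
 * elapsed wall-clock time in nanoseconds. */
int64_t touch_pages_ns(volatile char *base, size_t len, size_t page_size)
{
    struct timespec t0, t1;

    assert(page_size > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t off = 0; off < len; off += page_size)
        base[off] = 1;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000
           + (t1.tv_nsec - t0.tv_nsec);
}
```

Calling touch_pages_ns() twice on a region from map_anon(len, 0) gives the first-touch time (page faults included) and then the already-populated time; doing the same on a region from map_anon(len, 1) moves the population cost into the mmap() call itself.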

> Side note: populating dynamically has performance advantages too. One historical case is old Fortran code,
> which is written to not really have dynamic allocations. That was literally still a deal back when Linux started.
> So people would have these humongous (well, for the time - today they'd be considered tiny) static matrix
> allocations, and often use only a tiny tiny portion of it. So not pre-populating things was an enormous performance
> advantage, because you avoided doing the work for a large matrix that was never accessed.

Sure; not pre-populating can have performance advantages (e.g. if it's unlikely to be used and there isn't much/any free physical RAM); but for the exact same software under different conditions pre-populating can have performance advantages instead (e.g. if there's a huge amount of free physical RAM).

Let's put it another way...

What would happen if a new "MAP_RESERVED" flag was added (giving people who only want to split the area up later the option of getting better "crash on accidental access" behavior), and the existing "MAP_POPULATE" flag was expanded into a 3-bit field representing how aggressively the kernel should populate, where "POP_AGGRESSION_0" is a synonym for the existing "don't populate" and "POP_AGGRESSION_7" is a synonym for the existing "MAP_POPULATE"? All existing software would continue to behave exactly as it does now, but new/future software (and existing software after some tweaks) would be able to use "POP_AGGRESSION_1" to "POP_AGGRESSION_6" to influence how the kernel estimates the right thing to do based on actual current conditions, so that software isn't stuck with "There are various reasons, some of them historical rather than technical" forever. Excluding developer time/code maintenance, would there be any downside?
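
As a rough sketch of what that 3-bit field might look like: none of these constants or functions exist in Linux, the bit position is arbitrary, and the "scale with free memory" policy for levels 1 to 6 is just one made-up possibility. For simplicity the field is shown as a separate set of bits rather than as an extension of the real MAP_POPULATE bit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoding; the shift value is arbitrary and illustrative. */
#define POP_AGGRESSION_SHIFT 28
#define POP_AGGRESSION_MASK  (0x7u << POP_AGGRESSION_SHIFT)
#define POP_AGGRESSION(n)    (((unsigned)(n) & 0x7u) << POP_AGGRESSION_SHIFT)

/* Extract the aggression level (0..7) from a flags word. */
unsigned pop_aggression_of(unsigned flags)
{
    return (flags & POP_AGGRESSION_MASK) >> POP_AGGRESSION_SHIFT;
}

/* Hypothetical policy: upper bound on pages to pre-populate.
 * Level 0 behaves like "don't populate", level 7 like MAP_POPULATE, and
 * levels 1..6 allow level/8 of the currently free pages. */
size_t populate_budget(unsigned level, size_t mapping_pages, size_t free_pages)
{
    assert(level <= 7);
    if (level == 0)
        return 0;
    if (level == 7)
        return mapping_pages;

    size_t budget = free_pages / 8 * level;
    return budget < mapping_pages ? budget : mapping_pages;
}
```

The point of the two fixed end points is that existing software keeps today's behavior, while the middle levels only ever act as a hint that the kernel weighs against current conditions.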

- Brendan