By: Travis Downs (travis.downs.delete@this.gmail.com), June 17, 2019 9:18 am
Room: Moderated Discussions
Linus Torvalds (torvalds.delete@this.linux-foundation.org) on June 8, 2019 11:43 am wrote:
> ... honestly, you aren't supposed to do mmap if you then just touch all the pages once.
> What would be the point?
>
> It so happens that CPU security bugs kind of screwed with us lately, and made it much more expensive
> to do page faults, but it's also the case that the benchmark that Travis pointed at is exactly
> what you should not use mmap for - if you only touch things once, what are you doing?
Indeed, touching once is almost never a valid use case. The actual case is often touching it twice: once to write it, and then once to read it. And yes, it's not one byte per page but usually the entire page.
Think of a short-lived process that allocates a huge buffer, transcodes/compresses/sorts/whatever something into that buffer, then writes it out.
Even with Spectre and friends, taking a fault on every page is still fairly fast: I measure a bit over 1,400 ns per 4k page, which corresponds to ~2.8 GB/s when just faulting in pages. Most transcoding/compression/whatever runs at much less than 2.8 GB/s, so this usually doesn't matter. However, I have some stuff that runs faster than that, and there the 2.8 GB/s limit matters. Before Meltdown the "fault in" speed limit used to be closer to 10 GB/s, so I guess the fault time used to be closer to 400 ns. That is roughly consistent with, if a bit slower than, the syscall slowdown I observed after the Spectre and Meltdown mitigations.
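For anyone who wants to reproduce the arithmetic, here is a minimal sketch (Linux, assuming 4k base pages) of how a per-page fault cost can be measured and turned into a bandwidth figure. The function names are made up for illustration, and this is not the benchmark that produced the numbers above. Since bytes per nanosecond is the same as GB/s, 4096 bytes in 1,400 ns works out to about 2.9 GB/s, and 4096 bytes in 400 ns to about 10 GB/s.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Bandwidth implied by a per-page fault cost: bytes per ns equals GB/s. */
double fault_bandwidth_gbps(size_t page_bytes, double ns_per_page)
{
    assert(ns_per_page > 0.0);
    return (double)page_bytes / ns_per_page;
}

/* Map `len` bytes of anonymous memory (assumed to be a multiple of the page
 * size), write one byte per page, and return the average cost per page in
 * nanoseconds, or -1.0 on failure. */
double measure_fault_ns_per_page(size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    if (len < page)
        return -1.0;

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED)
        return -1.0;

    /* Ask for base pages so transparent huge pages don't hide the per-page
     * cost; this is only advice, so a failure is ignored. */
    (void)madvise(map, len, MADV_NOHUGEPAGE);

    volatile unsigned char *buf = map;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t off = 0; off < len; off += page)
        buf[off] = 1;               /* first write to each page takes a fault */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    munmap(map, len);
    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)(len / page);
}
```

Feeding the measured value into `fault_bandwidth_gbps(4096, ns)` gives the "fault in" speed limit for that machine; results will vary a lot with kernel version and mitigation settings.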
In any case, madvise(MADV_HUGEPAGE) is the way to go when I can do it: the implied fault-in speed is then over 300 GB/s (the true speed is probably even higher, since at this point even touching one byte per 4k page adds to the cost)!
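A minimal sketch of what that looks like on Linux, assuming a kernel with transparent huge pages available. The helper name is hypothetical. The hint is only advice: `madvise` can fail with `EINVAL` on kernels built without THP, and the kernel only backs suitably aligned, huge-page-sized parts of the range with huge pages, so the helper reports whether the hint was accepted rather than treating a refusal as fatal.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: anonymous mapping with a transparent-huge-page hint.
 * Returns NULL if the mapping itself fails. On success, *hinted is set to 1
 * if the kernel accepted MADV_HUGEPAGE and to 0 otherwise. */
void *map_with_hugepage_hint(size_t len, int *hinted)
{
    assert(len > 0 && hinted != NULL);

    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    /* Advice only: the mapping is still usable if this call fails. */
    *hinted = (madvise(buf, len, MADV_HUGEPAGE) == 0);
    return buf;
}
```

The caller releases the region with `munmap(buf, len)` as usual. With 2 MiB huge pages (the x86-64 size, an assumption here), 300 GB/s would correspond to roughly 7 microseconds per huge-page fault.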
Just to anticipate the complaint: yes, the application can often be written to be smarter. Allocating a smaller buffer and periodically flushing it out has several benefits, only one of which is having to fault in many fewer pages.
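The "smarter" structure could look roughly like the sketch below: one small buffer is reused for every chunk, so its pages are faulted in once and stay warm. The function name is hypothetical, the `memset` is only a stand-in for the real transcode/compress step, and retrying on `EINTR` is left out for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Produce `total` bytes through a reusable `chunk`-byte buffer, flushing each
 * chunk to `fd` as soon as it is filled. Returns the number of bytes written,
 * or -1 on error. */
long long write_in_chunks(int fd, size_t total, size_t chunk)
{
    assert(chunk > 0);

    unsigned char *buf = malloc(chunk);
    if (buf == NULL)
        return -1;

    size_t done = 0;
    while (done < total) {
        size_t n = (total - done < chunk) ? total - done : chunk;
        memset(buf, 0xAB, n);        /* stand-in for the real work */

        size_t off = 0;
        while (off < n) {            /* write() may accept only part of the data */
            ssize_t w = write(fd, buf + off, n - off);
            if (w < 0) {
                free(buf);
                return -1;
            }
            off += (size_t)w;
        }
        done += n;
    }

    free(buf);
    return (long long)done;
}
```

Only `chunk` bytes of buffer are ever touched, no matter how large `total` is, which is where the saving in page faults comes from.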