By: James (no.delete@this.thanks.invalid), May 7, 2013 5:18 am
Room: Moderated Discussions
anon wrote:
> Modern Linux kernels are able to use 2MB pages transparently
> to the user process, merging and breaking pages
> as needed. Right now, if you boot a modern kernel, almost
> all anonymous memory that was requested in chunks of
> over 2MB is backed by 2MB pages, and this will likely be
> expanded to different classes of memory over time.
>
> doc: https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/tree/Documentation/vm/transhuge.txt?id=refs/tags/next-20130507
none asked:
> Thanks. Has there been some studies on the impact in real life?
http://www.linux-kvm.org/wiki/images/9/9e/2010-forum-thp.pdf -- a pre-merge justification by Andrea Arcangeli, who wrote the feature.
He measured up to a 25% improvement on memory-heavy server benchmarks in a virtualised system where both host and guest used transparent hugepages. That was on an AMD system where all the TLB entries could hold 2MB pages.
He measured a 2.5% improvement on a bare-metal GCC build (with a small tweak to GCC's allocation, since GCC didn't go through the glibc allocator). He suggested this was a "worst case", but again, this was on AMD.
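If you want to see whether your own box is actually doing this, here is a rough Python sketch. It assumes the sysfs knob and the AnonHugePages line in /proc/meminfo described in the transhuge.txt doc linked above; the function names are mine, made up for illustration, and the files may be absent on kernels built without the feature.

```python
import re
from pathlib import Path

# Paths as described in Documentation/vm/transhuge.txt; they may not
# exist on kernels built without transparent hugepage support.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")
MEMINFO = Path("/proc/meminfo")


def parse_thp_mode(text):
    """Return the bracketed (active) choice from e.g. '[always] madvise never'."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def parse_anon_hugepages_kb(meminfo_text):
    """Return the AnonHugePages value in kB, or None if the line is missing."""
    for line in meminfo_text.splitlines():
        if line.startswith("AnonHugePages:"):
            return int(line.split()[1])
    return None


def report():
    """Print the current THP mode and how much anonymous memory is huge-page backed."""
    try:
        mode = parse_thp_mode(THP_ENABLED.read_text())
        anon_kb = parse_anon_hugepages_kb(MEMINFO.read_text())
    except OSError:
        print("transparent hugepage info not available on this system")
        return
    print("THP mode:", mode)
    print("AnonHugePages (kB):", anon_kb)


if __name__ == "__main__":
    report()
```

A non-zero AnonHugePages figure means some anonymous memory is currently backed by huge pages. It says nothing about how much that helps a given workload; for that you still have to benchmark.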