By: Linus Torvalds (torvalds.delete@this.osdl.org), May 13, 2007 2:46 pm
Room: Moderated Discussions
Tzvetan Mikov (tzvetanmi@yahoo.com) on 5/13/07 wrote:
>
>Can you please elaborate on the practical 1 GB limit a
>little further ?
Since the i386 has a 32-bit total limit on virtual
address space (the segmentation doesn't extend that, it
all works within the 4GB limit), and since user space needs
to be mapped in that area too, you effectively have the
requirement that
- user space virtual memory + kernel virtual memory = 4GB
Some other architectures effectively have separate virtual
memory spaces for user and kernel space, so they don't
necessarily need to be 4GB combined, they can be separately
4GB each. On x86, that isn't practical (you can emulate it
by switching page tables around a lot, and yes, it's been
done, but despite the fact that Intel makes TLB lookups
very fast, it's quite nasty for performance!)
And the thing is, user space really does want its own
virtual memory too. Under Linux, you can choose how to
split up the 4GB address space, but on the whole, you'd
actually prefer to give user space as much as possible,
since
- user space tends to use reasonably sparse virtual
address spaces (loadable libraries in one place, code,
data, stack in other areas etc)
- quite often, you have one "major" app, which wants to
use as much of the physical memory as possible, eg a
database, so you actually want user space to map as much
of the physical memory too!
So Linux defaults to giving user space 3GB, and just 1GB
for the kernel. That's basically the "user space is more
important in the end" version. I think WinXP does a 2:2GB
split.
So basically, the kernel really never has 4GB of virtual
memory. It has 1GB (possibly 2GB).
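The split arithmetic can be sketched in a few lines. This is only an illustration of the "user + kernel = 4GB" constraint described above; the function name `split` is made up for the example, and it assumes the kernel simply takes whatever sits above the user portion.

```python
GIB = 1 << 30
TOTAL_VIRTUAL = 1 << 32  # 32-bit virtual address space: 4GB in total


def split(user_gib):
    """Return (first kernel virtual address, kernel virtual bytes) for a given user share."""
    user_bytes = user_gib * GIB
    kernel_base = user_bytes                  # kernel starts where user space ends
    kernel_bytes = TOTAL_VIRTUAL - user_bytes
    return kernel_base, kernel_bytes


for user_gib in (3, 2):
    base, size = split(user_gib)
    print(f"{user_gib}:{4 - user_gib} split -> kernel starts at {base:#010x}, has {size // GIB}GB")
# 3:1 split -> kernel starts at 0xc0000000, has 1GB
# 2:2 split -> kernel starts at 0x80000000, has 2GB
```

Whatever the kernel gains in a different split, user space loses, which is why the choice is a trade-off rather than a fix.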
It also needs to map in PCI memory-mapped stuff etc, so in
practice, it's actually less than that. And then the kernel
will need some more virtual memory for its own use, e.g.
things like loading modules, and just for the window it uses
to reach physical addresses beyond what it can map directly.
So in practice, in Linux, when you had more than about
920MB (I forget the exact number), you ended up maxing out
what the kernel could map directly.
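A rough sketch of where that ceiling comes from, under stated assumptions: the 128MB reserve below is an assumed figure for the virtual space held back for modules, device mappings and the high-memory window (it is not taken from the post), and with it the direct-map limit comes out at 896MB, in the same ballpark as the "about 920MB" quoted above.

```python
MIB = 1 << 20
KERNEL_VIRTUAL = 1024 * MIB    # kernel's share under the default 3:1 split
RESERVED_VIRTUAL = 128 * MIB   # ASSUMPTION: held back for modules, PCI mappings, the window


def directly_mappable(ram_bytes):
    """Split physical RAM into (directly mapped, high memory) under the assumed reserve."""
    limit = KERNEL_VIRTUAL - RESERVED_VIRTUAL
    low = min(ram_bytes, limit)
    return low, ram_bytes - low


for ram_mib in (512, 1024, 2048, 4096):
    low, high = directly_mappable(ram_mib * MIB)
    print(f"{ram_mib:5d}MB RAM: {low // MIB}MB direct, {high // MIB}MB high")
```

Under these assumptions a machine with exactly 1GB of RAM already has some memory the kernel cannot keep permanently mapped.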
And once you max that out, think about what happens for
a moment...
Once you start having to access physical memory through
some kind of HIGHMEM.SYS window (which is what PAE is),
and cannot map it all, you can no longer keep normal
pointers to such memory around. Instead, you are basically
using a really ugly and strange segmented architecture,
where you keep some kind of indirect pointer, and every
time before you use it, you have to map it into the virtual
address space, access it, and then unmap it again.
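The map, access, unmap cycle can be modelled with a toy example. Everything here is hypothetical (the `Window` class, the page numbers, the two-slot window); it is not kernel code, just a sketch of why an indirect reference plus a small window makes every access more expensive than following a normal pointer.

```python
class Window:
    """Toy model: a few virtual slots through which high physical pages are reached."""

    def __init__(self, slots):
        self.slots = [None] * slots   # slot index -> physical page number, or None if free
        self.maps = 0                 # count of map operations performed

    def map(self, page_number):
        slot = self.slots.index(None)  # raises ValueError if the window is full
        self.slots[slot] = page_number
        self.maps += 1
        return slot

    def unmap(self, slot):
        self.slots[slot] = None


def read_byte(window, physical_memory, page_number, offset):
    """Access through an indirect reference: map, touch, unmap, every single time."""
    slot = window.map(page_number)
    try:
        return physical_memory[window.slots[slot]][offset]
    finally:
        window.unmap(slot)


# Hypothetical "high" physical pages, keyed by page number.
physical_memory = {300000: b"hello", 300001: b"world"}
window = Window(slots=2)
data = bytes(read_byte(window, physical_memory, 300000, i) for i in range(5))
print(data, window.maps)  # five one-byte reads cost five map operations
```

In the Linux kernel the corresponding calls are `kmap()` and `kunmap()`; the model above only mimics their shape, not their implementation.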
So performance plummets, and the code actually gets really
nasty too, so you don't actually use the high memory for
any random data, you only use it for special stuff. As
an example, you'd use it for disk caching (that's what 90%
of all HIGHMEM.SYS usage was too - a lot of people just
set it all - or at least a big chunk of it - aside as a
harddisk cache, because so few programs could use it very
well for anything else).
Linux used it for more than disk caching, but the point
is, since it's special memory that you have to access
through a small window, or can just map into user space,
it's not generally usable any more.
And yes, there were serious problems. In theory, you
can have 64GB of RAM with Linux on a PAE x86 box. In
practice, it seldom worked very well past the 4GB mark,
so PAE itself was almost totally useless. The reason was
that once you had more than 4GB of memory, you usually
had filled out a large chunk of the easily accessible
memory with just all the data structures to keep track
of the rest of memory (that's exaggerated, but it's not
entirely off).
So you literally were in this situation where you had
tons of memory that could be used for some things,
but you ran out of memory for other basic bookkeeping
stuff.
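A back-of-the-envelope version of that bookkeeping problem, with loudly assumed numbers: 4KB pages, 32 bytes of kernel bookkeeping per physical page, and 896MB of directly mappable memory. None of these figures come from the post; they are only there to show how the cost scales with RAM while the easily accessible memory stays fixed.

```python
PAGE_SIZE = 4096
PER_PAGE_BYTES = 32         # ASSUMPTION: bookkeeping kept per physical page
LOWMEM = 896 * (1 << 20)    # ASSUMPTION: directly mappable kernel memory
GIB = 1 << 30


def bookkeeping_bytes(ram_bytes):
    """Memory needed just to track every physical page, under the assumptions above."""
    return (ram_bytes // PAGE_SIZE) * PER_PAGE_BYTES


for ram_gib in (4, 16, 64):
    cost = bookkeeping_bytes(ram_gib * GIB)
    share = 100 * cost / LOWMEM
    print(f"{ram_gib:2d}GB RAM: {cost >> 20}MB of page bookkeeping, {share:.0f}% of low memory")
```

The per-page tracking is only one of the structures that must live in directly mapped memory, so the real pressure would be higher than this sketch suggests.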
So some people used 16GB of RAM, and by limiting their
working set to cases where the HIGHMEM stuff was fine,
and they didn't need a whole lot of other memory, they
made it work. But the code was (and is) grotty and pretty
disgusting, and performs much worse than just allowing
anything to use any of the RAM.
And the limit literally isn't the 4GB mark. The pain starts
at around 1GB, where the kernel can no longer map all of
memory directly.
>So, in practical terms, is there benefit from using
>64-bit OS on a machine with, lets say, 2-3 GB RAM ?
Absolutely. No question whatsoever. If you have 2GB
of RAM on a 64-bit architecture, you can access it all
easily in the kernel, and none of it is limited to just
some special use. And a single big process can also use
it effectively at the same time, so a database can
actually have it all mapped without having to play
windowing games in user space either.
Linus