By: nick (anon.delete@this.anon.com), May 15, 2006 5:33 pm
Room: Moderated Discussions
Brendan (btrotter@gmail.com) on 5/15/06 wrote:
---------------------------
>>However, your 1% system may need to enter the kernel much
>>more *frequently*, in which case the cacheline contention
>>might be as bad or worse.
>
>I'd suggest that it doesn't matter if that 1% is caused by entering the kernel
>10 times (at 0.1% per time) or if it's caused by entering the kernel twice (at 0.5%
>per time) - it's still 1% regardless of how frequently you enter the kernel.
Not if each time you enter the kernel you send a message.
Most of the cache misses won't get eaten by the kernel, but
by the server that is reading the messages off the queue.
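To make that concrete, here's a minimal sketch (my own
illustration, not Brendan's actual design) of a single-
producer/single-consumer message queue in shared memory; the
names and sizes are made up. The point is that every cacheline
the sender dirties has to be pulled across to the receiving
CPU when the server drains the queue, so the miss cost lands
in the server, not in the kernel:

#include <stdatomic.h>

#define SLOTS 64

struct msg { char payload[64]; };        /* one cacheline per message */

struct queue {
    _Atomic unsigned head;               /* advanced by the sender */
    _Atomic unsigned tail;               /* advanced by the server */
    struct msg slot[SLOTS];
};

/* Client side: what runs after entering the kernel to send. */
int send_msg(struct queue *q, const struct msg *m)
{
    unsigned h = atomic_load(&q->head);
    if (h - atomic_load(&q->tail) == SLOTS)
        return -1;                       /* queue full */
    q->slot[h % SLOTS] = *m;             /* dirties the slot's line */
    atomic_store(&q->head, h + 1);       /* dirties the head's line */
    return 0;
}

/* Server side: if the sender ran on another node, the loads
   below miss and pull the lines across the interconnect. */
int recv_msg(struct queue *q, struct msg *out)
{
    unsigned t = atomic_load(&q->tail);
    if (t == atomic_load(&q->head))
        return -1;                       /* queue empty */
    *out = q->slot[t % SLOTS];           /* miss: slot line is remote */
    atomic_store(&q->tail, t + 1);
    return 0;
}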
>
>What you might be trying to say is that the CPU might actually spend 2% of its
>time in kernel space because the kernel is entered more often, in which case you'd
No.
>be right (but I don't see how this makes a fundamental difference). In general,
>if the time spent inside a micro-kernel is equal to the amount of time spent inside
>a monolithic kernel, then either the micro-kernel designer needs to be shot (or
>the monolithic kernel designer deserves several awards for outstanding achievements).
>
Of course, but the micro-kernel's real cost obviously
includes the time spent in the servers too (e.g. the
number of cache misses they take); and considering
ukernels are no faster than monolithic kernels, the
combination of kernel + servers is going to cost at
least as much as a monolithic kernel.
>>Well, are you replicating the text of your servers?
>
>No - they are independent processes that use CPU affinity to ensure that they are always run on the same NUMA domain...
>
So you don't use shared libraries, or share program text?
OK, now you've lost the same amount of memory as Linux with
text replication.
Do you use threads of a single memory space running on
different nodes?
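For reference, "CPU affinity per NUMA domain" on Linux looks
something like the sketch below, using libnuma (this shows the
generic technique, not your kernel's API; node 0 is an arbitrary
choice). Note it only localizes *private* pages - shared library
text mapped by other processes stays wherever it was first
faulted in, which is exactly the replication question above:

/* build with: gcc affinity.c -lnuma */
#include <numa.h>
#include <stdio.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this system\n");
        return 1;
    }

    int node = 0;                  /* arbitrary target domain */
    numa_run_on_node(node);        /* schedule only on node 0's CPUs */
    numa_set_preferred(node);      /* prefer node 0 for new pages */

    /* Explicitly place a buffer on node 0 and touch it so the
       pages are actually allocated locally. */
    char *buf = numa_alloc_onnode(1 << 20, node);
    if (buf == NULL)
        return 1;
    buf[0] = 1;
    numa_free(buf, 1 << 20);
    return 0;
}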
>>Considering that nobody in Linux even cares that much
>>about it except the guys with 1024 CPU systems, I'm
>>guessing it is completely unmeasurable on your kernel
>>(outside microbenchmarks, maybe). :-)
>
>Given that both AMD and Intel are increasing the number of cores rather than increasing
>core frequency (and that I predict Intel will be shifting to something like HyperTransport/NUMA
>in the near future), the number of people who care about it is probably going to
>increase a lot by the time it matters to me.
>
Wrong. The number of cores has nothing to do with it, and
desktops/workstations/small servers will never care much
about NUMA issues because there just aren't enough sockets
to make a difference. The improvement on even an 8-socket
Opteron is probably unmeasurable on Linux, for example.
The systems I'm talking about have local/remote latency
ratios of 10:1, and going from one end of the interconnect
to the other takes ~8 router hops over probably 20 or more
meters.
*Those* guys are just starting to care about it a little
bit. And not so much because the slowdown is noticeable
for the nodes taking icache faults from remote memory, but
because the combined effect of all of them saturates
node0's interconnect.
>My work consists of a series of prototypes, where each prototype builds on the
>last. The newest prototype uses a "modular micro-kernel", is 32-bit and 64-bit and
>is designed to scale to large NUMA systems. I've basically reached the end of the
>series of prototypes (there's nothing left to add and the worst of the bottlenecks
>are gone). With some luck, my current prototype will become the basis for an OS.
>I'm expecting it to take another 3 years before I've got a bare working system running
>on legacy hardware, but it's too different to port applications (or drivers) to
>it and it'll probably take 10 years or more before it's actually usable. I knew
>this before I started, which is why I've spent so much time making sure the kernel design is "right".
>
>Anyway, real-world benchmarks (like comparing web server and database performance) are a long way off...
So how do you know you've done it right? Are you designing
based on assumptions, or real testing? Is this work public?
Kudos for trying, but it still doesn't sound convincing.
K42 claims to be a microkernel, and occasionally they get
really excited about finding somewhere that Linux doesn't
scale too well, and beating it. Which obviously turns out
to be a place that nobody ever cares about anyway.
>
>>I'd wager that passing messages across interconnect would
>>be more interesting...
>
>Passing messages between NUMA domains will be slower, but it's designed for passing
>messages across a LAN so I doubt NUMA domain boundaries are going to matter much in comparison.
Yeah, if it is so heavyweight that you don't notice these
cache misses, then it doesn't sound like it's appropriate
for closely coupled NUMA interconnects.
---------------------------