By: Brendan (btrotter.delete@this.gmail.com), May 13, 2006 9:51 pm
Room: Moderated Discussions
Hi,
nick (anon@anon.com) on 5/13/06 wrote:
>Brendan (btrotter@gmail.com) on 5/13/06 wrote:
>>nick (anon@anon.com) on 5/13/06 wrote:
>
>>>Great now your scalability sucks, and you can do this with
>>>a monolithic kernel anyway as Linus points out.
>>
>>The message queues are only locked when messages are being added or removed. I
>>guess it does depend on how the messaging is done - synchronous messaging would probably suck for scalability.
>
>No, it would suck. You're about 10 years behind: lock
>contention doesn't hurt scalability any more, cacheline
>contention does.
Am I?
For my last prototype, each thread has a "thread control block" containing its message queue lock (one queue per thread). While each lock wasn't alone in its own 128-byte area (Intel's recommendation), the data around it was deliberately arranged so that only rarely accessed things are in the same area as the lock. The chance of 2 or more CPUs trying to read or write any of the items in the same cache line as the lock (including the lock itself) is very small. This could have been improved (every lock in its own 128-byte area), but the extra memory cost wasn't worth the negligible gain.
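To make the layout concrete, here is a rough sketch in C11 of what I mean. It is not the prototype's actual code: the field names, the message type and the helper functions are all made up for illustration, and the only things taken from the description above are the per-thread queue lock, the 128-byte area, and the rule that only rarely accessed data shares the lock's area.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_AREA 128 /* the 128-byte area size mentioned above */

/* Hypothetical message type, for illustration only. */
struct message {
    struct message *next;
    int payload;
};

/* Hypothetical thread control block; every field name is an assumption. */
struct tcb {
    /* Area 1: the queue lock plus rarely accessed fields. */
    alignas(LOCK_AREA) atomic_flag queue_lock;
    uint32_t thread_id;      /* written once at creation */
    uint64_t creation_time;  /* written once at creation */
    char name[32];           /* only read by debugging tools */

    /* Area 2: frequently accessed fields, kept out of the lock's area. */
    alignas(LOCK_AREA) struct message *queue_head;
    struct message *queue_tail;
};

static_assert(offsetof(struct tcb, queue_lock) % LOCK_AREA == 0,
              "the lock starts its own area");
static_assert(offsetof(struct tcb, queue_head) % LOCK_AREA == 0,
              "frequently accessed fields start a new area");

static void tcb_init(struct tcb *t, uint32_t id) {
    atomic_flag_clear(&t->queue_lock);
    t->thread_id = id;
    t->creation_time = 0;
    t->name[0] = '\0';
    t->queue_head = NULL;
    t->queue_tail = NULL;
}

/* Simple spinlock; a real kernel would likely add a pause/backoff. */
static void tcb_lock(struct tcb *t) {
    while (atomic_flag_test_and_set_explicit(&t->queue_lock,
                                             memory_order_acquire)) {
        /* spin */
    }
}

static void tcb_unlock(struct tcb *t) {
    atomic_flag_clear_explicit(&t->queue_lock, memory_order_release);
}

/* The lock is held only while a message is being added... */
static void tcb_enqueue(struct tcb *t, struct message *m) {
    m->next = NULL;
    tcb_lock(t);
    if (t->queue_tail)
        t->queue_tail->next = m;
    else
        t->queue_head = m;
    t->queue_tail = m;
    tcb_unlock(t);
}

/* ...or removed. Returns NULL when the queue is empty. */
static struct message *tcb_dequeue(struct tcb *t) {
    tcb_lock(t);
    struct message *m = t->queue_head;
    if (m) {
        t->queue_head = m->next;
        if (!t->queue_head)
            t->queue_tail = NULL;
    }
    tcb_unlock(t);
    return m;
}
```

Whether the queue pointers belong next to the lock or in a separate area is a judgment call; the sketch keeps them apart only to show the "rarely accessed data around the lock" arrangement.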
>>>>Also, a micro-kernel is often small enough that the equivalent of a "big kernel
>>>>lock" is practical, meaning you could have one lock in the entire system in addition
>>>>to any "processes specific" locks, and no other locks.
>>>
>>>I don't know what systems that thing would run on, but it
>>>wouldn't be any big ones, that's for sure.
>>
>>That depends what the OS is designed for and how fast the slowest kernel function is.
>
>The microkernel handles scheduler, interrupts, IPC, and
>queueing messages to servers? And it has a big kernel lock?
>Then scalability sucks, full stop.
If the OS is designed for UP and SMP with 4 or fewer CPUs, and if the slowest kernel function is relatively fast, then it may be worth sacrificing a tiny amount of performance for simplified locking.
Given that (IIRC) Linux itself used a "big kernel lock" and nothing else for version 2.0, and that this lock is still used (although for a lot less than it originally was), and that the slowest kernel function within Linux 2.0 would've been much longer, I don't think you're in a position to argue too much.
It's just one of many compromises - scalability vs. developer time in this case.
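For anyone unsure what "simplified locking" amounts to here, a minimal sketch in C11 follows. It is an illustration under my own assumptions, not code from any real kernel: `kernel_enter`, `kernel_exit`, `sys_example` and the counter are hypothetical names. The point is only that every kernel function takes the same single lock on entry and drops it on exit, so nothing inside the kernel needs a lock of its own.

```c
#include <stdatomic.h>

/* The one lock for the whole kernel (hypothetical name). */
static atomic_flag big_kernel_lock = ATOMIC_FLAG_INIT;

/* Example kernel state; protected by big_kernel_lock. */
static unsigned long kernel_entries;

/* Taken on every entry into the kernel. */
static void kernel_enter(void) {
    while (atomic_flag_test_and_set_explicit(&big_kernel_lock,
                                             memory_order_acquire)) {
        /* spin: another CPU is inside the kernel */
    }
}

/* Released on every exit from the kernel. */
static void kernel_exit(void) {
    atomic_flag_clear_explicit(&big_kernel_lock, memory_order_release);
}

/* Hypothetical kernel function: no finer-grained locking needed inside. */
static unsigned long sys_example(void) {
    kernel_enter();
    unsigned long n = ++kernel_entries;
    kernel_exit();
    return n;
}
```

The worst case for a CPU waiting on this lock is bounded by the slowest kernel function multiplied by the number of other CPUs, which is why the CPU count and the slowest function matter so much to whether the trade-off is acceptable.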
>>>>Another potential advantage is NUMA systems, where you can set the CPU affinity
>>>>on a device driver so that it's always run on a CPU that is "close" to the I/O controller
>>>>used by the device. This doesn't apply to things like
>>>
>>>No, this is not. Because if the data is coming from / going
>>>to somewhere off node, you must send it across nodes anyway.
>>>And now you *also* have control information (messages) going
>>>across nodes as well.
>>
>>For control information you've got a choice - access the control data across nodes,
>>or access the device driver code and its local state across nodes. Unless everything
>>(control data, code and state) is on the same node it can't be "perfect" regardless
>>of what you do (for all kernel designs).
>
>That isn't a choice, that is a restriction!
It's a hardware restriction that can't be avoided, where software developers can choose a way to attempt to minimize the "inter-node" access penalties. I will assume that for Linux, you end up with penalties when accessing device driver code and the device driver's local state (and no penalties when accessing the control data).
Cheers,
Brendan