Hybrid kernel, not NT

By: nick (anon.delete@this.anon.com), May 16, 2006 2:53 am
Room: Moderated Discussions
Brendan (btrotter@gmail.com) on 5/16/06 wrote:
>nick (anon@anon.com) on 5/15/06 wrote:

>>Do you use threads of a single memory space running on
>>different nodes?
>Yes, but I split user space into "process space" and "thread space", such that
>thread space can't be accessed from other threads. It's a little like "thread local
>data" in POSIX, only implemented so that separation is enforced. The disadvantage
>is that switching between threads that belong to the same process involves changing
>address spaces and is as expensive as switching between processes. The advantages
>are that security of the thread's local data is enforced, a thread's data doesn't
>suffer from cacheline bouncing (or "across NUMA node" access penalties if the process
>itself isn't tied to a specific NUMA node), the linear

So you migrate the page when the thread moves across CPUs?
You still have program text and the wider memory space
bouncing though.

AFAIK there has been work in the past on Linux (and probably
other OSes) to do things like NUMA page migration, and even
pagecache NUMA replication (for unmapped or readonly mapped
pages). I'm not sure of the current state of these.

>>Wrong. Number of cores has nothing to do with it, and
>>desktops/workstations/small servers will never care much
>>about NUMA issues because there just aren't enough sockets
>>to make a difference. Improvement on even an 8 socket
>>Opteron is probably unmeasurable on Linux, for example.
>For Opteron one hop is about 25% slower and 2 hops is about 50% slower. I couldn't
>find figures for 3 hops (which is necessary for 8 sockets when there's only 3 HyperTransport
>links and something needs to connect to an I/O hub), and the figures I did find
>vary a fair bit between different sources.

I didn't mean memory latency, obviously that is easily
measurable and relevant to real workloads. I was talking
about kernel text replication. icache is mostly very well
behaved (readonly, good locality, high frequency of use)
and pretty easy to prefetch.

>>The systems I'm talking about have local/remote latency
>>ratios of 10:1, and going from one end of the interconnect
>>to the other takes ~8 router hops over probably 20 or more
>>*Those* guys are just starting to care about it a little
>>bit. And not so much because the slowdown is noticeable for
>>the nodes taking icache faults from remote memory, but
>>because the combined effect of all of them saturates node0's
>For Linux (IIRC) the kernel is loaded into the first 16 MB of physical memory which
>usually corresponds to node0. In this case node0 would have to cope with all accesses
>to the kernel's code and data (including device drivers, locks, etc) plus the traffic
>going to/from the first I/O controller. It would be a complete disaster - simply
>loading the kernel into the highest physical pages would alleviate the pressure
>on node0 (and possibly shift the pressure to a different node, although traffic
>to/from the first I/O controller hub wouldn't add to the pressure in this case).
>Unfortunately, I don't know enough about Linux's NUMA support and might be completely wrong.

Actually they run 512-CPU machines in production and it
doesn't appear to be a disaster at all. They do have CPUs
with 9MB of cache, so obviously it will hurt a lot more with
Opteron-sized caches... but instructions usually aren't a
huge bandwidth hog.

Hence my assertion that you're probably not going to be
able to measure it on a 4-node Opteron.
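
A rough back-of-the-envelope version of that argument, as a sketch only. The 50% figure is the two-hop Opteron penalty quoted above. The 1% stall fraction, the layout with kernel text on one of four nodes, and the function name are all assumptions made up for illustration, not measurements.

```python
def replication_gain(stall_fraction, remote_share, remote_penalty):
    """Estimate the run-time fraction saved by replicating kernel text.

    stall_fraction: share of run time spent stalled on kernel instruction
        fetches that miss the caches, if every such fetch were node-local.
    remote_share: share of those fetches that go to a remote node today.
    remote_penalty: extra cost of a remote fetch (0.5 means 50% slower).
    """
    # Time with all fetches local is normalized to 1.0; remote fetches
    # stretch only the stalled part of the run.
    time_without_replication = 1.0 + stall_fraction * remote_share * remote_penalty
    return 1.0 - 1.0 / time_without_replication


# Hypothetical inputs: 1% of time stalled on kernel icache misses,
# text on 1 of 4 nodes (so 3/4 of CPUs fetch remotely), 50% penalty.
gain = replication_gain(stall_fraction=0.01, remote_share=0.75, remote_penalty=0.5)
print(f"estimated gain from text replication: {gain:.2%}")
```

With inputs like these the estimated gain stays well under 1% of run time, which is easily lost in benchmark noise. A much larger stall fraction or a much steeper remote penalty changes the picture.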

>In general, fixing the worst performance problem tends to give you a new worst
>performance problem (that isn't quite as bad as the old worst performance problem).
>If you fix the pressure on node0 somehow, what would be the new worst performance problem for large NUMA systems?

Obviously it will vary a lot depending on workload and
system. But the node0 pressure thing has already been mostly
fixed by doing things like distributing data pages (hashes,
etc) across nodes or affine to their relevant device.
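
To illustrate the idea of spreading a big shared table across nodes, here is a toy model. It is not how any particular kernel implements it; the function names, the 1024-page table and the uniform access pattern are assumptions chosen to keep the example small.

```python
from collections import Counter


def place_pages(num_pages, num_nodes, interleave):
    """Return the home node of each page of a large shared table."""
    if interleave:
        # Round-robin: page i lives on node i mod num_nodes.
        return [page % num_nodes for page in range(num_pages)]
    # Everything allocated from the first node.
    return [0] * num_pages


def accesses_per_node(placement, accessed_pages):
    """Count how many accesses each node's memory has to serve."""
    load = Counter()
    for page in accessed_pages:
        load[placement[page]] += 1
    return dict(load)


# Hypothetical workload: every page of a 1024-page table touched once.
pages = range(1024)
print(accesses_per_node(place_pages(1024, 4, interleave=False), pages))
print(accesses_per_node(place_pages(1024, 4, interleave=True), pages))
```

Without interleaving, node 0 serves every access. With it, each of the four nodes serves a quarter of them, so average latency is no better but no single node's memory or links take all the traffic.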

As I said, kernel text placement doesn't seem to be a problem
(based on the fact that replicating it is relatively simple, yet
SGI don't do it yet).

>>So how do you know you've done it right? Are you designing
>>based on assumptions, or real testing? Is this work public?
>Assumption to begin with, but for each prototype I examine/test it and find "problem
>areas", and then try to avoid the problems in later prototypes. Of course at this
>stage I'm only really trying to avoid design problems - implementation problems
>can be fixed later. The kernel is closed source freeware, while the rest will be
>a mixture (open source where possible). There is a web-site for it but the project isn't "interesting" yet.


Well good luck with it. Sounds fun -- post a link if/when
you feel it is a bit more interesting.