By: Rob Thorpe (rthorpe.delete@this.realworldtech.com), October 26, 2006 10:45 am
Room: Moderated Discussions
Gabriele Svelto (gabriele.svelto@gmail.com) on 10/26/06 wrote:
---------------------------
>Linus Torvalds (torvalds@osdl.org) on 10/25/06 wrote:
>---------------------------
>>Well, I don't think "safe languages" per se is the
>>problem. "Strange threaded languages with certain models
>>of memory ordering, apparently including Java" - yes.
>
>No, the problem lies in the 'safety' part of the language. Actually the fact
>that Java has a memory model is a Good Thing (tm). Not having one means that you
>are not really sure how to handle corner cases, one of the extremes being C where
>there is no language support for threads (or almost none in C99). Which optimizations
>are legal around locks and the like is hard to discern as they are in the realm
>of unspecified - or plain wrong - behaviour. The problem with Java is that the safety
>of the language relies on the fact that code will never ever touch data which has
>not been properly initialized. The application code cannot mess with data as it would
>be unverifiable and the VM must ensure that whatever it passes to the user has
>been properly initialized and that's where the barrier comes in since the language
>is multi-threaded and you must ensure consistency.
Yes.
>>The kernel doesn't do a lot of lockless stuff, so it's
>>all generally ok. Almost always, when you actually expose
>>a piece of data structure globally, you need to lock the
>>data structure that you expose it in, so you do
>>have locking between "exposer" and "the rest of the world".
>>
>>Quite frankly, I don't understand how you can avoid it
>>either. Your example with "GlobalPtr" didn't make much
>>sense from a real example standpoint, since normally you'd
>>not have one global pointer, you'd have a hash-table or
>>linked list or something else that you expose new entries
>>through, and that data structure needs locking
>>anyway.
>
>That's not always true, and in fact it's a case that often arises in allocators.
>Think of a structure that may grow but never shrinks and you cannot ask for an element
>inside of it without asking a writer thread to add it. The writer thread could write
>to the structure atomically without locking, readers would also not need to lock
>the structure. You'd have a lock for example in the message queue used by the reader
>threads to ask for a write from the writer thread but not around the structure itself.
>A good example is a radix-tree mapping the pages used by the allocator, with the
>writer having the responsibility of adding pages to the radix tree and readers using
>them to get information on the pages.
>
>Now you wouldn't write an allocator in a managed language but the example still
>stands. If you have a very large shared structure with lots and lots of reads and
>little or no writes even a read-only lock can severely hamper scalability. Naturally
>avoiding locking also often means shooting yourself in the foot but still there are cases where you might consider it.
I think this and S.Rao's post show us the real cost in safe languages of a computer not ordering stores.
If the machine doesn't order stores then the language VM can:
* Put memory barriers around constructors where the objects could be used in another thread
* Check whether the programmer takes a mutex on the thing being constructed straight afterwards. If so, remove the barriers, since the mutex will duplicate them anyway.
So the pain is that a barrier must be introduced even in the case where none is needed. I think this case must be fairly rare though. The above optimization also means that it is very difficult to let the program choose dynamically, at runtime, whether any old variable gets handed to another thread.
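To make the barrier placement concrete, here is a minimal Java sketch of the "construct, then expose through a global pointer" case discussed above. The names (SafePublication, Page, globalPtr, publish, readSize) are made up for illustration, and this shows only what the programmer writes, not what any particular VM emits. It assumes the Java 5 memory model: final fields are guaranteed visible once the constructor finishes, and a volatile write/read pair orders the publication itself.

```java
public class SafePublication {
    // Hypothetical payload. Because the fields are final, the memory model
    // guarantees that a thread which sees a reference to a Page also sees
    // the values the constructor stored (the VM inserts whatever barrier
    // the hardware needs at the end of the constructor).
    static final class Page {
        final int id;
        final int size;

        Page(int id, int size) {
            this.id = id;
            this.size = size;
        }
    }

    // The single global publication point. The volatile write orders the
    // constructor's stores before the store of the reference, and the
    // volatile read orders the load of the reference before the field loads.
    static volatile Page globalPtr;

    // Writer side: construct fully, then expose.
    static void publish(int id, int size) {
        Page p = new Page(id, size);
        globalPtr = p;
    }

    // Reader side: no lock taken; returns -1 if nothing is published yet.
    static int readSize() {
        Page p = globalPtr;
        return p == null ? -1 : p.size;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> publish(1, 4096));
        writer.start();
        writer.join();
        System.out.println(readSize());
    }
}
```

If publish() instead handed the object over while holding a mutex that readers also take, the lock would already provide the ordering, which is the case where the barrier around the constructor is redundant.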
Probably the true pain is figuring this stuff out. I bet if the folks in this thread charged consultancy rates for posting in it then Tzvetan would have had to remortgage his house by now ;).
(I could be completely out-to-lunch though, since I'm not out-to-lunch and therefore hungry and not thinking straight.)