By: Tzvetan Mikov (tzvetanmi.delete@this.yahoo.com), October 26, 2006 8:17 am
Room: Moderated Discussions
Linus Torvalds on 10/25/06 wrote:
---------------------------
>[...]
>There was this whole other discussion a few months ago
>about memory ordering, where I was pointing out that the
>weaker memory ordering of alpha was actually much worse
>than x86 in many situations. This is one of them. You need
>to either use strict locking, or totally insane amounts
>of inappropriate memory ordering instructions.
While that was mostly theoretical then, I think the point is really sinking in now :-)
I think the conclusion of that discussion was that weak consistency ends up being worse than both total store ordering and release consistency, with release consistency the most preferable of the three.
The issue here seems somewhat orthogonal: that it is preferable to have an implicit barrier between dependent reads. I wonder whether the implicit barrier is cheaper than an explicit one, and if so, why?
>Well, I don't think "safe languages" per se is the
>problem. "Strange threaded languages with certain models
>of memory ordering, apparently including Java" - yes.
Not really. Java is actually one of the few languages that properly addresses threading and memory consistency in a way that is complete, intuitive and logical. As far as I can tell, any safe language that is in wide use today, including all .NET languages, Perl, Python, Lisp, ML, would have exactly the same problem on Alpha. (As far as I know, Perl, Python & friends use a global lock around all data accesses, which is of course much worse than having read barriers.)
As Gabriele explained, this is fundamental to language safety. To avoid this problem we need a language where all sharing of data between threads is explicitly controlled and known - I don't know whether that is plausible in an imperative language and how efficient it would be.
We might argue that safe languages are not that useful and we shouldn't worry about them (or design processors suitable for them). This opinion is certainly not uncommon. But since apparently only Alpha exhibits this particular problem and it is dead, we are not at such a crossroads yet.
>[...]
>Quite frankly, I don't understand how you can avoid it
>either. Your example with "GlobalPtr" didn't make much
>sense from a real example standpoint, since normally you'd
>not have one global pointer, you'd have a hash-table or
>linked list or something else that you expose new entries
>through, and that data structure needs locking
>anyway.
Well, that code is actually very real and widely used in Java. The typical scenario is when the result of a function is immutable but relatively expensive to compute, so it is cached the first time it is used.
In many cases this is done correctly by qualifying the global pointer with "volatile" (which in Java actually means something :-). This is reasonable and efficient and is a specific pattern in which you really don't need locking. (There is a small chance that more than one object will be created, but that is OK and the extra objects will be safely discarded.)
In other cases the volatile is missing, which is a bug of course, but we worry about those cases only to make sure they cannot crash the VM.
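For readers who have not seen it, here is a minimal sketch of the caching pattern I mean. All names in it (ExpensiveResult, ResultCache, globalPtr, compute) are made up for illustration and the loop is only a stand-in for the expensive computation; it is not taken from any particular library or VM.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical immutable result: its only field is final and set in the constructor.
final class ExpensiveResult {
    final long value;

    ExpensiveResult(long value) {
        this.value = value;
    }
}

// Hypothetical cache holding one lazily computed, immutable result.
final class ResultCache {
    // "volatile" orders the publication: a thread that reads a non-null
    // reference here also sees the fully constructed object behind it.
    private static volatile ExpensiveResult globalPtr;

    // Counts how often the computation ran (for illustration only).
    static final AtomicInteger computations = new AtomicInteger();

    static ExpensiveResult get() {
        ExpensiveResult r = globalPtr;   // one volatile read, no lock
        if (r == null) {
            r = compute();               // may run in more than one thread
            globalPtr = r;               // last writer wins; extra copies become garbage
        }
        return r;
    }

    private static ExpensiveResult compute() {
        computations.incrementAndGet();
        long sum = 0;
        for (int i = 1; i <= 1000; i++) {
            sum += i;                    // stand-in for the expensive work
        }
        return new ExpensiveResult(sum);
    }
}
```

Two threads racing through get() may each call compute(), but since the result is immutable and every copy is equivalent, it does not matter which reference ends up cached. Dropping the volatile from globalPtr gives the buggy variant: the program is then allowed to observe the reference before the object's contents.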
>How about just catching the SIGSEGV and telling the user
>that he messed up instead? The memory barrier is free on
>x86 (one of the reasons the x86 memory model is actually
>very nice), but on other architectures even the write
>barrier may well be fairly expensive, and it sounds like
>you're protecting against something that isn't really even
>worth protecting against.
The idea about SIGSEGV is a very interesting one. In an earlier message it was suggested to use that to recover from the problem. That seems a bit "hairy" to me, but not completely impossible.
The alternative is, as you say, to just report the error to the user. The problem is that we don't just want to crash the application. We want such errors to have as isolated an effect as possible and to be handled in a clean way. The application should be able to log the error and continue without compromising the integrity of the VM.
I will think about this.
>If you take a performance hit, at least it should be
>worth taking a hit over..
Having a predictable environment that never crashes and is always completely diagnosable is a worthy and interesting goal. It is not that expensive either, considering that many popular safe languages in use today (Perl, Ruby, Python) are an order of magnitude slower than Java and C#.
Ultimately we might disagree on the usefulness of safe languages, but take it as an intellectual exercise :-)