By: dmcq (dmcq.delete@this.fano.co.uk), July 9, 2015 4:37 am
Room: Moderated Discussions
Mark Roulo (nothanks.delete@this.xxx.com) on July 8, 2015 4:54 pm wrote:
> Linus Torvalds (torvalds.delete@this.linux-foundation.org) on July 8, 2015 1:38 pm wrote:
> > dmcq (dmcq.delete@this.fano.co.uk) on July 8, 2015 10:11 am wrote:
> > >
> > > If Linux copes with weak memory ordering and just using barriers then that is good. If a person wants to
> > > use shared data they should signal to the hardware that they are doing that - it should not be a general
> > > overhead.
> >
> > You completely missed my argument. Apparently the whole "testing is meaningful" passed you by with no impact.
> >
> > > The design of the hardware is becoming more automated and straightforward rules are what are needed.
> >
> > .. I would say that _stricter_ rules are needed. We have a history of CPUs starting off with really badly
> > designed and weak rules in general, and every single time
> > that tends to be a big mistake. Undefined behavior
> > in a CPU is bad. And weak memory ordering pretty much ends up being "undefined behavior".
> >
> > Now, would I like memory ordering even stricter than what
> > Intel gives me? Yeah, that would be lovely. I think
> > s390 ends up actually doing that. But I'll take what I can
> > get, and I'll point out the relative strengths and
> > weaknesses. The x86 is a relatively strong model that helps make some ordering issues much easier to handle.
> >
> > (In particular, causality is something that people don't think about, because we take it
> > for granted. When your memory ordering breaks causality, I guarantee you that people can
> > look at code that is "obviously correct" and have a really hard time debugging it)
>
> A real-world example of this is double-checked locking. It was an obvious
> performance improvement for singletons in Java and it also worked fine most
> of the time.
>
> Where most of the time meant:
> a) Almost all the time on all hardware/OS/JVM/JIT combinations.
> b) ALL the time on some combinations ... including the popular ones (I can't reproduce your bug ...)
>
> I'll note that folks with CS backgrounds thought double-checked locking was a
> good idea (for a while). One example is here:
>
> http://www.cs.wustl.edu/~schmidt/PDF/DC-Locking.pdf
>
> Yes, the Douglas Schmidt in this paper is *the* Doug Schmidt.
>
> A nice explanation of why this doesn't work so well is here:
>
> http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
>
> The point here being that *legal* behavior that is subtle, counter-intuitive and/or rare is bad.
> Yes, the code is buggy. And some very talented folks made this mistake.
>
> But in this case the combination of human expectations and hardware design
> choices make bugs like this more likely. Especially for average programmers.
>
> That is a bad combination, even if we can put the blame on someone besides
> the CPU architects.
>
> Designing systems to be less likely to lead to bugs is a good thing. Not the *only*
> good thing, and we won't be willing to give up an unlimited amount of performance for
> it. But a good thing none the less.
I fully agree that designing systems to be less likely to lead to bugs is a good thing, and that human expectations and hardware design can together make bugs more likely. Where I totally differ from Linus is in the lesson to be drawn from examples like this.
His solution is to make the things people have already done less likely to produce errors. My preference is to make the things people will do in the future less likely to produce errors.
These are quite different things. Making what people have already done less likely to produce errors raises the expectation that code like the example above will keep working.
I would like those expectations to be removed by not even saying that data dependency ordering is supported. If people want to share data they should mark that out in a clean fashion, or it may very well not work at some point in the future. Just forget all those CS papers showing lists of loads and stores by different processors and the various orders in which they might appear to execute: they are a menace even to good programmers.
If things like Linux's RCU are worthwhile then their essential parts should be analysed to identify the simple thing they do that can be abstracted. The lesson there is not that dependency ordering is a good thing to have for all memory operations. That is like saying that because a helmet can save one's life on a motorcycle, one should always wear a helmet, even when sleeping in bed. It is a silly requirement most of the time.
We need properly designed hardware that provides what is needed when it is needed, not general facilities that encourage silly tricks and tie the hardware down unnecessarily. There are various things that are wanted -- easy ways of generating JIT code, of coping with streams of I/O data, of handling lists, of discarding shared pages, of gathering statistics -- there are a number of them and they have become quite familiar. But talking about sequential consistency and lists of reads and writes is just asking for trouble; that sort of thing should be the domain of hardware designers using logic checkers.