By: dmcq (dmcq.delete@this.fano.co.uk), July 16, 2015 2:46 am
Room: Moderated Discussions
Linus Torvalds (torvalds.delete@this.linux-foundation.org) on July 15, 2015 6:35 pm wrote:
> Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on July 15, 2015 2:43 pm wrote:
> >
> > So is this reordering causal according to you? I bet most people don't find it intuitive
> > either as this kind of reordering is exactly what is expected on weak memory models.
>
> So that re-ordering is indeed causal, although it is not sequentially consistent.
>
> There is absolutely no question that
>
> (a) sequential consistency is the easiest one to understand (basically: "everything is ordered")
>
> (b) x86 is famously strongly ordered, but it is not that strongly ordered.
>
> Causal consistency involves a causal relationship: "if A then B". That means you have a read of a variable
> followed by a write to (possibly another) variable. The read is the condition, the write is the action.
>
> It's called "causal" since the read of the first variable could be the cause of the write to the
> second one. That cause can take many forms - it could affect whether the write is done at all, it
> could affect the actual value written, and it could affect which variable is written.
>
> So a causal chain would be something like
>
> IF (A) THEN B=1
>
> and causal consistency means that if another CPU now sees "B" having the value 1, then
> it had better see that A was true too. In memory ordering documentation you would skip
> the "if/then" part and only talk about the actual accesses, so you have a pattern of
>
> read A
> B=1
>
> to show that causal relationship, and then the rule would be that another CPU that does
>
> read B
> read A
>
> would never see "B and !A" (because B having the value 1 is caused by A having a non-zero value).
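
To make that chain concrete, here is a minimal C++11 sketch (my own illustration, using std::atomic with acquire/release ordering; the names A and B follow the post). The assert can never fire, because happens-before is transitive through the chain:

    #include <atomic>
    #include <cassert>
    #include <thread>

    std::atomic<int> A{0}, B{0};

    int main() {
        std::thread t0([] {                 // some CPU makes A true
            A.store(1, std::memory_order_release);
        });
        std::thread t1([] {                 // IF (A) THEN B=1
            if (A.load(std::memory_order_acquire) == 1)
                B.store(1, std::memory_order_release);
        });
        std::thread t2([] {                 // read B, then read A
            if (B.load(std::memory_order_acquire) == 1)
                assert(A.load(std::memory_order_acquire) == 1);  // "B and !A" never seen
        });
        t0.join(); t1.join(); t2.join();
    }
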
>
> In contrast, your example is about sequential consistency,
> not causal consistency: when you have the reverse pattern
>
> A=1
> read B
>
> (and another CPU does the same thing with the variables switched: "B=1" followed
> by "read A"), clearly neither CPU's actions were "caused" by anything.
>
> They didn't read anything from memory that could affect the end result: they
> did an unconditional store, and then an unconditional read. So there is no "if
> A then B" causal relationship - they are sequential, but not causal.
>
> So "sequential consistency" means that everybody agrees about sequential ordering. "Causal consistency"
> is weaker than that, and means that just everybody agrees about causal ordering.
>
> Again, I do think that sequential memory consistency is by far the easiest one to
> understand, and the one that would result in the fewest bugs. If I had the choice,
> as a software person I'd pick sequential consistency. No argument about it.
>
> And again, x86 is absolutely not sequentially consistent. Never has been. The write buffering means very
> fundamentally that stores will be delayed past later reads, and thus that stores are clearly not sequentially
> ordered with reads. And that makes it very easy to see the behavior you see in that example.
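
That store-buffering test is easy to write down: a minimal C++11 sketch (my illustration, assuming std::atomic). With seq_cst accesses the "both reads see zero" outcome is forbidden; weaken them to release/acquire - which compile to plain MOVs on x86 - and the store buffer makes it observable in practice:

    #include <atomic>
    #include <cstdio>
    #include <thread>

    std::atomic<int> A{0}, B{0};
    int r1, r2;

    int main() {
        std::thread t1([] {
            A.store(1, std::memory_order_seq_cst);   // A=1
            r1 = B.load(std::memory_order_seq_cst);  // read B
        });
        std::thread t2([] {
            B.store(1, std::memory_order_seq_cst);   // B=1
            r2 = A.load(std::memory_order_seq_cst);  // read A
        });
        t1.join(); t2.join();
        std::printf("r1=%d r2=%d\n", r1, r2);        // seq_cst: never "r1=0 r2=0"
    }
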
>
> So I've never claimed that x86 was perfect. All I've claimed is
> that the x86 memory ordering is better. It's not "best".
>
> Engineering side note: I also do claim that from a pure engineering standpoint, the whole "buffer
> stores" model is at least a reasonable trade-off. Truly sequentially consistent memory ordering
> is really really hard to make go fast. It's simply not practical. So while I as a software person
> would find sequential consistency much easier to think about, and never have to worry about memory
> re-ordering at all, I do see the huge advantage you get from having a store buffer.
>
> In contrast, the advantages of weakening the memory model further are much less obvious.
> Stores really have to be delayed for good performance. No other memory re-ordering matters
> nearly as much. Re-ordering loads past loads? Not nearly as big a deal. Just move the first
> load even earlier. Re-ordering stores past stores? No biggie. They're buffered and not critical
> anyway, so from a performance standpoint it's pretty much a no-op in the big picture.
>
> (And yes, that's certainly simplifying a bit. An ordered store buffer might need to be bigger
> than an unordered one, and might not as easily catch some merging behavior. Similarly, re-ordering
> reads past other reads might make for smaller queues. But those are second-order effects
> at best, not at all the kind of fundamental issue that is "buffer stores").
>
> Linus
I believe the store-release / load-acquire model is a causal one like Linus describes, rather than a sequentially consistent one, and that even the following
    Initially: A = 0, B = 0

    CPU 1:           CPU 2:
    release A = 1    release B = 1
    acquire B        acquire A
could result in both A and B being read as zero. They really are designed to do what they say, not to be used for complicated logical reasoning.
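
In C++11 terms that example looks like this (a sketch, assuming std::atomic; the names follow my layout above):

    #include <atomic>
    #include <thread>

    std::atomic<int> A{0}, B{0};
    int rB, rA;

    int main() {
        std::thread t1([] {
            A.store(1, std::memory_order_release);   // release A=1
            rB = B.load(std::memory_order_acquire);  // acquire B
        });
        std::thread t2([] {
            B.store(1, std::memory_order_release);   // release B=1
            rA = A.load(std::memory_order_acquire);  // acquire A
        });
        t1.join(); t2.join();
        // rB == 0 && rA == 0 is allowed: a release store is not ordered
        // against a later acquire load of a different variable, so
        // neither thread need see the other's store.
    }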