By: Linus Torvalds (torvalds.delete@this.linux-foundation.org)
Room: Moderated Discussions
Linus Torvalds (torvalds.delete@this.linux-foundation.org) on September 15, 2015 8:46 am wrote:
>
> Weak memory orderings are bad. Barriers are not a feature to
> be encouraged, they are mis-features to be lamented and made excuses for.
In other words, rather than your default "everything should be weakly ordered, and you software suckers have to add barriers to random places", the hardware people should say: "we really cannot make this particular ordering go fast, so here's how you handle it, and here's why we cannot make it go fast without your help".
This is why I'd claim that architectures should never ever need barriers between earlier reads and later writes. There just isn't any conceivable performance advantage of delaying the read, or trying to move the write up. It just simply does not help. A piece of hardware that re-orders those two accesses without a barrier is just gratuitously doing stupid things.
Your memory pipeline already has to track way more complicated orderings just for the single-threaded case; the "I'll do reads before subsequent writes" rule really isn't a big stretch.
Same goes for the read dependency barrier. It just doesn't really help hardware, and it very much does hurt software.
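To make the read dependency case concrete, here is a minimal sketch in C11 atomics of the pattern in question: a reader loads a pointer and then loads through it. The names `struct message`, `shared_msg`, `publish` and `try_read` are hypothetical, chosen only for illustration. The reader uses `memory_order_acquire` as a conservative stand-in, which is stronger than pure dependency ordering would need.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical payload type, for illustration only. */
struct message {
    int payload;
};

/* Shared pointer: NULL until the writer publishes a message. */
static _Atomic(struct message *) shared_msg = NULL;

/* Writer side: fill in the object, then publish the pointer.
 * The release store keeps the payload write ordered before the
 * pointer write as seen by a reader that acquires the pointer. */
void publish(struct message *m, int value)
{
    assert(m != NULL);
    m->payload = value;
    atomic_store_explicit(&shared_msg, m, memory_order_release);
}

/* Reader side: load the pointer, then load through it.
 * The address of the second load depends on the result of the first.
 * On hardware that orders dependent loads by itself, the acquire costs
 * little or nothing; on hardware that does not, it is exactly where
 * software has to remember to ask for ordering. */
bool try_read(int *out)
{
    struct message *m =
        atomic_load_explicit(&shared_msg, memory_order_acquire);
    if (m == NULL)
        return false;      /* nothing published yet */
    *out = m->payload;     /* dependent load */
    return true;
}
```

In real use `publish` and `try_read` would run on different threads; called in sequence on one thread, `try_read` fails before `publish(&m, 42)` and yields 42 after it.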
And honestly, the same largely goes for write barriers. Those things are queued anyway, and aren't performance-critical. The main performance win comes from just merging temporally adjacent writes (think things like memory copies or clear, or even just building a frame on the stack), which you can do without re-ordering the writes. You might as well keep the write queue a queue, make it slightly bigger, and tell people they don't need write barriers.
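The "keep the write queue a queue" idea can be sketched as a toy software model. This is not a description of any real hardware: `struct write_queue`, `wq_store`, `wq_drain`, the 64-byte line size and the 16-entry depth are all assumptions made for illustration. A store merges only into the newest entry, and entries drain oldest first, so adjacent writes combine but no write ever becomes visible ahead of an earlier one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES  64u   /* assumed cache-line size */
#define QUEUE_SLOTS 16u   /* assumed queue depth */

struct wq_entry {
    uint64_t line;        /* byte address divided by LINE_BYTES */
    uint64_t byte_mask;   /* one bit per pending byte in the line */
};

struct write_queue {
    struct wq_entry slot[QUEUE_SLOTS];
    size_t head;          /* index of the oldest entry */
    size_t count;         /* number of valid entries */
};

/* Queue a one-byte store to addr.
 * Returns 1 if it merged into the newest entry, 0 if it took a new
 * entry, -1 if the queue is full (a real pipeline would stall). */
int wq_store(struct write_queue *q, uint64_t addr)
{
    assert(q != NULL);
    uint64_t line = addr / LINE_BYTES;
    uint64_t bit = (uint64_t)1 << (addr % LINE_BYTES);

    if (q->count > 0) {
        size_t newest = (q->head + q->count - 1) % QUEUE_SLOTS;
        /* Merge only with the newest entry: merging into an older
         * one would let this store pass the stores queued between. */
        if (q->slot[newest].line == line) {
            q->slot[newest].byte_mask |= bit;
            return 1;
        }
    }
    if (q->count == QUEUE_SLOTS)
        return -1;

    size_t next = (q->head + q->count) % QUEUE_SLOTS;
    q->slot[next].line = line;
    q->slot[next].byte_mask = bit;
    q->count++;
    return 0;
}

/* Drain the oldest entry to "memory" (here: copy it to *out).
 * Returns 1 on success, 0 if the queue is empty. */
int wq_drain(struct write_queue *q, struct wq_entry *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return 0;
    *out = q->slot[q->head];
    q->head = (q->head + 1) % QUEUE_SLOTS;
    q->count--;
    return 1;
}
```

With stores to addresses 0, 1, 64 and 2, the first two merge into one entry, while the store to address 2 takes a fresh entry behind line 1 rather than jumping back into the older line 0 entry.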
Read vs read without a dependency? Now we're starting to approach the kind of situation where a hardware designer can actually articulate some very deep reason to say "we'd actually like to re-order those unless you tell us not to". At least they can make a good excuse for why that's the case.
Not buffering a write across a later read? Now there, a hardware designer can make a really good argument for needing a barrier, because even the absolute simplest implementation with trivial buffers etc. would want to re-order things.
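Why even trivial buffering re-orders a write with a later read can be shown with a toy model, sketched here under stated assumptions: two CPUs, each with a one-entry store buffer, where CPU 0 stores to x then loads y and CPU 1 stores to y then loads x. The names `run_schedule` and `both_zero_possible` are hypothetical, and the barrier is modeled simply as "drain the buffer before the load". The code enumerates every interleaving and asks whether both loads can return 0.

```c
#include <assert.h>
#include <stdbool.h>

enum op { OP_STORE, OP_DRAIN, OP_LOAD };

/* Count set bits in the low six bits of a schedule. */
static int bits_set(unsigned v)
{
    int n = 0;
    for (int i = 0; i < 6; i++)
        n += (v >> i) & 1u;
    return n;
}

/* Run one interleaving. Bit t of schedule picks which CPU takes step t.
 * CPU c stores 1 to mem[c] and loads mem[1 - c].
 * Returns true if both loads saw 0. */
static bool run_schedule(const enum op *prog0, const enum op *prog1,
                         unsigned schedule)
{
    const enum op *prog[2] = { prog0, prog1 };
    int mem[2] = { 0, 0 };            /* mem[0] is x, mem[1] is y */
    bool buffered[2] = { false, false };
    int result[2] = { -1, -1 };
    int pc[2] = { 0, 0 };

    for (int t = 0; t < 6; t++) {
        int c = (int)((schedule >> t) & 1u);
        switch (prog[c][pc[c]++]) {
        case OP_STORE:                /* store sits in the buffer only */
            buffered[c] = true;
            break;
        case OP_DRAIN:                /* store becomes globally visible */
            assert(buffered[c]);
            mem[c] = 1;
            buffered[c] = false;
            break;
        case OP_LOAD:                 /* other CPU's variable: read memory */
            result[c] = mem[1 - c];
            break;
        }
    }
    return result[0] == 0 && result[1] == 0;
}

/* Can both CPUs read 0? With a barrier the buffer must drain before the
 * load; without one the load may also run while the store is buffered. */
bool both_zero_possible(bool barrier_before_load)
{
    static const enum op drained_first[3] = { OP_STORE, OP_DRAIN, OP_LOAD };
    static const enum op load_first[3]    = { OP_STORE, OP_LOAD, OP_DRAIN };
    const enum op *variants[2] = { drained_first, load_first };
    int nvariants = barrier_before_load ? 1 : 2;

    for (int a = 0; a < nvariants; a++)
        for (int b = 0; b < nvariants; b++)
            for (unsigned s = 0; s < 64; s++) {
                if (bits_set(s) != 3)  /* each CPU takes exactly 3 steps */
                    continue;
                if (run_schedule(variants[a], variants[b], s))
                    return true;
            }
    return false;
}
```

In this model `both_zero_possible(false)` is true and `both_zero_possible(true)` is false: the buffer alone is enough to produce the re-ordering. In real C11 code, the corresponding barrier between the store and the load would be `atomic_thread_fence(memory_order_seq_cst)`.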
See? Instead of your insane "barriers are inherently good", the rule should be "barriers need a damn good actual design reason for them, because they are a pain for software". Just "we may want to do insane things in the future" is not a good enough reason.
Linus


