Sequential consistency in hardware

By: Linus Torvalds, August 3, 2020 10:20 pm
Room: Moderated Discussions
Travis Downs on August 3, 2020 9:08 pm wrote:
> Memory models are not like that: you need to be tracking everything,
> all the time, since you don't know when a reordering will happen.

The argument is that CPUs may be starting to do that anyway for other reasons, and once you have that tracking, the advantage of a weaker memory model just doesn't exist.

IOW, the advantage was always "simpler silicon", and people have taken that advantage as gospel truth (and some still do). But once silicon has the complexity, the actual advantage goes away, but the disadvantages remain.

So to take another example where you do need to track everything, and do it for every cycle: register dependencies and out-of-order execution.

People literally used to argue that that complexity is too expensive, and that it's better to make simpler and faster CPUs. Not just on this board.

Those people hopefully admit today that they were wrong. But the important point to face is not that they are wrong today: they were fundamentally wrong ten years ago too. The advantage of simpler silicon is just that: simpler silicon. While that can translate into better performance, it's not a given, and it's not some kind of fundamental and self-evident truth.

So the question is not whether you need to track everything every cycle, and whether that may make things too expensive. It might be worth the expense, and the complexity may become something you learn to deal with over time and eventually take for granted.

Cache coherency is another of those things that was "too expensive", as Geert pointed out. These days? It turns out that not being cache coherent can be the truly expensive part, and basically unacceptable for high-performance IO because it means that you don't get the advantages of caching between your CPU and your IO device.

In the case of nVidia, who knows? I don't have any inside knowledge, but if they were working on something like transactional memory due to their binary translation efforts, they may have ended up with memory units where sequential consistency simply falls out of that work.

At that point it's not an "expense" any more. At that point it's suddenly possibly a performance advantage.
