Sequential consistency in hardware

By: Linus Torvalds, August 3, 2020 9:19 am
Room: Moderated Discussions
never_released on August 3, 2020 8:44 am wrote:
> It's noted in the TRM as:

Ugh. Another "register to access" technical reference manual.

ARM got the memo, apparently nVidia hasn't. It's a big pain, partly because I just don't like registering with companies, but mostly because it also means those things don't get indexed by search engines.

So all downsides, for absolutely no reason. Silly.

Anyway, just from your quote:

> > For coherent memory types, Carmel cores provide a single, sequentially consistent view of coherent memory.
> > Accordingly if no non-coherent access, Cache maintenance or TLB maintenance instruction has been executed
> > since the last memory barrier, memory barriers behave similarly to a single-cycle NOP.

Interesting. I'd have thought that sequential consistency wouldn't be worth the pain, and would cost more than it gives.

But maybe nVidia ended up figuring something clever out.

> What are the advantages of having that guarantee provided by hardware more than just
> having TSO in practice? Are there cases where it's considered as more useful?

I see two options:

(a) nVidia is selling into a space where this is a major reliability concern.

(b) nVidia figured out that the way they do the store queue, it doesn't add noticeable cost, and the cheap barriers make it worth it.

I think (a) may be the stronger argument. IOW, in some areas (automotive etc.), regulatory and safety concerns about verification may make SC a noticeable selling point. It might be a lot easier to validate some parts of the system if you can take SC as a given, and don't have to worry about some of the things that happen in other consistency models.

But (b) isn't entirely unlikely either. Look up "MIT Tardis" and projects like that. People have been working on things that give you consistency without the traditional costs, and as threading becomes more common and synchronization can become a big deal in possibly performance-critical areas, maybe nVidia ended up finding that speeding up synchronization operations was a net win.

People used to believe that the fewer memory ordering guarantees you gave, the simpler you could make things, and the better everything would work. That turned out to not be true.
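
To make the SC-vs-TSO difference concrete, here is a rough sketch of the classic "store buffering" litmus test, written against the C++ memory model rather than any particular core. The helper names (count_both_zero, sb_seq_cst, sb_acq_rel) are made up for illustration and come from neither the TRM nor any library. Each thread stores 1 to its own variable and then loads the other one. Under sequential consistency at least one thread has to see the other's store, so "both loads return 0" is forbidden. TSO permits that outcome, because a store can still be sitting in the store buffer when the following load executes.

```cpp
#include <atomic>
#include <thread>

// Store-buffering litmus test (illustrative helper, names are assumptions).
//   Thread 0: x = 1; r0 = y;
//   Thread 1: y = 1; r1 = x;
// Returns how many of `iterations` runs ended with r0 == 0 && r1 == 0.
inline int count_both_zero(std::memory_order store_order,
                           std::memory_order load_order,
                           int iterations) {
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r0 = -1, r1 = -1;
        std::thread t0([&] { x.store(1, store_order); r0 = y.load(load_order); });
        std::thread t1([&] { y.store(1, store_order); r1 = x.load(load_order); });
        t0.join();
        t1.join();  // join() makes r0 and r1 safely visible here
        if (r0 == 0 && r1 == 0) ++both_zero;
    }
    return both_zero;
}

// Sequentially consistent accesses: the language guarantees the result is 0.
inline int sb_seq_cst(int iterations) {
    return count_both_zero(std::memory_order_seq_cst,
                           std::memory_order_seq_cst, iterations);
}

// Release/acquire accesses: the both-zero outcome is allowed, so the result
// may be nonzero. Whether it is ever observed depends on hardware and timing.
inline int sb_acq_rel(int iterations) {
    return count_both_zero(std::memory_order_release,
                           std::memory_order_acquire, iterations);
}
```

Calling sb_seq_cst(n) always returns 0. sb_acq_rel(n) may return anything from 0 to n, and spawning a thread per iteration keeps the race window small, so seeing 0 there proves nothing. On most hardware the seq_cst variant pays for that guarantee with a barrier or a more expensive store instruction. If barriers really do behave like a NOP on coherent memory, as the quoted TRM text says, that extra cost mostly goes away.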
