By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), July 30, 2020 9:09 am
Room: Moderated Discussions
Maynard Handley (name99.delete@this.name99.org) on July 29, 2020 9:30 pm wrote:
>
> Let's hope someone figures out how to run standard macOS (and thus some large benchmarks)
> in either mode. Would allow us to quantify in performance terms (putting aside other issues
> like design and validation costs) the win from adopting a weaker memory model...
Not really.
Because the likely thing is that the design simply isn't optimized for x86-TSO, so setting the MSR will just make a lot of memory operations go into a slow mode.
In fact, I'd guess it mostly affects just the decoder, making it turn every LDR into LDAR and every STR into STLR. That's a fairly trivial change, and should pretty much get you the x86 semantics.
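To make that guess concrete, here is a toy sketch of such a decoder rewrite. It is purely illustrative: the `decode` function, the `ORDERED_FORM` table, and the `tso_mode` flag are hypothetical names, and nothing here describes how Apple's hardware actually works.

```python
# Toy model of the guess: with the TSO bit set, the decoder issues every plain
# load/store as its acquire/release form. All names here are hypothetical.
# Real LDAR/STLR also have more limited addressing modes than LDR/STR; this
# sketch only models the ordering semantics, not the encodings.
ORDERED_FORM = {"LDR": "LDAR", "STR": "STLR"}


def decode(line, tso_mode):
    """Return the instruction text as the imagined decoder would issue it."""
    mnemonic, _, operands = line.strip().partition(" ")
    if tso_mode:
        # Only plain loads and stores are rewritten; everything else passes through.
        mnemonic = ORDERED_FORM.get(mnemonic.upper(), mnemonic)
    return f"{mnemonic} {operands}".strip()


program = ["STR x1, [x0]", "LDR x2, [x3]", "ADD x4, x2, #1"]

if __name__ == "__main__":
    for insn in program:
        print(f"{insn:18} -> {decode(insn, tso_mode=True)}")
```

With `tso_mode=False` the program passes through unchanged, which is the point: the software doesn't change, only how the core orders its memory operations.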
Then the question becomes how much they bothered to actually optimize their memory units for LDAR/STLR. And the answer may be "not much".
So it doesn't tell us all that much in the big picture; it only says "for this chip and this load it results in X".
Trying to run very low-level benchmarks of particular sequences (i.e. the kinds of things Travis does in his x86 uarch tests) might show particular areas that Apple ended up working on a lot (or that end up being particularly nasty or simple for memory ordering differences). That might be quite interesting to see.
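As a rough illustration of the kind of "particular sequence" where the two modes differ, here is a sketch of the classic message-passing litmus test under a deliberately simplified model. It says nothing about speed, only about which results each ordering model permits. The model is an assumption made for illustration: "weak" lets a thread reorder any accesses to different addresses, "tso" only lets a later load pass an earlier store, and store forwarding and dependencies are ignored.

```python
from itertools import permutations

# Ops are ("st", address, value) or ("ld", address, register_name).


def allowed_orders(thread, model):
    """Yield the execution orders this simplified model allows for one thread."""
    n = len(thread)
    for perm in permutations(range(n)):
        ok = True
        for a in range(n):
            for b in range(a + 1, n):
                if perm[a] > perm[b]:  # a later op executes before an earlier one
                    early, late = thread[perm[b]], thread[perm[a]]
                    if early[1] == late[1]:
                        ok = False  # same address: program order is kept
                    elif model == "tso" and not (early[0] == "st" and late[0] == "ld"):
                        ok = False  # TSO: only store -> load may be reordered
        if ok:
            yield [thread[k] for k in perm]


def interleavings(a, b):
    """Yield every merge of two op sequences that keeps each one's order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(seq):
    """Execute one global order against a single shared memory (initially 0)."""
    mem, regs = {}, {}
    for kind, addr, x in seq:
        if kind == "st":
            mem[addr] = x
        else:
            regs[x] = mem.get(addr, 0)
    return tuple(sorted(regs.items()))


def outcomes(t0, t1, model):
    return {run(seq)
            for o0 in allowed_orders(t0, model)
            for o1 in allowed_orders(t1, model)
            for seq in interleavings(o0, o1)}


# Message passing: the writer publishes data, then sets a flag; the reader
# checks the flag, then reads the data.
writer = [("st", "data", 1), ("st", "flag", 1)]
reader = [("ld", "flag", "r_flag"), ("ld", "data", "r_data")]
STALE = (("r_data", 0), ("r_flag", 1))  # flag seen as set, but data still old

if __name__ == "__main__":
    for model in ("weak", "tso"):
        print(model, "allows stale read:", STALE in outcomes(writer, reader, model))
```

In this model the stale result shows up under "weak" and not under "tso". A real test would run the equivalent sequence natively on the hardware in both modes and measure it, which is where any cost would actually become visible.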
Linus