Why does writing to non-sequential lines in L2 perform so poorly?

By: anon (anon.delete@this.ymous.net), December 22, 2017 3:29 am
Room: Moderated Discussions
Linus Torvalds (torvalds.delete@this.linux-foundation.org) on December 21, 2017 11:12 am wrote:
>
> So my thinking is that the behavior you see might be because
>
> (1) the store buffer drains purely to the L2, and the L2 is the real cache coherency
> boundary for external cores. The store ordering is easy to maintain because the stores
> really drain in order (although fetching the L2 lines can obviously be entirely OoO).

When you say "real cache coherency boundary", you still agree that lines in L1D need to be invalidated, so either L2 is inclusive of L1D or invalidations also go to L1D, correct?

>
> (2) but to keep the L1 up-to-date, the L1 is updated separately from
> (and concurrently with) the store buffer if the line exists in there.


Sounds like this would behave like "write no allocate, write through" for L1D. Intel has always advertised writeback L1D (and many have raised an eyebrow at AMD for using write through in Bulldozer...), so I'd be surprised if they were doing something like this.

It sounds like it would work, but what do you gain by doing this? The store buffer drains to L2 in order, and that might take a while. Writing the L1D "in advance" therefore does not let you deallocate store buffer entries any earlier, because the L2 still needs to be updated. Moreover, the data is in the SB for any load to get via forwarding, so I'm not sure there is an advantage to writing eagerly into the L1D rather than in lockstep with the L2 write (in this context, where L2 is the coherency boundary). Maybe I misunderstood the flow.
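
To make the argument concrete, here is a toy timing model, not a description of any real Intel pipeline. It assumes stores drain to L2 strictly one at a time in program order, that a store buffer entry is freed only once its L2 write completes, and that the per-store L2 latencies (40, 12, 12 cycles) are made-up numbers. The function name `drain` and the `eager_l1` flag are hypothetical.

```python
def drain(stores, eager_l1):
    """Toy model of an in-order store buffer drain.

    stores: list of (name, l2_cycles) in program order (latencies are invented).
    Returns {name: (l1_write_cycle, sb_free_cycle)}.
    """
    timeline = {}
    cycle = 0
    for name, l2_cycles in stores:
        # Assumption: stores reach L2 one at a time, in program order.
        cycle += l2_cycles
        # Eager mode writes L1D right away (cycle 0); lockstep waits for L2.
        l1_cycle = 0 if eager_l1 else cycle
        # Assumption: the SB entry is only released after the L2 write.
        timeline[name] = (l1_cycle, cycle)
    return timeline


stores = [("A", 40), ("B", 12), ("C", 12)]
lockstep = drain(stores, eager_l1=False)
eager = drain(stores, eager_l1=True)

for name, _ in stores:
    print(name, "lockstep:", lockstep[name], "eager:", eager[name])
```

Under these assumptions the cycle at which each entry is freed is identical in both modes; only the L1D write time moves, which is the point of the question above.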

>
> (3) but because the L1 is visible to at least HT cores, store ordering is an issue,
> and the L1 update has to happen in order with any stores that missed in the L1,
> because otherwise at least a HT core could see writes in the wrong order.
>
> Anyway. I may be completely off, I'm just throwing this out as a possible reason for the odd timings you see.
> It might be interesting to test with HT on vs HT off, because I think any "L1 access order visibility" really
> might be limited to only the HT case, because normally I thought that Intel limited snoop to L2 and out.
>

That's an interesting point. TSO allows the processor performing a write to observe its own write before others do, but if coherency is handled from L2 onwards, you could write into L1D before writing into L2, and that would be okay because only the CPU that did the write would see those writes. It looks okay, but I'm not 100% convinced that there is no corner case where this breaks... HT is one, as you mentioned :)
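
The HT corner case can be sketched with a toy model. Everything here is an assumption made for illustration: two addresses `A` and `B`, where only `B`'s line is present in L1D; a sibling hyperthread that reads through the shared L1D (falling back to L2 on a miss); and a remote core that only ever sees L2. The function `run` and the flag `eager_l1_hits` are hypothetical names, and this is not a claim about how any real core works.

```python
def run(eager_l1_hits):
    """Return (sibling_views, remote_views) as lists of (B, A) snapshots."""
    l1 = {"B": 0}                        # B hits in L1D, A misses
    l2 = {"A": 0, "B": 0}
    store_buffer = [("A", 1), ("B", 1)]  # program order: A = 1, then B = 1

    def sibling_view():
        # HT sibling shares the L1D; on an L1 miss it reads L2.
        return (l1.get("B", l2["B"]), l1.get("A", l2["A"]))

    def remote_view():
        # A remote core only observes the coherency boundary (L2).
        return (l2["B"], l2["A"])

    if eager_l1_hits:
        # Hypothetical eager path: stores that hit L1D update it immediately,
        # ahead of older stores still waiting in the store buffer.
        for addr, val in store_buffer:
            if addr in l1:
                l1[addr] = val

    sibling, remote = [sibling_view()], [remote_view()]
    for addr, val in store_buffer:       # in-order drain to L2
        l2[addr] = val
        if addr in l1:
            l1[addr] = val               # keep L1D consistent with L2
        sibling.append(sibling_view())
        remote.append(remote_view())
    return sibling, remote


VIOLATION = (1, 0)  # new B visible while A is still old: breaks store order

for mode in (False, True):
    sibling, remote = run(mode)
    print("eager" if mode else "lockstep",
          "sibling violation:", VIOLATION in sibling,
          "remote violation:", VIOLATION in remote)
```

In this model the remote core never sees the stores out of order in either mode, while the sibling thread does once L1D hits are written eagerly, which matches the suggestion to compare timings with HT on and HT off.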
