Coherency: Forwarding and Owned

Article: The Common System Interface: Intel's Future Interconnect
By: David Kanter (dkanter.delete@this.realworldtech.com), September 16, 2007 4:34 pm
Room: Moderated Discussions
Peter Gerdes (truepath@infiniteinjury.org) on 9/16/07 wrote:
---------------------------
>Hey, thanks for your patience. I end up using some caps below for emphasis; they aren't meant to express frustration.

Current K8 systems look like this:

0 - 1
| |
2 - 3

Where 0-3 are processors with attached memory. Future systems from both Intel and AMD will look like:

0 - 1
| x |
2 - 3
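
To put a number on the difference between the two diagrams, here is a minimal sketch in Python. It assumes the "x" in the second diagram means the two diagonal links (0-3 and 1-2); the function names are made up for illustration. It just counts the worst-case number of link hops between any two processors.

```python
from collections import deque

# Links as drawn in the two diagrams above (the diagonals are an assumption
# about what the "x" stands for).
SQUARE = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
FULLY_CONNECTED = {n: [m for m in range(4) if m != n] for n in range(4)}

def hops(links, src, dst):
    """Breadth-first search for the minimum number of link hops."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in links[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("no path between nodes")

def worst_case_hops(links):
    """Largest hop count between any two distinct processors."""
    return max(hops(links, a, b) for a in links for b in links if a != b)

print(worst_case_hops(SQUARE))           # 2, e.g. processor 0 to processor 3
print(worst_case_hops(FULLY_CONNECTED))  # 1
```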

>Now the question being asked is when does the MOESI cache coherency protocol offer an advantage over a MESIF protocol.
>
>Well the benefit of the O state is that if chip A has a modified cache line (call
>it L) and both chips B and C request to read that cache line then chip A can transition
>L to the state O and pass it off to B and C without doing a write. My claim is
>that in the vast majority of cases the MESIF protocol can do exactly (or almost exactly) the same thing.

OK, so the conditions are actually weaker than this. Imagine that A has a modified line, and then B requests to read it. In MESIF, this could be resolved two ways:

Solution 1:
Evict cache line from A and send to B as modified

Solution 2:
Write back to memory
Send line to B as shared, and switch line in A to shared

Note that solution 1 doesn't work if many people would like to request the line.
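
The two solutions can be sketched as state transitions. This is a toy model in Python, not how any real implementation works: the per-chip state dictionary and the function names are hypothetical, and the states follow the description above (a real MESIF design might hand the requester the line in F rather than S).

```python
def solution_1(states, owner, requester):
    """Migrate the line: the owner evicts it and the requester gets it as Modified."""
    new_states = dict(states)
    new_states[owner] = "I"
    new_states[requester] = "M"
    return new_states, 0  # no write to memory

def solution_2(states, owner, requester):
    """Write the line back to memory, then leave both copies Shared."""
    new_states = dict(states)
    new_states[owner] = "S"
    new_states[requester] = "S"
    return new_states, 1  # one write to memory

def valid_copies(states):
    """Number of caches holding a usable copy of the line."""
    return sum(1 for s in states.values() if s != "I")

start = {"A": "M", "B": "I", "C": "I"}
after_1, writes_1 = solution_1(start, "A", "B")
after_2, writes_2 = solution_2(start, "A", "B")
print(after_1, writes_1, valid_copies(after_1))
print(after_2, writes_2, valid_copies(after_2))
```

In this model solution 1 costs no memory write but leaves only one valid copy, so a third reader has to pull the line away again. Solution 2 costs one write and leaves two readable copies.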

>Why? So suppose that cache line L (in A's cache) corresponds to a memory location
>in bank A and that as above B and C request to both read that location. The MESIF
>protocol requires that A first write its line out to memory and transition to
>the F state before passing it on to B and C. But SINCE THE MEMORY CONTROLLER THAT
>WOULD WRITE L TO MEMORY IS INTEGRATED INTO CHIP A NO ACTUAL WRITE HAS TO TAKE PLACE.

Actually it does. You can't just pretend not to write stuff to memory, or you'll be totally screwed if you get an uncorrectable bit flip in your cache.

>In other words A just *immediately* tells the other chips that it has written L
>to memory and hands out the cache line to B and C making an *internal* mark to write
>L to memory before eliminating it from its own cache. Correctness is guaranteed
>because the only way any other chip can read or write to the memory backing L is through chip A.

But now you have to pin a cache line into memory (which x86s can't do). What happens if that line gets evicted? Then it has to be written back to memory. No matter what, you are creating a lot of complexity for yourself this way.

>So what if L isn't backed by memory A controls? Let's suppose that L is backed
>by memory that B controls. Well in this case when B requests the cache line A already
>has to transfer that cache line over to B. Therefore (maybe 1) extra message (depending
>on how reads with C work) will be required and no extra transfers of a cache line.
>There is still no requirement to wait on memory since B can now pull the same trick I described for A above.

>It seems that the only time that the O state will make a significant difference
>is when we have a modified cache line L in chip C's cache backed by memory controlled
>by A requested by chip D. In other words O only makes a real difference when both
>the chip holding the modified cache line and the requesting chip don't control the backing memory.
>
>Hopefully this was a bit more clear but I will answer what you said below as well.

I see what you're saying, but it sounds fairly complex and much more work than it's worth.
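
For reference, the case analysis in the quoted text boils down to one condition. The sketch below (Python, hypothetical names) only restates that quoted claim and counts the cases for a four-chip system with the line homed on chip A. It says nothing about how often each case shows up in real workloads, and it ignores the objections above about actually having to write the data.

```python
from itertools import permutations

CHIPS = "ABCD"

def o_state_helps(holder, requester, home):
    """Quoted claim: O matters only if neither party controls the backing memory."""
    return holder != home and requester != home

home = "A"  # assumption: the line is backed by chip A's memory
cases = list(permutations(CHIPS, 2))  # (holder, requester) with holder != requester
helped = [(h, r) for h, r in cases if o_state_helps(h, r, home)]
print(len(helped), "of", len(cases))  # 6 of 12
```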

[snip]

>We were talking about a situation where each chip has an inbuilt memory controller.
>I'm just saying that the only way that any other chips (B,C,D) in the system know
>about the state of a memory bank A controlled by the integrated memory controller
>on chip A is through the HT/CSI links between chip A and B,C,D. In terms of my
>diagram above I merely mean that B,C, D can't listen in on the double line between chip A and memory bank A.

Ah, ok.

>>>Now when the cache coherency protocol says that a cache line must be written back
>>>to memory it doesn't actually care if the line is 'really' stored in the actual
>>>memory bank, only that it APPEARS to be so stored, i.e., the memory controller could implement its own cache.
>>
>>That's not a cache, it's a buffer. But sure, you could buffer the writes - you
>>just need to make sure that if you lose power you don't have any problems.
>
>It's an integrated memory controller so if the chip loses power we probably lost
>the information in memory anyway (is this not true in some configurations?).

Fair enough. My point is that you need to be able to handle any conceivable corner case (uncorrectable error in a cache line, read or write shootdowns, etc.).

>Also it's a cache if it serves future read requests out of it and lets write requests
>change items that haven't yet been written out to real memory but this is irrelevant.

>>>Thus presumably a chip that needs to 'write' a cache line to memory it controls
>>>doesn't need to send any messages or do anything but remember that this cache line
>>>has been 'written' to memory.
>>
>>Where do you want to store that information? In the memory controller, in the chip, etc.?

So you would have to add yet another status bit, the 'supposed to be written to memory' bit, that is distinct from the dirty bit?

>IN THE CACHE LINE!! The original question was: Does having an OWNED state in the
>cache coherency protocol make a noticeable performance difference. My claim is
>that no because even in a MESIF protocol chips with integrated memory controllers
>can duplicate the effect in the vast majority of the time.
>
>In other words WHEN A CHIP USING THE MESIF PROTOCOL CONTROLS THE MEMORY BACKING
>A CACHE LINE IT CAN ACT AS IF IT HAD THE MEMORY IN AN O STATE.

Here's the catch though - the memory controller controls the memory. It doesn't control the cache, and you don't want to introduce any dependencies between the two.

>>>So long as every read request by another chip on
>>>that memory location reflects the modified value everything is hunky dory.
>>
>>Sure. The problem is not the common case though, it's probably in handling exceptional cases.
>
>This is just a premise to indicate what I'm saying next.
>
>
>>>Thus
>>>since MOST logical writes to memory that the O state would eliminate don't require
>>>any PHYSICAL writes to memory it doesn't do much for efficiency.
>>
>>Um, so write back buffers have to write to memory eventually. You don't eliminate
>>the write, you just defer it in your system.
>
>YES! DOING EXACTLY THE SAME THING ALLOWING THE O STATE WOULD.

No, you don't get it. The O state actually ELIMINATES THE WRITE. Let me give you a concrete example:

CPU0 writes a cache line
CPU1 and CPU2 ask for a shared copy; CPU0 has it in O, the others in S
CPU1 and CPU2 read the cache line
CPU0 writes again and invalidates CPU1 and CPU2, leaving the cache line in M state
CPU1 and CPU2 ask for a shared copy; CPU0 has it in O, the others in S

Repeat...

Now I don't know how often this happens, but this sequence only requires a single write back at the end. Under the MESIF system, you'd actually have to write it back for every iteration. So you actually could save quite a few writes with an O state.
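
The sequence above can be turned into a rough count. The following Python sketch is a toy model under the assumptions stated in the example (MESIF must write dirty data to memory before sharing it, MOESI can defer that until the line finally leaves CPU0's cache). The function name and structure are illustrative only.

```python
def count_writebacks(protocol, iterations):
    """Count memory write-backs for the write/share loop described above."""
    if protocol not in ("MESIF", "MOESI"):
        raise ValueError("unknown protocol: " + protocol)
    writebacks = 0
    dirty = False  # True while CPU0 holds data newer than memory
    for _ in range(iterations):
        # CPU0 writes the line: it is now Modified and memory is stale.
        dirty = True
        # CPU1 and CPU2 ask for a shared copy.
        if protocol == "MESIF":
            # No Owned state: the dirty data goes to memory before sharing.
            writebacks += 1
            dirty = False
        # Under MOESI, CPU0 moves M -> O and supplies the data itself,
        # so memory stays stale and nothing is written here.
    # Eventually the line is evicted; any remaining dirty data is written once.
    if dirty:
        writebacks += 1
    return writebacks

print(count_writebacks("MESIF", 10))  # 10
print(count_writebacks("MOESI", 10))  # 1
```

How much this matters in practice depends on how often real code hits this pattern, which the model does not try to answer.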

DK