Coherency: Forwarding and Owned

Article: The Common System Interface: Intel's Future Interconnect
By: Peter Gerdes (truepath.delete@this.infiniteinjury.org), September 16, 2007 12:50 pm
Room: Moderated Discussions
Hey, thanks for your patience. I end up using some caps below for emphasis; they aren't meant to express frustration.


David Kanter (dkanter@realworldtech.com) on 9/14/07 wrote:
---------------------------

>NUMA isn't a memory model - it's an implementation of the memory hierarchy. Something
>cannot be NUMA and appear to be UMA.

Yes, what I meant is that it is a cache-coherent NUMA system. I was stupidly using "looks UMA" to mean that user software can be programmed as if it were a UMA design without sacrificing correctness. Since I'm clearly using the terminology poorly, let me just draw out the configuration I'm talking about (what I believe is the multi-chip Opteron config, and the one that forthcoming multi-chip Intel systems with CSI and integrated memory controllers will use):


Memory Bank A          Memory Bank B
     ||                     ||
     ||                     ||
   Chip A-----------------Chip B
     |                      |
     |                      |
     |                      |
   Chip C-----------------Chip D
     ||                     ||
     ||                     ||
Memory Bank C          Memory Bank D

(pretend there is also a line through the center linking Chip A and Chip D)

Now the question being asked is: when does the MOESI cache coherency protocol offer an advantage over the MESIF protocol?

Well, the benefit of the O state is that if chip A has a modified cache line (call it L) and chips B and C both request to read that cache line, then chip A can transition L to the O state and pass it off to B and C without doing a write. My claim is that in the vast majority of cases the MESIF protocol can do exactly (or almost exactly) the same thing.
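To make the O-state behavior concrete, here is a rough Python sketch of just the remote-read case. The names (`State`, `moesi_remote_read`) are my own, and the transition table is deliberately simplified; it is an illustration of the idea, not a description of any real implementation.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    O = "Owned"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def moesi_remote_read(holder_state):
    """Simplified MOESI: another chip reads a line we hold.

    Returns (new holder state, requester state, memory_write_needed).
    """
    if holder_state in (State.M, State.O):
        # Dirty data goes cache-to-cache; the holder stays responsible
        # for the eventual write-back, so memory is not touched now.
        return State.O, State.S, False
    if holder_state in (State.E, State.S):
        # Clean line: both copies end up Shared, nothing to write.
        return State.S, State.S, False
    raise ValueError("holder has no valid copy of the line")


# Chip A holds L in Modified; B and then C read it.
a_state = State.M
memory_writes = 0
for requester in ("B", "C"):
    a_state, requester_state, wrote = moesi_remote_read(a_state)
    memory_writes += int(wrote)
```

After both reads, A holds L in O, B and C hold it in S, and no write to memory has happened.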

Why? Suppose that cache line L (in A's cache) corresponds to a memory location in bank A and that, as above, B and C both request to read that location. The MESIF protocol requires that A first write its line out to memory and transition to the F state before passing it on to B and C. But SINCE THE MEMORY CONTROLLER THAT WOULD WRITE L TO MEMORY IS INTEGRATED INTO CHIP A, NO ACTUAL WRITE HAS TO TAKE PLACE. In other words, A just *immediately* tells the other chips that it has written L to memory and hands out the cache line to B and C, making an *internal* mark to write L to memory before evicting it from its own cache. Correctness is guaranteed because the only way any other chip can read or write the memory backing L is through chip A.
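Here is a minimal Python sketch of that trick, under the assumption that the chip owning the memory controller can treat "written to memory" as a purely internal bookkeeping fact. Everything in it (`HomeChip`, `deferred`, `physical_writes`) is a hypothetical name I made up for illustration; I am not claiming any shipping chip works this way.

```python
class HomeChip:
    """Hypothetical chip whose integrated memory controller backs the
    lines it caches. All names here are illustrative assumptions."""

    def __init__(self):
        self.state = {}          # line address -> "M", "F", ...
        self.deferred = set()    # lines "written" logically, not physically
        self.physical_writes = 0

    def local_store(self, line):
        self.state[line] = "M"

    def remote_read(self, line):
        if self.state.get(line) == "M":
            # MESIF says: write back, then forward. Since this chip IS
            # the only path to the memory, record the debt and answer now.
            self.deferred.add(line)
            self.state[line] = "F"
        return self.state.get(line)

    def evict(self, line):
        if line in self.deferred or self.state.get(line) == "M":
            self.physical_writes += 1   # the real memory write happens here
            self.deferred.discard(line)
        self.state.pop(line, None)


chip_a = HomeChip()
chip_a.local_store("L")
chip_a.remote_read("L")   # request from B
chip_a.remote_read("L")   # request from C
writes_before_evict = chip_a.physical_writes
chip_a.evict("L")
writes_after_evict = chip_a.physical_writes
```

The point of the sketch is only that the physical write count stays at zero while B and C are served, and the single real write is paid at eviction, which is the same moment an O-state line would be written back.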

So what if L isn't backed by memory A controls? Let's suppose that L is backed by memory that B controls. In this case, when B requests the cache line, A already has to transfer that cache line over to B. Therefore maybe one extra message (depending on how reads by C work) will be required, and no extra transfers of a cache line. There is still no requirement to wait on memory, since B can now pull the same trick I described for A above.

It seems that the only time the O state will make a significant difference is when a modified cache line L in chip C's cache, backed by memory controlled by A, is requested by chip D. In other words, O only makes a real difference when neither the chip holding the modified cache line nor the requesting chip controls the backing memory.
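The three cases above boil down to one predicate. This is just my argument restated as a Python sketch (the function name is made up), not a model of either protocol:

```python
def o_state_makes_a_difference(holder, requester, home):
    """True when neither the chip holding the modified line nor the chip
    requesting it controls the memory that backs the line."""
    return home != holder and home != requester


# The three situations discussed above, as (holder, requester, home):
holder_is_home = o_state_makes_a_difference("A", "B", "A")
requester_is_home = o_state_makes_a_difference("A", "B", "B")
third_party_home = o_state_makes_a_difference("C", "D", "A")
```

Only the last case, where C holds the line, D wants it, and A controls the backing memory, comes out true.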

Hopefully this was a bit clearer, but I will answer what you said below as well.




>
>It's very straight forward - if different regions of memory have different latencies
>then the system is NUMA. If all memory has the same latency, then the system is UMA. It cannot be both.
>
>>Thus
>>each chip has an exclusive connection to its own memory pool unseen by any other
>>chip.
>
>This makes no sense. In this situation, each memory controller would have to connect to multiple CPUs.


We were talking about a situation where each chip has a built-in memory controller. I'm just saying that the only way the other chips (B, C, D) in the system know about the state of memory bank A, controlled by the integrated memory controller on chip A, is through the HT/CSI links between chip A and B, C, D. In terms of my diagram above, I merely mean that B, C, and D can't listen in on the double line between chip A and memory bank A.


>>Now when the cache coherency protocol says that a cache line must be written back
>>to memory it doesn't actually care if the line is 'really' stored in the actual
>>memory bank, only that it APPEARS to be so stored, i.e., the memory controller could implement its own cache.
>
>That's not a cache, it's a buffer. But sure, you could buffer the writes - you
>just need to make sure that if you lose power you don't have any problems.

It's an integrated memory controller, so if the chip loses power we have probably lost the information in memory anyway (is this not true in some configurations?).

Also, it's a cache if it serves future read requests out of it and lets write requests change items that haven't yet been written out to real memory, but this is irrelevant.

>
>>Thus presumably a chip that needs to 'write' a cache line to memory it controls
>>doesn't need to send any messages or do anything but remember that this cache line
>>has been 'written' to memory.
>
>Where do you want to store that information? In the memory controller, in the chip, etc.?

IN THE CACHE LINE!! The original question was: does having an OWNED state in the cache coherency protocol make a noticeable performance difference? My claim is no, because even in a MESIF protocol chips with integrated memory controllers can duplicate the effect the vast majority of the time.

In other words WHEN A CHIP USING THE MESIF PROTOCOL CONTROLS THE MEMORY BACKING A CACHE LINE IT CAN ACT AS IF IT HAD THE MEMORY IN AN O STATE.


>
>>So long as every read request by another chip on
>>that memory location reflects the modified value everything is hunky dory.
>
>Sure. The problem is not the common case though, it's probably in handling exceptional cases.

This is just a premise to indicate what I'm saying next.

>
>>Thus
>>since MOST logical writes to memory that the O state would eliminate don't require
>>any PHYSICAL writes to memory it doesn't do much for efficiency.
>
>Um, so write back buffers have to write to memory eventually. You don't eliminate
>the write, you just defer it in your system.
>

YES! WHICH IS EXACTLY THE SAME THING THAT ALLOWING THE O STATE WOULD DO.


>>Supposing the protocol doesn't require sending the same cache line twice to a processor
>>that both controls that memory location and wants to read that cache line it will
>>be a very rare event that the lack of an O state will cause an extra PHYSICAL write
>>to memory. Sure for systems that hang all the memory off of one of the chips this
>>would be a loss but presumably the high performance systems would balance memory between the chips.
>
>I don't understand what you are saying here.
>
>>Sorry to keep pushing this issue but obviously I am missing something and I'd like to figure out what it is.
>
>I think for starters you are confusing what NUMA and UMA mean, and how they are related to cache coherency.
>
>DK