Addressing in a NoC

By: --- (---.delete@this.redheron.com), May 17, 2022 11:02 pm
Room: Moderated Discussions
Can anyone answer the following question (or provide a reasonable quick summary, say a few pages but not a full book) on the state of the art:

Consider a modern SoC with a NoC (so something like a desktop x86 or a decent phone chip).
We know, *in theory*, that every IP block communicates with every other IP block via the NoC. My question is specifically about how the addressing of these communications works (not other issues like coherency or arbitration). It may perhaps be helpful to draw analogies to (or point out differences from) PCIe.

So let's consider some cases.
I'm going to assume that, for all practical purposes, everything is memory mapped IO. Is that reasonable?

So let's suppose a CPU wishes to configure some other IP block. My model (tell me if I'm wrong) is
(a) the OS has created a page table entry that maps some particular (known to user software or at least a driver) virtual address to the physical address that is associated with that register.

CPU SW reads from (or writes to) that virtual address. It's translated to the physical address AND at the same time some special flags are attached to this load specifying that it is IO (so it mustn't be cached, it doesn't execute until it's non-speculative, it is ordered relative to other device IO, etc.).

The load bypasses the cache, hits the NoC, and then ... Profit!
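
To make the model above concrete, here is a toy sketch of the translate-and-tag step. Everything in it is an assumption for illustration: the 4 KiB page size, the `device` flag in the page table entry, and the `BusRequest` attribute names are hypothetical and not taken from any real architecture.

```python
from dataclasses import dataclass

PAGE_SHIFT = 12  # assumed 4 KiB pages


@dataclass(frozen=True)
class PageTableEntry:
    phys_page: int  # physical page number
    device: bool    # hypothetical "this is IO/device memory" flag


@dataclass(frozen=True)
class BusRequest:
    phys_addr: int
    cacheable: bool       # may the access be satisfied from a cache?
    speculative_ok: bool  # may it issue before it is non-speculative?
    ordered: bool         # must it stay ordered against other device IO?


def translate(page_table, vaddr):
    """Translate vaddr and attach the attributes implied by the PTE."""
    pte = page_table[vaddr >> PAGE_SHIFT]
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    paddr = (pte.phys_page << PAGE_SHIFT) | offset
    return BusRequest(
        phys_addr=paddr,
        cacheable=not pte.device,
        speculative_ok=not pte.device,
        ordered=pte.device,
    )


# One device page and one ordinary memory page (made-up numbers).
page_table = {
    0x7F000: PageTableEntry(phys_page=0xFE000, device=True),
    0x00400: PageTableEntry(phys_page=0x12345, device=False),
}

mmio_req = translate(page_table, 0x7F000010)
mem_req = translate(page_table, 0x00400040)
```

Note that what leaves the core in this sketch is still just a physical address plus attribute bits; nothing here names the target device, which is exactly the gap the rest of the question is about.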

OK, options
1. each switch in the NoC has something like a TLB that maps certain particular address ranges to particular hardware. So when a request comes in the first thing is to look up the address in that table. If we see the address corresponds to device X, some sort of alternate routing is used that bypasses L3, coherency, the memory controller and all that, and sends the request to device X.
Logically this seems correct, but I've never heard of such a table, and it seems problematic to configure given how many switch/routing points there are in a NoC.

2. we decompose the problem to something less performant but hierarchical. So there's only one location, one router in the NoC, that knows how to handle IO. When a switch receives a request with an IO tag (or a Device Ordering tag, or whatever language the platform uses) it sends the packet to the IO switch. That one location knows enough to send this address range to PCIe bus 1, that address range to the south bridge, and so on. Then each of those handles it in its own way.
That kinda works but consider things like IPIs. I think to actually make it work, we have to slightly extend the addressing rule to something like
- an address is a 16+64b number
- normal addresses have the upper 16b as zero, and are routed as addresses
- device addresses have the upper 16b as non-zero (the other 64b can be whatever). The switching system (distributed, or at a central location) is responsible for mapping an initial MMIO physical address to an upper16b address.
This is kinda like routing on an ethernet based on either MAC or IP address depending on what you're trying to do, I guess.
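
A minimal sketch of what either option boils down to: a range table that maps a physical address to a target ID, plus the 16+64b extended address built from the result. The address ranges, node IDs, and function names are all made up for illustration; whether such a table lives in every switch (option 1) or in one central place (option 2) does not change the lookup itself.

```python
from bisect import bisect_right

# Hypothetical address map: (base, limit_exclusive, node_id), sorted by base.
# node_id 0 stands for "normal memory, route by address".
ADDRESS_MAP = [
    (0x0000_0000, 0x8000_0000, 0),  # DRAM
    (0xE000_0000, 0xF000_0000, 5),  # e.g. a PCIe root
    (0xFE00_0000, 0xFE01_0000, 9),  # e.g. some peripheral block
]
BASES = [base for base, _, _ in ADDRESS_MAP]


def decode(paddr):
    """Return the node ID owning paddr, or None if nothing is mapped there."""
    i = bisect_right(BASES, paddr) - 1
    if i >= 0:
        _, limit, node_id = ADDRESS_MAP[i]
        if paddr < limit:
            return node_id
    return None


def extend(paddr):
    """Build the 16+64b address: upper 16 bits = node ID, lower 64 = paddr."""
    node_id = decode(paddr)
    if node_id is None:
        raise ValueError(f"no target mapped at {paddr:#x}")
    return (node_id << 64) | paddr


dram_ext = extend(0x1000)         # upper 16 bits are zero
mmio_ext = extend(0xFE00_0010)    # upper 16 bits are 9
```

With a handful of ranges the table is tiny, so the cost of option 1 is less the lookup than keeping many copies of the table consistent.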

-------------------

OK the above tries to lay out where I see difficulties in handling MMIO.
Now let's consider the case of a CPU simply requesting a line from "memory".
So the CPU asks the cache, the cache can't find it, it goes to L2, and eventually L2 submits a message to the NoC that presumably includes, among other things (requesting CPU, address).
OK, that message routes to some L3 (based on a hash of the address), L3 says no data, it routes to the memory controller, memory supplies the line, which now has to route back to the originator CPU.
And we face the same issue. Once again, how is the return response "addressed"? Once again it seems like the initially simple claim that "*every* address on the NoC is a physical memory address" can't actually work; there has to be a kinda parallel addressing/routing machinery driven by some sort of abstract "IP Block ID".
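
The round trip described above can be sketched as follows, assuming packets carry a destination ID, a source ID, and a transaction ID alongside the physical address. The ID values, the slice count, the hash, and the packet fields are all hypothetical; the point is only that the response is routed by the requester's ID, not by the memory address.

```python
from dataclasses import dataclass

NUM_L3_SLICES = 4   # assumed
LINE_SHIFT = 6      # assumed 64-byte cache lines
L3_BASE_ID = 0x10   # hypothetical: L3 slices are nodes 0x10..0x13
MEMCTRL_ID = 0x20   # hypothetical memory controller node


@dataclass(frozen=True)
class Packet:
    dst_id: int  # what the NoC actually routes on
    src_id: int  # who sent this packet
    txn_id: int  # lets the requester match response to request
    paddr: int   # carried as payload, not used as the route
    kind: str    # "ReadReq" or "DataResp"


def l3_slice_for(paddr):
    """Toy XOR-fold hash from line address to L3 slice index."""
    line = paddr >> LINE_SHIFT
    return (line ^ (line >> 7) ^ (line >> 13)) % NUM_L3_SLICES


def make_request(src_id, txn_id, paddr):
    """L2 miss: address the request to the L3 slice chosen by the hash."""
    return Packet(L3_BASE_ID + l3_slice_for(paddr), src_id, txn_id, paddr, "ReadReq")


def forward_on_miss(req):
    """L3 miss: pass the request on, keeping the ORIGINAL requester ID."""
    return Packet(MEMCTRL_ID, req.src_id, req.txn_id, req.paddr, "ReadReq")


def make_response(req, responder_id):
    """Data return: destination is the requester recorded in the request."""
    return Packet(req.src_id, responder_id, req.txn_id, req.paddr, "DataResp")


req = make_request(src_id=3, txn_id=7, paddr=0x1234_5640)
fwd = forward_on_miss(req)
resp = make_response(fwd, responder_id=MEMCTRL_ID)
```

Here `resp.dst_id` is 3, the CPU that started it all, so the data finds its way back without any node on the return path needing to interpret the physical address.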

So was my initial starting point incorrect? Is it better to say that, in fact, for every request on the NoC, while it may CARRY a memory address as part of the request, the actual NoC address is some small (16-bit or 8-bit or so) identifier?

So in fact when a request goes out to "memory", what hits the NoC is a packet whose address is "memory system", and the NoC converts that into a sequence of routings (first to the appropriate L3, then to the appropriate memory controller). [So there's no precise layering; the NoC looks inside the packet, at least at the physical memory address field, to make some decisions.]

This seems plausible, but I'm still left wondering where EXACTLY the mapping to the target device NoC address happens when I make an MMIO request to do something like write to the register at address 0xabcd. As software all I know is I am writing to a virtual (to be translated to physical) address. Somewhere between that and the NoC, a NoC device address has to be looked up. A natural place for this would be the TLB, except I've never heard of such a thing, and there are way more possibilities than can fit in a page table entry. So???

Anyone understand what I am asking and how it's done?