What are your ideas for a radically different CPU ISA + physical Arch?

By: rwessel (rwessel.delete@this.yahoo.com), March 21, 2021 4:02 am
Room: Moderated Discussions
Moritz (better.delete@this.not.tell) on March 20, 2021 5:21 am wrote:
> What if you could completely rethink the general processor concept?
> There are concepts that were without alternative in the days of little memory and few transistors:
> Sequential instructions by storage address and jumps based on that address
> Implicit dependency based on above principle
> Explicit naming of storage place rather than data item
> Explicit caching into registers
> Implicit addressing of registers
> Mixing of memory, float, integer instructions in one instruction stream
> that must be analyzed to remove the assumed sequentiallity.
> The ISA used to represent the physical architecture, today that
> is no longer the case in high performance microprocessors.
> The data modifies the program flow at run-time, instead of explicitly generating the data stream
> that reaches the execution units. The CPU steps through the program issuing the data to EUs instead
> of the program explicitly generating multiple data streams with synchronization markers.
> ... and many other implications that are so "natural" to us that we can not see/name them. As usual
> we can not even question the ways, because we are so used to them. There are infinite bad ways of doing
> it, but some of those forced/obvious (legacy) design decisions of the past might no longer be that
> necessary/without alternative. Some ways that seem cumbersome and wasteful might on second thought
> turn out to be hard on the human, but open new ways to the compiler, RTE, OS, CPU removing as much
> complexity as they add, but increasing throughput or energy efficiency beyond the current limit.


To a large extent none of that matters. Instruction oriented ISA (mis)features at best/worst are going to impact performance and power consumption by a few tens of percent, most of the time considerably less than that. Even within an ISA it's not that hard to deprecate a particularly poor feature - just punt it to slow microcode, and people will stop using it. As fewer people use it, punt it to even slower microcode, with essentially no implementation overhead.

We will get some increases in performances as chip technology improves, but we'll largely get that anyway, and I rather doubt that many people are expecting large increases there. We've managed what, a factor of ten in single core integer performance the last 15 years? And half of that from compilers painfully clawing out a bit of autovectorization on semi-hacked benchmarks. While spending something like 100 times as many transistor per core?

Let's quote Gene Amdahl (1967):


For over a decade prophets have voiced the contention that the organization of a single computer has reached its limits and that truly significant advances can be made only by interconnection of a multiplicity of computers in such a manner as to permit cooperative solution. Variously the proper direction has been pointed out as general purpose computers with a generalized interconnection of memories, or as specialized computers with geometrically related memory interconnections and controlled by one or more instruction streams.

Demonstration is made of the continued validity of the single processor approach and of the weaknesses of the multiple processor approach in terms of application to real problems and their attendant irregularities.


We've now reached the point that the "continued validity of the single processor approach" has been pretty solidly demonstrated to be false.

Which means that if there's any real hope for large increases in performance, it has to come form parallelism. We have mechanisms for some EP problems now (GPGPUs, vectors), and for EP-ish problems with limited communication between threads (standard multiple processors, to clusters, for code requiring really limited communication).

But if you're fortunate enough to have an EP problem, a few tens of percent difference in single core performance mostly doesn't matter anyway.

Of course we've been failing, pretty miserably (outside some limited areas), to generally use parallelism for better than four decades now. For the data-parallel approaches (vectors, etc.), it's even worse.

One of the biggest problems with multi-threaded code is the very high overhead of IPC. It make it largely impossible to split off small units of work. It make it really hard to split off medium sized units of work. And if all you can actually split off is large units of work, we're back to handling only EP problems, except for thing where we can justify heroics.

Fundamentally there's literally no room in any of that for the OS. Similar to how software TLB fills are a really bad idea, CPUs need to handle scheduling, intercontext calls, synchronization objects, interrupts and the like, *without* the OS getting involved (the OS would obviously be involved in setup). Thus CPUs need to hold multiple contexts (think a "TLB" for contexts), and be able to switch between them rapidly (IOW, SMT). A signal or message to a blocked thread whose context is held by the current core should dispatch in a couple of cycles; across the system, in about the time a cache coherency event needs to traverse the network). At the very small level, something like microthreads, with dispatch and synchronization times on the order of a cycle. It should be reasonable to throw a fork-microthread/wait-microthread around a dozen instructions).

Many sorts of I/O need to be possible at the user level, which means something like IOMMUs, smart enough that the OS can set it up for a process, and then avoid getting involved in the actual I/Os (interrupt/signal delivery from the I/O device shouldn't involve the OS either). That doesn't preclude devices that aren't capable of working in that environment, but handling that would be no worse than what we have now.

On the other hand it's not clear that this doesn't fall right back into the "magic compiler" trap, but parallelism is the only thing that might save us.
< Previous Post in ThreadNext Post in Thread >
TopicPosted ByDate
What are your ideas for a radically different CPU ISA + physical Arch?Moritz2021/03/20 04:21 AM
  What are your ideas for a radically different CPU ISA + physical Arch?Stanislav Shwartsman2021/03/20 05:22 AM
    I like the analysis of current arch presentedMoritz2021/03/20 09:13 AM
    Did you read this old article?Michael S2021/03/21 01:12 AM
  Deliver programs in IRHugo Décharnes2021/03/20 06:34 AM
    Java bytecode and Wasm exist, why invent something else? (NT)Foo_2021/03/20 07:01 AM
      Java bytecode and Wasm exist, why invent something else?Hugo Décharnes2021/03/20 07:55 AM
        Java bytecode and Wasm exist, why invent something else?Foo_2021/03/20 09:50 AM
          Java bytecode and Wasm exist, why invent something else?Hugo Décharnes2021/03/20 11:40 AM
            Java bytecode and Wasm exist, why invent something else?Foo_2021/03/20 03:54 PM
              It's called source code, no?anonymou52021/03/20 11:43 PM
                It's called source code, no?Foo_2021/03/21 04:07 AM
                Thoughts on software distribution formatsPaul A. Clayton2021/03/22 12:45 PM
    Deliver programs in IRJames2021/03/20 10:24 AM
      Deliver programs in IRHugo Décharnes2021/03/20 11:28 AM
        Deliver programs in IRHugo Décharnes2021/03/20 11:36 AM
    Deliver programs in IRLinus Torvalds2021/03/20 12:20 PM
      Deliver programs in IRHugo Décharnes2021/03/20 12:51 PM
      I'd like to be able to NOT specify order for some things ...Mark Roulo2021/03/20 04:49 PM
        I'd like to be able to NOT specify order for some things ...Jukka Larja2021/03/20 11:26 PM
          NOT (unintentionally) specify orderMoritz2021/03/21 05:00 AM
            NOT (unintentionally) specify orderJukka Larja2021/03/22 06:11 AM
              NOT (unintentionally) specify orderMoritz2021/03/22 11:40 AM
                NOT (unintentionally) specify orderJukka Larja2021/03/23 05:26 AM
          I'd like to be able to NOT specify order for some things ...Mark Roulo2021/03/21 08:47 AM
            I'd like to be able to NOT specify order for some things ...Victor Alander2021/03/21 04:14 PM
      Next architecture will start with MLwumpus2021/03/21 11:24 AM
        Next architecture will start with MLLinus Torvalds2021/03/21 01:38 PM
          Maybe SQL was the better example for general purpose machineswumpus2021/03/22 07:33 AM
            Maybe SQL was the better example for general purpose machinesanon2021/03/22 08:10 AM
        Next architecture will start with MLML will move to PIM2021/03/22 02:51 AM
    Deliver programs in IRanon2021/03/21 02:22 AM
      Deliver programs in IRanon22021/03/21 03:52 AM
        Deliver programs in IRrwessel2021/03/21 04:05 AM
          Deliver programs in IRanon22021/03/21 06:08 PM
            Deliver programs in IRrwessel2021/03/21 09:47 PM
              Deliver programs in IRdmcq2021/03/22 03:33 AM
                Deliver programs in IRrwessel2021/03/22 05:27 AM
  What are your ideas for a radically different CPU ISA + physical Arch?Veedrac2021/03/20 10:27 AM
    Cray MTAanon2021/03/20 05:04 PM
      Cray MTAChester2021/03/20 06:54 PM
        Cray MTAVeedrac2021/03/21 12:33 AM
          Cray MTAnoone2021/03/21 08:15 AM
            Cray MTAVeedrac2021/03/21 09:54 AM
    monolithic 3Dwumpus2021/03/21 11:50 AM
  What are your ideas for a radically different CPU ISA + physical Arch?Anon2021/03/20 11:06 PM
  What are your ideas for a radically different CPU ISA + physical Arch?rwessel2021/03/21 04:02 AM
  What are your ideas for a radically different CPU ISA + physical Arch?juanrga2021/03/21 04:46 AM
  Summery so farMoritz2021/03/21 08:45 AM
    Summery so farrwessel2021/03/21 10:23 AM
      not staticMoritz2021/03/26 09:12 AM
        Dynamic meta instruction encoding for instruction window compressionMoritz2021/03/28 02:28 AM
          redistributing the work between static compiler, dynamic compiler, CPUMoritz2021/04/05 02:21 AM
            redistributing the work between static compiler, dynamic compiler, CPUdmcq2021/04/05 08:27 AM
    Summery so farAnon2021/03/21 07:53 PM
  What are your ideas for a radically different CPU ISA + physical Arch?blaine2021/03/21 09:10 AM
    What are your ideas for a radically different CPU ISA + physical Arch?rwessel2021/03/21 10:26 AM
      What are your ideas for a radically different CPU ISA + physical Arch?rwessel2021/03/21 10:34 AM
        What are your ideas for a radically different CPU ISA + physical Arch?blaine2021/03/21 11:55 AM
          What are your ideas for a radically different CPU ISA + physical Arch?rwessel2021/03/21 12:31 PM
      What are your ideas for a radically different CPU ISA + physical Arch?gallier22021/03/21 11:49 PM
  What are your ideas for a radically different CPU ISA + physical Arch?dmcq2021/03/21 02:50 PM
  Microthread/low IPCEtienne Lorrain2021/03/22 02:22 AM
    Microthread/low IPCdmcq2021/03/22 03:24 AM
      Microthread/low IPCEtienne Lorrain2021/03/22 05:10 AM
        Microthread/low IPCdmcq2021/03/22 07:24 AM
    Microthread/low IPCdmcq2021/03/22 03:53 AM
      Microthread/low IPCEtienne Lorrain2021/03/22 04:46 AM
      Microthread/low IPCAnon2021/03/22 04:47 AM
    Microthread/low IPCHeikki Kultala2021/03/22 04:47 PM
      Microthread/low IPCEtienne Lorrain2021/03/23 02:36 AM
        Microthread/low IPCNyan2021/03/24 02:00 AM
          Microthread/low IPCEtienne Lorrain2021/03/24 03:23 AM
      Microthread/low IPCAnon2021/03/23 07:16 AM
        Microthread/low IPCgai2021/03/23 08:37 AM
          Microthread/low IPCAnon2021/03/23 09:17 AM
            Microthread/low IPCdmcq2021/03/23 11:42 AM
  Have you looked at "The Mill CPU" project? (nt)Anon C2021/03/22 05:21 AM
    Have you looked at "The Mill CPU" project? (nt)Moritz2021/03/22 11:13 AM
      Have you looked at "The Mill CPU" project? (nt)Andrew Clough2021/03/22 03:27 PM
        The Mill = vaporwareRichardC2021/03/23 11:47 AM
          The Mill = vaporwareMichael S2021/03/23 12:58 PM
          The Mill = vaporwareCarson2021/03/23 05:17 PM
          The Mill = doomed but interestingAndrew Clough2021/03/24 07:06 AM
            Solution in search of a problemwumpus2021/03/24 07:52 AM
              Solution in search of a problemdmcq2021/03/24 09:22 AM
          never-ware != vaporware (at least in connotation)Paul A. Clayton2021/03/24 09:37 AM
  What are your ideas for a radically different CPU ISA + physical Arch?anonini2021/03/22 07:28 AM
    microcode that can combine instructionMoritz2021/03/22 11:26 AM
  What are your ideas for a radically different CPU ISA + physical Arch?anony2021/03/22 09:16 AM
    Totally clueless.Heikki Kultala2021/03/22 04:53 PM
  Hierarchical instruction setHeikki Kultala2021/03/22 05:52 PM
    Hierarchical instruction setVeedrac2021/03/23 02:49 AM
      Hierarchical instruction setHeikki Kultala2021/03/23 05:46 AM
        Hierarchical instruction setEtienne Lorrain2021/03/23 06:16 AM
          microthreads on OS call/exceptionHeikki Kultala2021/03/23 06:34 AM
        Hierarchical instruction setVeedrac2021/03/23 08:31 AM
          Hierarchical instruction setEtienne Lorrain2021/03/24 12:13 AM
            Hierarchical instruction setVeedrac2021/03/24 06:11 AM
    Hierarchical instruction setAnon2021/03/23 07:39 AM
  What are your ideas for a radically different CPU ISA + physical Arch?Paul A. Clayton2021/03/26 07:21 AM
    What are your ideas for a radically different CPU ISA + physical Arch?wumpus2021/03/26 08:45 AM
Reply to this Topic
Name:
Email:
Topic:
Body: No Text
How do you spell avocado?