Hierarchical instruction set

By: Anon (no.delete@this.spam.com), March 23, 2021 8:39 am
Room: Moderated Discussions
Heikki Kultala (heikki.kultal.a.delete@this.gmail.com) on March 22, 2021 6:52 pm wrote:
> Lowest level or the hierarchy is single instruction, which reads data from
> internal bus/pipeline register, one register or immediate parameter, and produces
> value in the same internal bus/pipeline register than what it read
>
> Second level hierarchy is a serial bypass-bundle, something like 0-4 instructions. The bundle
> header contains one register read, number of instructions in the bundle, and target register
> index where the value from the (last instruction of the) bundle is written to. 0-instruction
> bundle is just a move. No interrupts/exceptions or side effects may happen inside the bundle,
> only after it/between them. Whole bundle practically executes atomically.
>
> OoOE can be done freely between the bundles, theoretically
> bundles could also be split by the HW dynamically.
>
> The next level hierarchy is a basic block. The jump do not have to be the last instruction of the basic
> block, for example the header for the basic block of a loop body may say that the basic block contains
> 4 bypass bundles and then branch can be already in the first one, but as the size of the basic block
> is 4 bundles, all the instructions in all of those 4 bundles get executed. The basic block header may
> also contain a loop count (either immediate or index to a register containing it). In this case no branch
> instructions are needed inside the loop and the basic block is still executed N times.

My mental model of a ideal ISA is very similar to what you describe.

> The next level of the hierarchy is microthreading, which is done fully on hardware, no SW overhead of
> starting threads more than single instrucion for fork and single instruction for join. All the microthreads
> have same virtual memory mapping, and there are just one instruction which starts another microthread,
> returning odd value to one thread and even value to another thread, and couple of different instructions
> to join thread. Some of these thread join instructions also perform some reduce operation on single-register
> return value data from both micro-threads, for example min, max, or add.
>
> The implementation is always free to execute the microthreads sequentially (common case if all our
> hardware microthreads are already in use, for example started by outer level function); programmer
> can write his code/compiler can compile the code like he/it has infinite amount of microthreads
> available. As the bundles execute atomically, different microthreads can still do things like incrementing
> the same counter in memory, but as they are allowed to execute sequentially, they are not allowed
> to wait data from one another microthread because that might cause a deadlock.
>
> This kind of microthreading should be VERY fast for creating new threads and joining existing ones, as
> it can be done fully on hardware: If there is a free HW microthread available, the ufork instruction just
> sets the PC and RAT for it(both threads get the same register contents) and activates it, if there are
> no, the beginning instruction address of the microthread is and GPR contents is put into (HW) queue and
> no new thread is started yet. When some thread exits, new one is taken from the queue and executed.
>
> The HW is free to even start executing the microthreads in SIMT way.
>
> As there is no limit how many waiting microthreads there may be, there is a hardware-based stack which can spill
> the PC and GPRs of a waiting microthread into memory into special protected memory location allocated for it
> by the OS. The only thing the OS does is allocating space for the stack. The same protected stack might also
> handle function return addresses so that those cannot be overwritten by normal load/store operations.

Here instead of a full hardware stack/queue I was imagining that the fork instruction would specify a memory location for the new microthread context data including a field for a pointer to the next microthread in the queue (or previous, if microthreads are FILO) and the hardware itself only have to specify two register to control this, a first and last item of the queue.
< Previous Post in ThreadNext Post in Thread >
TopicPosted ByDate
What are your ideas for a radically different CPU ISA + physical Arch?Moritz2021/03/20 05:21 AM
  What are your ideas for a radically different CPU ISA + physical Arch?Stanislav Shwartsman2021/03/20 06:22 AM
    I like the analysis of current arch presentedMoritz2021/03/20 10:13 AM
    Did you read this old article?Michael S2021/03/21 02:12 AM
  Deliver programs in IRHugo Décharnes2021/03/20 07:34 AM
    Java bytecode and Wasm exist, why invent something else? (NT)Foo_2021/03/20 08:01 AM
      Java bytecode and Wasm exist, why invent something else?Hugo Décharnes2021/03/20 08:55 AM
        Java bytecode and Wasm exist, why invent something else?Foo_2021/03/20 10:50 AM
          Java bytecode and Wasm exist, why invent something else?Hugo Décharnes2021/03/20 12:40 PM
            Java bytecode and Wasm exist, why invent something else?Foo_2021/03/20 04:54 PM
              It's called source code, no?anonymou52021/03/21 12:43 AM
                It's called source code, no?Foo_2021/03/21 05:07 AM
                Thoughts on software distribution formatsPaul A. Clayton2021/03/22 01:45 PM
    Deliver programs in IRJames2021/03/20 11:24 AM
      Deliver programs in IRHugo Décharnes2021/03/20 12:28 PM
        Deliver programs in IRHugo Décharnes2021/03/20 12:36 PM
    Deliver programs in IRLinus Torvalds2021/03/20 01:20 PM
      Deliver programs in IRHugo Décharnes2021/03/20 01:51 PM
      I'd like to be able to NOT specify order for some things ...Mark Roulo2021/03/20 05:49 PM
        I'd like to be able to NOT specify order for some things ...Jukka Larja2021/03/21 12:26 AM
          NOT (unintentionally) specify orderMoritz2021/03/21 06:00 AM
            NOT (unintentionally) specify orderJukka Larja2021/03/22 07:11 AM
              NOT (unintentionally) specify orderMoritz2021/03/22 12:40 PM
                NOT (unintentionally) specify orderJukka Larja2021/03/23 06:26 AM
          I'd like to be able to NOT specify order for some things ...Mark Roulo2021/03/21 09:47 AM
            I'd like to be able to NOT specify order for some things ...Victor Alander2021/03/21 05:14 PM
      Next architecture will start with MLwumpus2021/03/21 12:24 PM
        Next architecture will start with MLLinus Torvalds2021/03/21 02:38 PM
          Maybe SQL was the better example for general purpose machineswumpus2021/03/22 08:33 AM
            Maybe SQL was the better example for general purpose machinesanon2021/03/22 09:10 AM
        Next architecture will start with MLML will move to PIM2021/03/22 03:51 AM
    Deliver programs in IRanon2021/03/21 03:22 AM
      Deliver programs in IRanon22021/03/21 04:52 AM
        Deliver programs in IRrwessel2021/03/21 05:05 AM
          Deliver programs in IRanon22021/03/21 07:08 PM
            Deliver programs in IRrwessel2021/03/21 10:47 PM
              Deliver programs in IRdmcq2021/03/22 04:33 AM
                Deliver programs in IRrwessel2021/03/22 06:27 AM
  What are your ideas for a radically different CPU ISA + physical Arch?Veedrac2021/03/20 11:27 AM
    Cray MTAanon2021/03/20 06:04 PM
      Cray MTAChester2021/03/20 07:54 PM
        Cray MTAVeedrac2021/03/21 01:33 AM
          Cray MTAnoone2021/03/21 09:15 AM
            Cray MTAVeedrac2021/03/21 10:54 AM
    monolithic 3Dwumpus2021/03/21 12:50 PM
  What are your ideas for a radically different CPU ISA + physical Arch?Anon2021/03/21 12:06 AM
  What are your ideas for a radically different CPU ISA + physical Arch?rwessel2021/03/21 05:02 AM
  What are your ideas for a radically different CPU ISA + physical Arch?juanrga2021/03/21 05:46 AM
  Summery so farMoritz2021/03/21 09:45 AM
    Summery so farrwessel2021/03/21 11:23 AM
      not staticMoritz2021/03/26 10:12 AM
        Dynamic meta instruction encoding for instruction window compressionMoritz2021/03/28 03:28 AM
          redistributing the work between static compiler, dynamic compiler, CPUMoritz2021/04/05 03:21 AM
            redistributing the work between static compiler, dynamic compiler, CPUdmcq2021/04/05 09:27 AM
    Summery so farAnon2021/03/21 08:53 PM
  What are your ideas for a radically different CPU ISA + physical Arch?blaine2021/03/21 10:10 AM
    What are your ideas for a radically different CPU ISA + physical Arch?rwessel2021/03/21 11:26 AM
      What are your ideas for a radically different CPU ISA + physical Arch?rwessel2021/03/21 11:34 AM
        What are your ideas for a radically different CPU ISA + physical Arch?blaine2021/03/21 12:55 PM
          What are your ideas for a radically different CPU ISA + physical Arch?rwessel2021/03/21 01:31 PM
      What are your ideas for a radically different CPU ISA + physical Arch?gallier22021/03/22 12:49 AM
  What are your ideas for a radically different CPU ISA + physical Arch?dmcq2021/03/21 03:50 PM
  Microthread/low IPCEtienne Lorrain2021/03/22 03:22 AM
    Microthread/low IPCdmcq2021/03/22 04:24 AM
      Microthread/low IPCEtienne Lorrain2021/03/22 06:10 AM
        Microthread/low IPCdmcq2021/03/22 08:24 AM
    Microthread/low IPCdmcq2021/03/22 04:53 AM
      Microthread/low IPCEtienne Lorrain2021/03/22 05:46 AM
      Microthread/low IPCAnon2021/03/22 05:47 AM
    Microthread/low IPCHeikki Kultala2021/03/22 05:47 PM
      Microthread/low IPCEtienne Lorrain2021/03/23 03:36 AM
        Microthread/low IPCNyan2021/03/24 03:00 AM
          Microthread/low IPCEtienne Lorrain2021/03/24 04:23 AM
      Microthread/low IPCAnon2021/03/23 08:16 AM
        Microthread/low IPCgai2021/03/23 09:37 AM
          Microthread/low IPCAnon2021/03/23 10:17 AM
            Microthread/low IPCdmcq2021/03/23 12:42 PM
  Have you looked at "The Mill CPU" project? (nt)Anon C2021/03/22 06:21 AM
    Have you looked at "The Mill CPU" project? (nt)Moritz2021/03/22 12:13 PM
      Have you looked at "The Mill CPU" project? (nt)Andrew Clough2021/03/22 04:27 PM
        The Mill = vaporwareRichardC2021/03/23 12:47 PM
          The Mill = vaporwareMichael S2021/03/23 01:58 PM
          The Mill = vaporwareCarson2021/03/23 06:17 PM
          The Mill = doomed but interestingAndrew Clough2021/03/24 08:06 AM
            Solution in search of a problemwumpus2021/03/24 08:52 AM
              Solution in search of a problemdmcq2021/03/24 10:22 AM
          never-ware != vaporware (at least in connotation)Paul A. Clayton2021/03/24 10:37 AM
  What are your ideas for a radically different CPU ISA + physical Arch?anonini2021/03/22 08:28 AM
    microcode that can combine instructionMoritz2021/03/22 12:26 PM
  What are your ideas for a radically different CPU ISA + physical Arch?anony2021/03/22 10:16 AM
    Totally clueless.Heikki Kultala2021/03/22 05:53 PM
  Hierarchical instruction setHeikki Kultala2021/03/22 06:52 PM
    Hierarchical instruction setVeedrac2021/03/23 03:49 AM
      Hierarchical instruction setHeikki Kultala2021/03/23 06:46 AM
        Hierarchical instruction setEtienne Lorrain2021/03/23 07:16 AM
          microthreads on OS call/exceptionHeikki Kultala2021/03/23 07:34 AM
        Hierarchical instruction setVeedrac2021/03/23 09:31 AM
          Hierarchical instruction setEtienne Lorrain2021/03/24 01:13 AM
            Hierarchical instruction setVeedrac2021/03/24 07:11 AM
    Hierarchical instruction setAnon2021/03/23 08:39 AM
  What are your ideas for a radically different CPU ISA + physical Arch?Paul A. Clayton2021/03/26 08:21 AM
    What are your ideas for a radically different CPU ISA + physical Arch?wumpus2021/03/26 09:45 AM
Reply to this Topic
Name:
Email:
Topic:
Body: No Text
How do you spell tangerine? 🍊