Microthread/low IPC

By: Etienne Lorrain (etienne_lorrain.delete@this.yahoo.fr), March 24, 2021 4:23 am
Room: Moderated Discussions
Nyan (nyan.delete@this.mailinator.com) on March 24, 2021 3:00 am wrote:
> Etienne Lorrain (etienne_lorrain.delete@this.yahoo.fr) on March 23, 2021 3:36 am wrote:
> > How do you explain then that average IPC can easily fall down below 1 on 100% CPU use,
> > like in "sudo perf stat /usr/bin/md5sum /usr/bin/md5sum" line "insn per cycle"?
> > It looks to me none of those hundreds of in-flight instructions are ready to execute,
> > probably instructions prerequisite not ready
>
> The md5sum executable is like 43KB on my system. For hashing 43KB, a lot of the time is probably spent on all
> sorts of cold cases: like disk I/O, context switching, branch misprediction, memory loading, overheads etc.
>
> With a much larger source, say `dd if=/dev/zero bs=1M count=512 | perf stat /usr/bin/md5sum`
> I get above 1 IPC as expected. Though it won't be much higher due to MD5 being a long dependency
> chain
- this limitation likely also affects the above 43KB case as well.
>
> On my system, `perf stat /usr/bin/sha1sum /usr/bin/sha1sum` gets
> above 1 IPC, noting that SHA1 has much better ILP than MD5.

The problem is not that current processors cannot do IPC of 4 for specific loads, it is that most of the time processors do not achieve at least 1 for standard loads.
Saying that you can hash zero much quicker is like telling word-processor users to only type one single letter a massive amount of time to show that their word-processor is very efficient.

If you have 43 KB to hash, and the data is already loaded from disk because that is also the binary you are executing, disk I/O and memory loading should be negligible. You should not have a lot of context switching on such a small workload, and if branch misprediction is such a problem then there is little reasons to have few hundreds instructions in flight.
If you need 43*1024 times the same treatment for the performance to stabilise, then execution will nearly never stabilise on most software (unless you are trying to calculate crypto-currencies).
sha1sum has just been chosen because it is easy to reproduce, most other smaller load (un-accelerated by custom processor hardware) would also do.
< Previous Post in ThreadNext Post in Thread >
TopicPosted ByDate
What are your ideas for a radically different CPU ISA + physical Arch?Moritz2021/03/20 05:21 AM
  What are your ideas for a radically different CPU ISA + physical Arch?Stanislav Shwartsman2021/03/20 06:22 AM
    I like the analysis of current arch presentedMoritz2021/03/20 10:13 AM
    Did you read this old article?Michael S2021/03/21 02:12 AM
  Deliver programs in IRHugo Décharnes2021/03/20 07:34 AM
    Java bytecode and Wasm exist, why invent something else? (NT)Foo_2021/03/20 08:01 AM
      Java bytecode and Wasm exist, why invent something else?Hugo Décharnes2021/03/20 08:55 AM
        Java bytecode and Wasm exist, why invent something else?Foo_2021/03/20 10:50 AM
          Java bytecode and Wasm exist, why invent something else?Hugo Décharnes2021/03/20 12:40 PM
            Java bytecode and Wasm exist, why invent something else?Foo_2021/03/20 04:54 PM
              It's called source code, no?anonymou52021/03/21 12:43 AM
                It's called source code, no?Foo_2021/03/21 05:07 AM
                Thoughts on software distribution formatsPaul A. Clayton2021/03/22 01:45 PM
    Deliver programs in IRJames2021/03/20 11:24 AM
      Deliver programs in IRHugo Décharnes2021/03/20 12:28 PM
        Deliver programs in IRHugo Décharnes2021/03/20 12:36 PM
    Deliver programs in IRLinus Torvalds2021/03/20 01:20 PM
      Deliver programs in IRHugo Décharnes2021/03/20 01:51 PM
      I'd like to be able to NOT specify order for some things ...Mark Roulo2021/03/20 05:49 PM
        I'd like to be able to NOT specify order for some things ...Jukka Larja2021/03/21 12:26 AM
          NOT (unintentionally) specify orderMoritz2021/03/21 06:00 AM
            NOT (unintentionally) specify orderJukka Larja2021/03/22 07:11 AM
              NOT (unintentionally) specify orderMoritz2021/03/22 12:40 PM
                NOT (unintentionally) specify orderJukka Larja2021/03/23 06:26 AM
          I'd like to be able to NOT specify order for some things ...Mark Roulo2021/03/21 09:47 AM
            I'd like to be able to NOT specify order for some things ...Victor Alander2021/03/21 05:14 PM
      Next architecture will start with MLwumpus2021/03/21 12:24 PM
        Next architecture will start with MLLinus Torvalds2021/03/21 02:38 PM
          Maybe SQL was the better example for general purpose machineswumpus2021/03/22 08:33 AM
            Maybe SQL was the better example for general purpose machinesanon2021/03/22 09:10 AM
        Next architecture will start with MLML will move to PIM2021/03/22 03:51 AM
    Deliver programs in IRanon2021/03/21 03:22 AM
      Deliver programs in IRanon22021/03/21 04:52 AM
        Deliver programs in IRrwessel2021/03/21 05:05 AM
          Deliver programs in IRanon22021/03/21 07:08 PM
            Deliver programs in IRrwessel2021/03/21 10:47 PM
              Deliver programs in IRdmcq2021/03/22 04:33 AM
                Deliver programs in IRrwessel2021/03/22 06:27 AM
  What are your ideas for a radically different CPU ISA + physical Arch?Veedrac2021/03/20 11:27 AM
    Cray MTAanon2021/03/20 06:04 PM
      Cray MTAChester2021/03/20 07:54 PM
        Cray MTAVeedrac2021/03/21 01:33 AM
          Cray MTAnoone2021/03/21 09:15 AM
            Cray MTAVeedrac2021/03/21 10:54 AM
    monolithic 3Dwumpus2021/03/21 12:50 PM
  What are your ideas for a radically different CPU ISA + physical Arch?Anon2021/03/21 12:06 AM
  What are your ideas for a radically different CPU ISA + physical Arch?rwessel2021/03/21 05:02 AM
  What are your ideas for a radically different CPU ISA + physical Arch?juanrga2021/03/21 05:46 AM
  Summery so farMoritz2021/03/21 09:45 AM
    Summery so farrwessel2021/03/21 11:23 AM
      not staticMoritz2021/03/26 10:12 AM
        Dynamic meta instruction encoding for instruction window compressionMoritz2021/03/28 03:28 AM
          redistributing the work between static compiler, dynamic compiler, CPUMoritz2021/04/05 03:21 AM
            redistributing the work between static compiler, dynamic compiler, CPUdmcq2021/04/05 09:27 AM
    Summery so farAnon2021/03/21 08:53 PM
  What are your ideas for a radically different CPU ISA + physical Arch?blaine2021/03/21 10:10 AM
    What are your ideas for a radically different CPU ISA + physical Arch?rwessel2021/03/21 11:26 AM
      What are your ideas for a radically different CPU ISA + physical Arch?rwessel2021/03/21 11:34 AM
        What are your ideas for a radically different CPU ISA + physical Arch?blaine2021/03/21 12:55 PM
          What are your ideas for a radically different CPU ISA + physical Arch?rwessel2021/03/21 01:31 PM
      What are your ideas for a radically different CPU ISA + physical Arch?gallier22021/03/22 12:49 AM
  What are your ideas for a radically different CPU ISA + physical Arch?dmcq2021/03/21 03:50 PM
  Microthread/low IPCEtienne Lorrain2021/03/22 03:22 AM
    Microthread/low IPCdmcq2021/03/22 04:24 AM
      Microthread/low IPCEtienne Lorrain2021/03/22 06:10 AM
        Microthread/low IPCdmcq2021/03/22 08:24 AM
    Microthread/low IPCdmcq2021/03/22 04:53 AM
      Microthread/low IPCEtienne Lorrain2021/03/22 05:46 AM
      Microthread/low IPCAnon2021/03/22 05:47 AM
    Microthread/low IPCHeikki Kultala2021/03/22 05:47 PM
      Microthread/low IPCEtienne Lorrain2021/03/23 03:36 AM
        Microthread/low IPCNyan2021/03/24 03:00 AM
          Microthread/low IPCEtienne Lorrain2021/03/24 04:23 AM
      Microthread/low IPCAnon2021/03/23 08:16 AM
        Microthread/low IPCgai2021/03/23 09:37 AM
          Microthread/low IPCAnon2021/03/23 10:17 AM
            Microthread/low IPCdmcq2021/03/23 12:42 PM
  Have you looked at "The Mill CPU" project? (nt)Anon C2021/03/22 06:21 AM
    Have you looked at "The Mill CPU" project? (nt)Moritz2021/03/22 12:13 PM
      Have you looked at "The Mill CPU" project? (nt)Andrew Clough2021/03/22 04:27 PM
        The Mill = vaporwareRichardC2021/03/23 12:47 PM
          The Mill = vaporwareMichael S2021/03/23 01:58 PM
          The Mill = vaporwareCarson2021/03/23 06:17 PM
          The Mill = doomed but interestingAndrew Clough2021/03/24 08:06 AM
            Solution in search of a problemwumpus2021/03/24 08:52 AM
              Solution in search of a problemdmcq2021/03/24 10:22 AM
          never-ware != vaporware (at least in connotation)Paul A. Clayton2021/03/24 10:37 AM
  What are your ideas for a radically different CPU ISA + physical Arch?anonini2021/03/22 08:28 AM
    microcode that can combine instructionMoritz2021/03/22 12:26 PM
  What are your ideas for a radically different CPU ISA + physical Arch?anony2021/03/22 10:16 AM
    Totally clueless.Heikki Kultala2021/03/22 05:53 PM
  Hierarchical instruction setHeikki Kultala2021/03/22 06:52 PM
    Hierarchical instruction setVeedrac2021/03/23 03:49 AM
      Hierarchical instruction setHeikki Kultala2021/03/23 06:46 AM
        Hierarchical instruction setEtienne Lorrain2021/03/23 07:16 AM
          microthreads on OS call/exceptionHeikki Kultala2021/03/23 07:34 AM
        Hierarchical instruction setVeedrac2021/03/23 09:31 AM
          Hierarchical instruction setEtienne Lorrain2021/03/24 01:13 AM
            Hierarchical instruction setVeedrac2021/03/24 07:11 AM
    Hierarchical instruction setAnon2021/03/23 08:39 AM
  What are your ideas for a radically different CPU ISA + physical Arch?Paul A. Clayton2021/03/26 08:21 AM
    What are your ideas for a radically different CPU ISA + physical Arch?wumpus2021/03/26 09:45 AM
Reply to this Topic
Name:
Email:
Topic:
Body: No Text
How do you spell tangerine? 🍊