By: Nyan (nyan.delete@this.mailinator.com), March 24, 2021 3:00 am
Room: Moderated Discussions
Etienne Lorrain (etienne_lorrain.delete@this.yahoo.fr) on March 23, 2021 3:36 am wrote:
> How do you explain then that average IPC can easily fall down below 1 on 100% CPU use,
> like in "sudo perf stat /usr/bin/md5sum /usr/bin/md5sum" line "insn per cycle"?
> It looks to me none of those hundreds of in-flight instructions are ready to execute,
> probably instructions prerequisite not ready
The md5sum executable is only about 43KB on my system. When hashing just 43KB, a lot of the time is probably spent on all sorts of cold cases rather than in the hash loop itself: disk I/O, context switching, branch misprediction, memory loading, and other overheads.
With a much larger input, say `dd if=/dev/zero bs=1M count=512 | perf stat /usr/bin/md5sum`, I get above 1 IPC, as expected. It won't go much higher, though, because MD5 is one long dependency chain; this limitation likely affects the 43KB case above as well.
On my system, `perf stat /usr/bin/sha1sum /usr/bin/sha1sum` gets above 1 IPC even on that small input; note that SHA1 has much better ILP than MD5.
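To make the dependency-chain point concrete, here is a toy model, not a measurement. It assumes every instruction has a latency of one cycle, the core can start at most `width` instructions per cycle, and instructions are placed greedily in program order. The names `simulate_ipc`, `serial_chain` and `interleaved_chains` are made up for this sketch, and the "four independent chains" case is just an arbitrary stand-in for code with more ILP, not a model of SHA1.

```python
def simulate_ipc(deps, width):
    """Return instructions per cycle for an idealized core.

    deps[i] lists the earlier instructions that instruction i must wait for.
    Assumptions (toy model only): unit latency, at most `width` instructions
    started per cycle, greedy placement in program order.
    """
    done = []   # done[i] = cycle in which instruction i executes (1-based)
    used = {}   # used[c] = number of instructions already placed in cycle c
    for d in deps:
        # Earliest cycle is one after the latest producer has executed.
        cycle = 1 + max((done[j] for j in d), default=0)
        # Slide forward if that cycle's issue slots are already full.
        while used.get(cycle, 0) >= width:
            cycle += 1
        used[cycle] = used.get(cycle, 0) + 1
        done.append(cycle)
    return len(deps) / max(done)


def serial_chain(n):
    """Every instruction depends on the one immediately before it."""
    return [[] if i == 0 else [i - 1] for i in range(n)]


def interleaved_chains(n, k):
    """k independent chains, interleaved in program order."""
    return [[] if i < k else [i - k] for i in range(n)]


if __name__ == "__main__":
    # One long chain: extra issue width cannot help.
    print(simulate_ipc(serial_chain(400), width=4))           # 1.0
    # Four independent chains: all four issue slots stay busy.
    print(simulate_ipc(interleaved_chains(400, 4), width=4))  # 4.0
```

In this model the single chain stays at 1 IPC no matter how wide the core is, while the independent chains scale with the width. Real hardware adds multi-cycle latencies, cache misses and branch mispredictions on top, which is why a small cold run can end up below 1.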