By: Etienne Lorrain (etienne_lorrain.delete@this.yahoo.fr), March 22, 2021 5:46 am
Room: Moderated Discussions
dmcq (dmcq.delete@this.fano.co.uk) on March 22, 2021 4:53 am wrote:
> Etienne Lorrain (etienne_lorrain.delete@this.yahoo.fr) on March 22, 2021 3:22 am wrote:
> > As others have pointed out, microthreads could be used; I would
> > propose a very simple form of them, without much modification.
> > I mean, typical software only has an IPC between 0.5 and 2, far lower than the theoretical
> > maximum. Even hyper-threading cores seem to quite often have the two threads waiting.
> > IMHO we could have an explicit, hardware supported, background microthread. The OoO CPU would
> > reserve around 20 in-flight instructions for it (out of its 100 in-flight instructions).
> > So if the main program flow is stopped (waiting for memory reads), the background microthread would be run.
> > I would imagine that "background microthread" to be very limited, not able to make system
> > calls or change rings - but it would be sufficient to do things like pre-zeroing the next
> > memory allocation (always running in the memory context of the main application).
> > It would be implemented by a single "background program counter" and a set
> > of registers (maybe the smallest set), so that a "rep stosb" can be done.
> > I think it would also be "optionally run", so the main application
> > would not require anything, just the memory
> > allocator would provide pre-zeroed blocks if there were some
> > available, else it would zero those blocks itself.
> > More complex uses of such background microthread could be found over time, like balancing binary trees.
>
> I'm not sure I see much future for microthreads. They might be able to extract a bit more performance
> from a CPU - but even POWER doesn't really extract all that much more with multiple independent tasks
> so would something that put in more complex dependencies do much for single tasks? Yes hardware for zeroing
> would be nice but I would consider it part of the normal code and just done in an OoO manner.
I agree: for memory zeroing, the interface could be something other than such a microthread (to start zeroing the memory which will be allocated next): something like a "delayed rep stosb", followed (in program execution order) by a real "rep stosb" which ensures that the whole block has finished clearing before that next allocation is answered.
I still see a problem when the memory allocator takes its memory from differently sized pools: that would create a lot of such microthreads.
I am not sure there are many dependency problems when zeroing memory which has not yet been allocated...
I do not see a good interface for "hardware for zeroing" to begin zeroing (at low priority) the next memory block to be allocated.
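To make the "delayed rep stosb" followed by a real "rep stosb" idea concrete, here is a minimal software-only sketch in C. It is not a proposal for the actual hardware interface: the names next_block, background_zero_step and take_zeroed_block are hypothetical, and the "background" work is simply a function the allocator may or may not get to call. It only illustrates the contract: the background part is optional and may stop anywhere, and the allocation path clears whatever is left before handing the block out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for the block that will be handed out next. */
typedef struct {
    unsigned char *base;   /* start of the block */
    size_t size;           /* total bytes in the block */
    size_t zeroed;         /* bytes [0, zeroed) are already cleared */
} next_block;

/* Software stand-in for the "delayed rep stosb": clear at most `budget`
   more bytes, then return. It may be called many times, or never. */
static void background_zero_step(next_block *b, size_t budget)
{
    assert(b->zeroed <= b->size);
    size_t left = b->size - b->zeroed;
    size_t n = left < budget ? left : budget;
    memset(b->base + b->zeroed, 0, n);
    b->zeroed += n;
}

/* Stand-in for the real "rep stosb" in program order: clear whatever the
   background step did not reach, then hand the block out. */
static void *take_zeroed_block(next_block *b)
{
    assert(b->zeroed <= b->size);
    memset(b->base + b->zeroed, 0, b->size - b->zeroed);
    b->zeroed = b->size;
    return b->base;
}
```

With one such descriptor per size class, the problem of differently sized pools shows up directly: each pool needs its own progress counter, which is the software analogue of needing one background microthread per pool.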