By: Andrey (andrey.semashev.delete@this.gmail.com), September 29, 2021 7:58 pm
Room: Moderated Discussions
dmcq (dmcq.delete@this.fano.co.uk) on September 29, 2021 3:52 pm wrote:
> rwessel (rwessel.delete@this.yahoo.com) on September 29, 2021 1:58 pm wrote:
> > dmcq (dmcq.delete@this.fano.co.uk) on September 29, 2021 1:44 pm wrote:
> > > rwessel (rwessel.delete@this.yahoo.com) on September 29, 2021 11:35 am wrote:
> > > > dmcq (dmcq.delete@this.fano.co.uk) on September 29, 2021 7:53 am wrote:
> > > > > rwessel (rwessel.delete@this.yahoo.com) on September 29, 2021 6:55 am wrote:
> > > > > > NoSpammer (no.delete@this.spam.com) on September 29, 2021 3:53 am wrote:
> > > > > > > dmcq (dmcq.delete@this.fano.co.uk) on September 28, 2021 2:21 pm wrote:
> > > > > > > > The bits could vary between implementations so letting designers optimise better. More conditions
> > > > > > > > could be saved. The only problem I can see is big-little systems and they could just zero the bits
> > > > > > > > if moving between different cores - with the current system how can we tell if the form optimised
> > > > > > > > by one is okay for the other? And yes it seems like an unnecessary waste of opcodes and code space.
> > > > > > >
> > > > > > > I think 3 instructions make very simple implementations possible for the low-end. For example:
> > > > > > > Initial instruction is movsb until aligned or end.
> > > > > > > Middle instruction is movs[your core's biggest R/W chunk]
> > > > > > > End instruction is movsb until end.
> > > > > > >
> > > > > > > Why have more state when the state can already be in the registers and PC?
> > > > > >
> > > > > >
> > > > > > Remember that in the general case, you can't get both operands aligned, so you can't avoid
> > > > > > dealing with at least some of that in the "middle" instruction. But the point is, how
> > > > > > hard could it be to detect the case where the initial or final instruction apply?
> > > > >
> > > > > > And thinking yet again about it, why bother even saving any
> > > > > > bits? At most it can get by with some bits that are
> > > > > > passed between the micro-ops. If an interrupt happens one can
> > > > > > just dump the bits and start anew from where it stopped.
> > > > > > I'm basically saying just have one op and if they want three
> > > > > > have the one op split into three. Then there's no
> > > > > > need to start wondering about describing what the state is like when restarting the second operation.
> > > >
> > > >
> > > > So update registers with intermediate state like x86 movs or S/370 mvcl? ;-)
> > >
> > > Well that does seem an obvious way to handle interrupts. But it could execute as if it was a number of
> > > operations with some extra state being passed between the stages, except the extra information can be
> > > discarded if an interrupt happens. So the first operation would set up a couple of microoperations. Then
> > > more would be generated dependent on each other till the count went to zero. An interrupt would just stop
> > > it after one of the operations had finished and updated the registers showing how much had been done.
> > > You want it to work okay in a longer pipeline so operations after the move can be done in parallel.
> >
> > Sure, so long as the CPU was executing the move, externally visible state doesn't need to
> > be updated. But so long as it's architected (like x86 movs), there's no problem moving between
> > cores that might maintain some sort of state internally while running a multi-word move.
>
> It's a bit of a pity though if the actual end state is guaranteed like for x86 or mvcl. It means one
> can't implement memmove because sometimes one would want to go forwards and other times backwards.
In x86, rep movs can go either way, depending on the direction flag.
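
To make that concrete, here is a rough sketch in portable C of how a memmove could pick the direction. It is only an illustration: rep_movsb_model and memmove_sketch are names I made up, and the "backward" argument merely stands in for the direction flag (DF = 1 after std, DF = 0 after cld). A real implementation would use the instruction itself and, as far as I know, would have to restore DF = 0 before returning, since the common x86 ABIs expect it clear.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of `rep movsb`: copies `count` bytes, stepping both
   pointers up (backward == 0, like DF = 0) or down (backward != 0, like
   DF = 1) after each byte. */
static void rep_movsb_model(unsigned char *dst, const unsigned char *src,
                            size_t count, int backward)
{
    ptrdiff_t step = backward ? -1 : 1;
    while (count--) {
        *dst = *src;
        if (count) {        /* do not step past the final byte */
            dst += step;
            src += step;
        }
    }
}

/* Sketch of memmove built on the model above: go forwards unless the
   destination starts inside the source region, in which case go
   backwards starting from the last byte. */
void *memmove_sketch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (n == 0 || d == s)
        return dst;

    /* Unsigned difference: it is >= n both when dst is below src (the
       subtraction wraps) and when dst is at least n bytes above src. */
    if ((uintptr_t)d - (uintptr_t)s >= n)
        rep_movsb_model(d, s, n, 0);                  /* forward, "cld" */
    else
        rep_movsb_model(d + n - 1, s + n - 1, n, 1);  /* backward, "std" */

    return dst;
}
```

So the guaranteed end state is not an obstacle in itself: the caller decides the direction up front and sets up the start pointers accordingly (first byte for a forward copy, last byte for a backward one).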