By: rwessel (rwessel.delete@this.yahoo.com), September 29, 2021 10:30 am
Room: Moderated Discussions
Doug S (foo.delete@this.bar.bar) on September 29, 2021 10:10 am wrote:
> rwessel (rwessel.delete@this.yahoo.com) on September 29, 2021 6:55 am wrote:
> > NoSpammer (no.delete@this.spam.com) on September 29, 2021 3:53 am wrote:
> > > dmcq (dmcq.delete@this.fano.co.uk) on September 28, 2021 2:21 pm wrote:
> > > > The bits could vary between implementations so letting designers optimise better. More conditions
> > > > could be saved. The only problem I can see is big-little systems and they could just zero the bits
> > > > if moving between different cores - with the current system how can we tell if the form optimised
> > > > by one is okay for the other? And yes it seems like an unnecessary waste of opcodes and code space.
> > >
> > > I think 3 instructions make very simple implementations possible for the low-end. Per example:
> > > Initial instruction is movsb until aligned or end.
> > > Middle instruction is movs[your core's biggest R/W chunk]
> > > End instruction is movsb until end.
> > >
> > > Why have more state when the state can already be in the registers and PC?
> >
> >
> > Remember that in the general case, you can't get both operands aligned, so you can't avoid
> > dealing with at least some of that in the "middle" instruction. But the point is, how
> > hard could it be to detect the case where the initial or final instruction apply?
>
>
> Why would you need to deal with any alignment related issues in the middle instruction? If
> the area of memory you are operating upon is large enough you have a middle sequence where
> you can load and store on aligned boundaries at your largest chunk size (and there may even
> be some clever ways to massively speed this up with some sort of cache aliasing trickery)
>
> In small operations the middle instruction may be a no-op, and only the first and/or last instruction actually
> do something. I can't see any case where you would need to worry about alignment in the middle instruction
> - that's the whole point of splitting it up this way! Can you provide an example of where you think the
> first or third instruction would be unable to guarantee the middle instruction perfect alignment?
You can trivially align one of the two operands for the middle instruction, but not necessarily both.
Consider memcpy(123, 345, 100);
With, say, 8-byte chunks, you could have the first instruction move five bytes, which leaves you with memcpy(128, 350, 95). The first operand is now aligned, but the second is not. Or move seven bytes with the first instruction, which leaves you with memcpy(130, 352, 93), and an unaligned first operand.
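To make the arithmetic concrete, here is a minimal sketch of the head-count calculation. The function names and the choice of an 8-byte chunk are my own assumptions for illustration, not anything from a real ISA. It shows that a single head count can align both operands only when they start at the same offset within a chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes the "initial" instruction must move so that addr reaches the next
   multiple of chunk. chunk must be a power of two. Returns 0 if addr is
   already aligned. (Hypothetical helper, for illustration only.) */
size_t head_bytes(uintptr_t addr, size_t chunk)
{
    assert(chunk != 0 && (chunk & (chunk - 1)) == 0);
    /* Unsigned negation gives the distance up to the next multiple. */
    return (size_t)(-addr & (chunk - 1));
}

/* One head count aligns both operands only if they need the same number
   of head bytes, i.e. they share the same offset within a chunk. */
int can_coalign(uintptr_t dst, uintptr_t src, size_t chunk)
{
    return head_bytes(dst, chunk) == head_bytes(src, chunk);
}
```

For the example above, head_bytes(123, 8) is 5 and head_bytes(345, 8) is 7, so can_coalign(123, 345, 8) is false: whichever head count you pick, one operand stays unaligned.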
While knowing that one operand is aligned may well be of value to the middle instruction, it's going to have to deal with the possibility of the other operand being unaligned. Assuming you'd align the first operand, it would still need to do fetch/shift/merge on each word of the second operand.
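As a rough sketch of what that fetch/shift/merge looks like, assuming 8-byte words, little-endian byte numbering, and memory modelled as an array of words (all of which are my assumptions, as are the names), the middle step with an aligned destination and an arbitrarily aligned source might be written as:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { W = 8 };  /* assumed chunk size in bytes */

/* Copy nwords whole chunks to byte offset dst (a multiple of W) from byte
   offset src (any alignment), touching mem only with aligned word accesses.
   Model: byte k lives in bits 8*(k % W) and up of mem[k / W].
   The source and destination regions must not overlap. */
void middle_copy(uint64_t *mem, size_t dst, size_t src, size_t nwords)
{
    assert(dst % W == 0);
    size_t d = dst / W;
    size_t s = src / W;
    unsigned shift = (unsigned)(src % W) * 8;

    if (shift == 0) {               /* both operands aligned: plain word copy */
        for (size_t i = 0; i < nwords; i++)
            mem[d + i] = mem[s + i];
        return;
    }

    uint64_t lo = mem[s];           /* fetch the first source word */
    for (size_t i = 0; i < nwords; i++) {
        uint64_t hi = mem[s + i + 1];                       /* fetch */
        mem[d + i] = (lo >> shift) | (hi << (64 - shift));  /* shift, merge */
        lo = hi;                    /* reuse this word for the next chunk */
    }
}
```

In the memcpy(128, 350, 95) case the source offset within a word is 350 % 8 = 6, so every one of the 11 whole destination words is assembled from two adjacent source words, and the remaining 7 bytes are left for the end instruction.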