By: Anon (no.delete@this.spam.com), February 25, 2021 7:33 am
Room: Moderated Discussions
Andrey (andrey.semashev.delete@this.gmail.com) on February 25, 2021 5:54 am wrote:
> We had this debacle with glibc's memcpy, when at some point one of its optimizations broke the
> (invalid) assumption about its copying direction, which broke some software. Fortunately, glibc was able
> to keep the optimization and force the downstream software to be fixed. Imagine the situation like
> this with Intel and a memcpy instruction. I bet Intel would have to set the instruction behavior
> in stone, however inefficient and inconvenient it would be both for hardware and software. At best,
> they would introduce memcpy2 instead, as they did with BSR/BSF. So, if no one benefits from it,
> what is the point in leaving functional aspects of instruction behavior undefined?
Personally, I'd put the effort into detecting the hard case and improving the existing instruction.
But I understand Linus's point: the hard case (physical overlap, virtual non-overlap) is so weird that any software relying on specific behaviour in that case is likely buggy anyway.
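For anyone wondering what the hard case looks like from user space, here is a rough sketch, assuming a POSIX system where two MAP_SHARED mappings of the same file are backed by the same pages. The helper names (virt_ranges_overlap, aliasing_demo) are made up for illustration and are not taken from glibc or any Intel documentation. It maps one temporary file twice: a check on virtual addresses reports no overlap, yet a store through one mapping shows up through the other.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: overlap test on virtual addresses only,
 * the kind of check a user-space copy routine is able to perform. */
static int virt_ranges_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    return pa < pb ? (pb - pa) < n : (pa - pb) < n;
}

/* Hypothetical demo: returns 1 if two virtually disjoint mappings
 * alias the same memory, 0 if they do not, -1 if setup failed. */
static int aliasing_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;
    assert(len > 0);

    FILE *f = tmpfile();              /* anonymous temporary backing file */
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    if (ftruncate(fd, (off_t)len) != 0) {
        fclose(f);
        return -1;
    }

    /* Map the same file offset twice; the kernel picks two distinct
     * virtual ranges for the two mappings. */
    char *a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED) munmap(a, len);
        if (b != MAP_FAILED) munmap(b, len);
        fclose(f);
        return -1;
    }

    int disjoint = !virt_ranges_overlap(a, b, len);

    memset(a, 0, len);
    a[0] = 'x';                       /* store through the first mapping */
    int aliased = (b[0] == 'x');      /* observe it through the second */

    munmap(a, len);
    munmap(b, len);
    fclose(f);
    return disjoint && aliased;
}
```

Calling aliasing_demo() from main should return 1 on such a system. A memcpy(b + 1, a, len - 1) between the two mappings would pass any pointer-based overlap check while still reading bytes it has just written, which is why software depending on a particular result there is hard to defend.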