By: Andrey (andrey.semashev.delete@this.gmail.com), February 25, 2021 6:54 am
Room: Moderated Discussions
Anon (no.delete@this.spam.com) on February 25, 2021 5:04 am wrote:
> Andrey (andrey.semashev.delete@this.gmail.com) on February 25, 2021 3:06 am wrote:
> > The ISA must specify the instruction behavior in
> > every use case, even though in some cases that behavior might be pathological or a hardware exception.
>
> Not exactly. Even in x86 there are some cases, like BSF/BSR, that are undefined for a specific input (in this
> case: 0). What happens is that the implementation generates a result and programmers start relying on that
> result, so there is some merit in not leaving obvious and common cases undefined, as in the BSR/BSF case.
IMHO, BSR/BSF being unspecified for 0 input is a bug more than anything.
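To show what a fully specified version looks like from the software side, here is a minimal sketch in C. The function name ctz32_defined and the choice of returning 32 for a zero input are my assumptions for illustration, not something taken from the ISA manuals or any library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from any library): count trailing zero bits,
   with the zero-input case pinned down explicitly instead of left open. */
unsigned ctz32_defined(uint32_t x)
{
    if (x == 0)
        return 32; /* assumed convention: report the operand width */

    unsigned n = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        ++n;
    }
    assert(n < 32); /* a nonzero input always has a set bit in 0..31 */
    return n;
}
```

With the zero case written down, every caller gets the same answer on every implementation, and there is nothing left for software to accidentally depend on.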
> But in the memcpy case the hypothetical undefined behaviour is overlapping physical
> pages, which is hard to detect and handle even in hardware (overlapping virtual pages are
> easy to handle), but this case is so weird that nobody would care about or rely on the results.
> I mean, there is no point in doing memcpy in this case, unlike the BSR/BSF case.
Thing is, if it's that good, people would likely use this new memcpy instruction just as they use the memcpy function now (not least because the function would eventually use the instruction internally). At some point someone would use it in the overlapping case, knowingly or not, and then it's the same situation as with BSR/BSF: it has to produce some result, and that result can become one that software depends on.
We had this debacle with glibc's memcpy, when one of its optimizations broke the (invalid) assumption about its copying direction, which broke some software. Fortunately, glibc was able to keep the optimization and force the downstream software to be fixed. Imagine a situation like this with Intel and a memcpy instruction. I bet Intel would have to set the instruction behavior in stone, however inefficient and inconvenient that would be for both hardware and software. At best, they would introduce a memcpy2 instead, as they did with BSR/BSF. So, if no one benefits from it, what is the point of leaving functional aspects of instruction behavior undefined?
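To make the copy-direction point concrete, here is a small sketch in C. The helpers copy_forward, copy_backward and copy_overlap_safe are hypothetical names of mine, not glibc internals; they only show that the two directions give different results once the buffers overlap, which is exactly the kind of result software can end up depending on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical byte-wise copy, lowest address first. */
void copy_forward(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

/* Hypothetical byte-wise copy, highest address first. */
void copy_backward(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = n; i > 0; --i)
        dst[i - 1] = src[i - 1];
}

/* The specified alternative: memmove is defined for overlapping buffers. */
void copy_overlap_safe(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memmove(dst, src, n);
}
```

Starting from a buffer holding "abcdef" and copying 4 bytes from offset 0 to offset 2, the forward loop leaves "ababab" while the backward loop and memmove leave "ababcd". Code that happened to work with one direction silently breaks when the implementation switches to the other.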