By: Paul A. Clayton (paaronclayton.delete@this.gmail.com), May 18, 2013 5:41 am
Room: Moderated Discussions
aaron spink (aaronspink.delete@this.notearthlink.net) on May 17, 2013 8:55 pm wrote:
[snip]
> Eh? Those weren't the flaws of the early RISC ISAs. Most of the flaws of the early RISC ISAs was exposing fundamental
> micro-architectural details into the ISA. Examples included branch delay slots (MIPS, SPARC, PA-RISC) and load
> delay slots (MIPS). And personally I would probably add rotating register-files to the list.
As long as the delay slot instruction does not use the register being written by the load, dropping this feature would still preserve binary compatibility. I do not know whether the original MIPS documents urged such non-use, but my impression is that such use was found primarily, if not exclusively, in code testing that behavior (i.e., not in compiler-generated code). I think the delay behavior was even unreliable, in that a cache miss would generate an interlock (I think this was mentioned in a comp.arch post at some point).
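To make the compatibility argument concrete, here is a minimal toy sketch (not real MIPS encoding or timing; the `run` function, the tuple instruction format, and the register/memory values are all made up for illustration). It assumes every load hits the cache and no interrupt intervenes, and compares a machine with an exposed load delay slot against one that interlocks:

```python
def run(program, memory, delayed_load, init=None):
    """Run a toy program and return the final register file.

    Instructions (hypothetical format, for illustration only):
      ("lw",  rd, addr)    load memory[addr] into rd
      ("add", rd, rs, rt)  rd = rs + rt
    delayed_load=True models an exposed load delay slot: the instruction
    right after a load still reads the OLD value of the load's target.
    delayed_load=False models an interlocked (or forwarding) machine.
    """
    regs = [0] * 32
    for r, v in (init or {}).items():
        regs[r] = v
    pending = None  # (rd, value) of a load that has not landed yet
    for ins in program:
        landing, pending = pending, None
        if ins[0] == "lw":
            _, rd, addr = ins
            value, is_load = memory[addr], True
        elif ins[0] == "add":
            _, rd, rs, rt = ins
            # Operands are read before the pending load lands.
            value, is_load = regs[rs] + regs[rt], False
        else:
            raise ValueError("unknown op: %r" % (ins[0],))
        if landing is not None:
            regs[landing[0]] = landing[1]
        if is_load and delayed_load:
            pending = (rd, value)
        else:
            regs[rd] = value
        regs[0] = 0  # register 0 is hardwired to zero
    if pending is not None:
        regs[pending[0]] = pending[1]
        regs[0] = 0
    return regs


memory = {100: 7}
init = {1: 5, 3: 2}

# Delay slot does not touch r1: both machines agree.
safe = [("lw", 1, 100), ("add", 2, 3, 3), ("add", 4, 1, 1)]
# Delay slot reads r1: the two machines disagree.
unsafe = [("lw", 1, 100), ("add", 2, 1, 1)]

safe_same = run(safe, memory, True, init) == run(safe, memory, False, init)
unsafe_delayed_r2 = run(unsafe, memory, True, init)[2]
unsafe_interlocked_r2 = run(unsafe, memory, False, init)[2]
```

In this model `safe` produces identical register files either way, while `unsafe` leaves r2 holding twice the old r1 on the delayed machine and twice the loaded value on the interlocked one. Only code of the second kind would break if the delay slot were dropped.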
While this "software should not" method could also be applied to using the undefined most significant bits of addresses for additional storage (I seem to recall that the M68k did this: it told developers not to use those bits but did not generate exceptions if they were used, and reaped the consequences), I suspect that for delayed load in a 31-register ISA the incentive to use the load's target register as an operand source for the delay slot instruction would be very small. This is especially so if (as might have been the case) one had to assume a cache hit and no interrupts; that would make using the load target register in the delay slot an extremely funky optimization, worse than using ordinary load/store for an [interrupt] atomic RMW.
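The address-bits hazard can be sketched the same way. This is only an illustration under stated assumptions: an older implementation that decodes 24 address bits and silently ignores the rest, and a later one that decodes all 32. The helper names `tag_pointer` and `bus_address` are hypothetical.

```python
def tag_pointer(addr, tag):
    """Stash an 8-bit tag in the top byte of a 32-bit pointer value."""
    return ((tag & 0xFF) << 24) | (addr & 0x00FFFFFF)


def bus_address(pointer, addr_bits):
    """Address actually presented to memory when only addr_bits are decoded."""
    return pointer & ((1 << addr_bits) - 1)


p = tag_pointer(0x001234, 0x80)

old_addr = bus_address(p, 24)  # top byte ignored: tag is harmless
new_addr = bus_address(p, 32)  # top byte decoded: points somewhere else
```

With 24 decoded bits the tagged pointer still reaches 0x001234; with 32 it becomes 0x80001234. No exception is raised in either case, which is why software that ignored the "should not" kept working until the wider implementation arrived.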