By: anon (spam.delete.delete.delete@this.this.this.spam.com), December 21, 2018 1:48 am
Adrian (a.delete@this.acm.org) on December 20, 2018 8:51 pm wrote:
> Michael S (already5chosen.delete@this.yahoo.com) on December 20, 2018 3:24 pm wrote:
> >
> > I agree with everything you said except pre-indexed/post-indexed addressing modes.
> > Those, IMHO, are misfeatures, esp. for integer load instructions.
> I do not understand why you believe that these are misfeatures.
> The auto-indexed addressing modes introduced by IBM 801 and copied by ARM, PA-RISC, POWER and others
> are the only way of coding any kind of loop without any extra instructions for address computations.

In your experience, how many loops with a fixed increment/decrement do not have a counter that could be used for the addresses?
Also, I'm not sure it works for absolutely any kind of loop: if you're not working with a fixed stride, you're going to have a problem.
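
To make the fixed-stride point concrete, here is a rough C sketch (the function names are made up for illustration). In the first loop the address advances by a constant every iteration, so an auto-increment mode or the loop counter can supply it. In the second, the address of each data access depends on a value that was just loaded, so no fixed increment helps with that access.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed stride: the address of a[i] advances by sizeof(int32_t)
   on every iteration, so it can be derived from the counter i. */
int64_t sum_fixed_stride(const int32_t *a, size_t n)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* No fixed stride for the data access: the address of data[idx[i]]
   depends on the value loaded from idx[i]. Only the idx[] access
   itself has a constant stride. */
int64_t sum_gather(const int32_t *data, const uint32_t *idx, size_t n)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[idx[i]];
    return sum;
}
```

Pointer chasing through a linked list would be another case of the second kind.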

> The previous kinds of auto-indexed addressing modes allowed only a small set of increments
> or decrements, so they were suitable only for certain kinds of loops, not for any loop.

I think we'd be happy with addressing that covers most kinds of loops, since that would be vastly better than current RISC-V; being suitable for all loops shouldn't be a requirement.

> The only other addressing mode that can be chosen as an alternative for the IBM auto-indexed addressing,
> because it also allows the elimination of the extra instructions in many kinds of loops, including in the
> most frequent, but not in all loops, is the 3-component addressing mode (base, index & shift) introduced
> by VAX and also adopted by Intel 80386. However, the implementation of this addressing mode seems more difficult,
> because even many modern processors do not manage to perform it at maximum speed in all cases.

Base + index works if you fold the shift into the counter, i.e. step the counter by the element size. More work for the compiler, but basically standard these days.
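
Roughly what I mean, as a C sketch (the function name is hypothetical, and what a given compiler actually emits will vary): the loop variable is a byte offset that is already scaled by the element size, so each access is plain base + index with no shift in the addressing mode.

```c
#include <stddef.h>
#include <stdint.h>

/* Sums n 32-bit values using a pre-scaled byte offset as the only
   loop variable. The effective address is simply base + off. */
int64_t sum_scaled_counter(const int32_t *a, size_t n)
{
    const unsigned char *base = (const unsigned char *)a;
    size_t end = n * sizeof(int32_t);   /* loop bound in bytes */
    int64_t sum = 0;

    for (size_t off = 0; off < end; off += sizeof(int32_t)) {
        /* base + off points at a properly aligned int32_t element */
        sum += *(const int32_t *)(base + off);
    }
    return sum;
}
```

The cost is that the counter no longer holds the element number, so anything else that needs the plain index has to derive it or keep a second variable.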

> When neither IBM auto-indexed modes nor VAX 3-component addressing are available, there are
> many kinds of loops which cannot be coded with a minimal number of instructions because address
> computation instructions must be added besides the data handling instructions.

Isn't index + shift usually counted as one component, so that it only becomes a 3-component addressing mode with a displacement on top of it?
Either way, a base + index register mode, even without the shift, would help a lot.
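
A small C sketch of why (function names are made up, and real codegen depends on the compiler and target): when several arrays are walked in step, base + index lets one counter update serve every access, while with base + displacement only, each pointer needs its own add per iteration.

```c
#include <stddef.h>
#include <stdint.h>

/* Indexed form: with base + index loads and stores, the single
   update of i covers all three memory accesses. */
void add_indexed(int32_t *dst, const int32_t *a, const int32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

/* Pointer-bump form: roughly what a target with only
   base + displacement addressing is pushed toward. Three pointers
   are advanced separately on every iteration. */
void add_pointer_bump(int32_t *dst, const int32_t *a, const int32_t *b, size_t n)
{
    const int32_t *end = a + n;
    while (a != end)
        *dst++ = *a++ + *b++;
}
```

Both functions compute the same result; the difference is only in how much address arithmetic sits next to the loads and stores.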

> The RISC-V fans argue that the extra instructions do not matter, because a fast implementation will fuse
> the address computation instructions with the data handling instructions, achieving the same throughput.
> I do not agree, because I believe that it is stupid to code the address computation with an extra instruction
> word, when the same thing can be encoded with a couple of bits in an addressing mode field and the instruction
> decoder is also certainly simpler than the one that must fuse those instruction pairs.

Yeah, you're also throwing away fetch bandwidth.
RISC-V seems full of decisions based on "oh but that means the compiler would have to make a decision, let's just solve that in hardware" and then they do absolutely nothing about it in hardware.

I mean, having both pre- and post-increment/decrement is a bit much, and I can see why they wouldn't want instructions that write to two registers or have 3 or even 4 inputs (base, index, shift and displacement). But surely two simple 2-in, 1-out load instructions, one with base + index register and one with base + displacement/immediate, wouldn't have killed them. You really have to be a fanatical purist to believe that the AGU shouldn't be allowed to do any calculations.
