By: blaine (myname.delete@this.acm.org), August 3, 2022 12:03 pm
Room: Moderated Discussions
NoSpammer (no.delete@this.spam.com) on August 3, 2022 7:55 am wrote:
> Adrian (a.delete@this.acm.org) on August 3, 2022 2:19 am wrote:
> > Mark Roulo (nothanks.delete@this.xxx.com) on August 2, 2022 6:48 pm wrote:
> > >Stack or memory based architecture (you really want registers)
> >
> >
> > Nobody has proven yet whether encoding the operands implicitly with operations defined
> > on a stack is better or worse than encoding the operands using register numbers.
>
> So mainstream architectures that used to use stack-like structures like intel x87
> and archs using register windows being obsoleted are not a practical proof?
>
> From the information theoretical perspective it's actually likely that stack-based encoding would have code
> density advantage simply due to the fact that code is more likely to reuse most recent results, BUT... it's
> likely that you can devise alternative encodings using shorthand instructions to refer to the last few results,
> too. It's just that nobody has researched that because it became clear a long time ago that symmetric access
> to the registers is in fact preferred if you want to make anything like a decent compiler.
>
> Also, if you actually look at a dependency graph of many relevant algos, it looks more like
> a lattice than a tree that would suggest it naturally maps to stack based computation.
>
> > The problem with the early stack-based instruction encodings, like Intel
> > 8087, was that they used ONLY ONE STACK, not that they were stack based.
>
> How would N+1 stacks improve performance for general purpose
> code and compilers? How about spillovers and function calls?
>
> > In modern OoO CPUs, the register numbers used in the instruction encodings are no longer
> > addresses, but only means to specify the data dependencies between instructions. The same
> > data dependencies can be specified implicitly when the operand belong to a stack.
>
> I would correct you that the register numbers are used to specify EXPLICIT dependencies between instructions,
> which is very important if you want to parse dependencies quickly without additional indirection.
>
> > Allowing each instruction to specify 1 of 4 operand stacks might be enough
> > to remove most false dependencies, while specifying 1 of 8 operand stacks
> > is pretty much certain to prevent the apparition of false dependencies.
>
> Sure if 8 registers only works kinda ok for most code, 8 stacks will do. Maybe you actually
> keep the top of the stack in the register and spill over the stack, oh look, x86.
>
> > There is no doubt that a stack-based encoding with multiple operand stacks would increase a lot the code
> > density. What is not clear is how much more complex would be the implementation of such a CPU in comparison
> > with using register numbers for operands, i.e. if the improvement in code density would be worthwhile.
>
> I think it's quite clear that is not a compiler friendly target for current compiler tech.
> It's also quite clear that you will be burning power resolving a layer of indirection.
> It's also quite clear that you will be wasting time shuffling stack order at many block edges.
>
> > The most complex issues with a stack-base encoding would be to determine which are the best strategies
> > for avoiding operand stack overflow and for what to save or not save across function invocations.
>
> Here you are again in the territory where static code analysis cannot give you the
> answer for dynamic code, virtual functions, trees of logic before/between FP etc.
>
Also be ware that you understand the workload your are designing for. An embedded controller WL may have different needs than a general purpose computer.
> Adrian (a.delete@this.acm.org) on August 3, 2022 2:19 am wrote:
> > Mark Roulo (nothanks.delete@this.xxx.com) on August 2, 2022 6:48 pm wrote:
> > >
> >
> >
> > Nobody has proven yet whether encoding the operands implicitly with operations defined
> > on a stack is better or worse than encoding the operands using register numbers.
>
> So mainstream architectures that used to use stack-like structures like intel x87
> and archs using register windows being obsoleted are not a practical proof?
>
> From the information theoretical perspective it's actually likely that stack-based encoding would have code
> density advantage simply due to the fact that code is more likely to reuse most recent results, BUT... it's
> likely that you can devise alternative encodings using shorthand instructions to refer to the last few results,
> too. It's just that nobody has researched that because it became clear a long time ago that symmetric access
> to the registers is in fact preferred if you want to make anything like a decent compiler.
>
> Also, if you actually look at a dependency graph of many relevant algos, it looks more like
> a lattice than a tree that would suggest it naturally maps to stack based computation.
>
> > The problem with the early stack-based instruction encodings, like Intel
> > 8087, was that they used ONLY ONE STACK, not that they were stack based.
>
> How would N+1 stacks improve performance for general purpose
> code and compilers? How about spillovers and function calls?
>
> > In modern OoO CPUs, the register numbers used in the instruction encodings are no longer
> > addresses, but only means to specify the data dependencies between instructions. The same
> > data dependencies can be specified implicitly when the operand belong to a stack.
>
> I would correct you that the register numbers are used to specify EXPLICIT dependencies between instructions,
> which is very important if you want to parse dependencies quickly without additional indirection.
>
> > Allowing each instruction to specify 1 of 4 operand stacks might be enough
> > to remove most false dependencies, while specifying 1 of 8 operand stacks
> > is pretty much certain to prevent the apparition of false dependencies.
>
> Sure if 8 registers only works kinda ok for most code, 8 stacks will do. Maybe you actually
> keep the top of the stack in the register and spill over the stack, oh look, x86.
>
> > There is no doubt that a stack-based encoding with multiple operand stacks would increase a lot the code
> > density. What is not clear is how much more complex would be the implementation of such a CPU in comparison
> > with using register numbers for operands, i.e. if the improvement in code density would be worthwhile.
>
> I think it's quite clear that is not a compiler friendly target for current compiler tech.
> It's also quite clear that you will be burning power resolving a layer of indirection.
> It's also quite clear that you will be wasting time shuffling stack order at many block edges.
>
> > The most complex issues with a stack-base encoding would be to determine which are the best strategies
> > for avoiding operand stack overflow and for what to save or not save across function invocations.
>
> Here you are again in the territory where static code analysis cannot give you the
> answer for dynamic code, virtual functions, trees of logic before/between FP etc.
>
Also be ware that you understand the workload your are designing for. An embedded controller WL may have different needs than a general purpose computer.