By: NoSpammer (no.delete@this.spam.com), August 4, 2022 3:18 am
Room: Moderated Discussions
Anon (no.delete@this.spam.com) on August 3, 2022 9:25 am wrote:
> NoSpammer (no.delete@this.spam.com) on August 3, 2022 7:55 am wrote:
> > So mainstream architectures that used to use stack-like structures like intel x87
> > and archs using register windows being obsoleted are not a practical proof?
>
> But there is another way to explain that: stacks were very unfriendly to in-order superscalar CPUs,
> so anyone caring about performance stopped using stacks, and we continue not using them by inertia.
>
> When CPUs implemented stack engines the difference shrank
> a lot, but nobody wanted to go back to using stacks anymore.
I guess, apart from stack housekeeping, the primary reason is that many instructions have high or variable latency, which means you cannot reuse the top of the stack immediately. So you either need to shuffle, or you continue with the dependent instruction anyway and let OoO handle it. But even then, it is better to place instructions close to their actual order of execution, so that resources are released earlier. Even if you use a stack, the optimal number of addressable sources will be close to what the register requirement studies have found, so you will have to address about 16-32 values somehow; if not directly, then you will be shuffling like x87. So I see no advantage for stacks.
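To make the latency argument concrete, here is a toy model, not a measurement of any real CPU. It assumes a single-issue, in-order machine, a multiply latency of 3 cycles, and an x87-like form where "mul" multiplies the top of stack by a memory operand and "swap" exchanges the top two slots. The function name, the latency and the swap cost are all made up for illustration. Setting swap_cost=0 stands in for directly addressable registers (or a free exchange).

```python
def finish_cycle(program, mul_latency=3, swap_cost=1):
    """Cycle at which the last result is ready on a single-issue, in-order
    toy model. The stack holds only the ready-cycle of each slot (top is last)."""
    stack = [0, 0]   # two accumulators, both available at cycle 0
    cycle = 0        # next free issue slot
    for op in program:
        if op == "swap":                     # exchange the top two slots
            stack[-1], stack[-2] = stack[-2], stack[-1]
            cycle += swap_cost
        elif op == "mul":                    # top = top * memory operand
            start = max(cycle, stack[-1])    # stall until the top is ready
            stack[-1] = start + mul_latency
            cycle = start + 1
        else:
            raise ValueError(op)
    return max(stack)

n = 4  # multiplies per dependency chain (assumed)
serial = ["mul"] * n + ["swap"] + ["mul"] * n   # finish one chain, then the other
interleaved = ["mul", "swap"] * (2 * n)         # alternate between the two chains
interleaved.pop()                               # no swap needed after the last mul

serial_cycles = finish_cycle(serial)                      # 23
shuffled_cycles = finish_cycle(interleaved)               # 17
direct_cycles = finish_cycle(interleaved, swap_cost=0)    # 13
print(serial_cycles, shuffled_cycles, direct_cycles)
```

In this model the shuffles recover part of the stall time but still occupy issue slots, which is the x87 situation I mean above.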
> > I think it's quite clear that is not a compiler friendly target for current compiler tech.
>
> Wrong. Have you ever written a compiler? Targeting a stack is much easier than a register-based
> architecture; Java and .NET intermediate languages are stack based exactly because of that.
Actually I have. Targeting a stack seems like a natural thing to do if you are thinking in terms of one-expression-at-a-time, LR(1), one-pass compilation, kind of like the simplistic compilers of the 80s. Once you start thinking about optimizations, optimal temp variable allocation, variable lifetimes, cross-procedural optimization... the stack kind of gets in the way.
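For reference, this is roughly why a stack looks attractive for the simple one-pass case. A minimal sketch, assuming expressions are given as nested tuples and using made-up opcode names; it is not taken from any real compiler.

```python
def emit_stack(expr, out=None):
    """Post-order walk: emit both operands first, then the operator.
    No register allocation or lifetime analysis is needed."""
    if out is None:
        out = []
    if isinstance(expr, str):        # a leaf is a variable name
        out.append(("push", expr))
    else:
        op, lhs, rhs = expr          # (opcode, left subtree, right subtree)
        emit_stack(lhs, out)
        emit_stack(rhs, out)
        out.append((op,))
    return out

# (a + b) * (c - d)
stack_code = emit_stack(("mul", ("add", "a", "b"), ("sub", "c", "d")))
for ins in stack_code:
    print(*ins)
```

The whole code generator is a tree walk, but nothing in it knows about reuse of temporaries across statements, which is where the optimizations listed above come in.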
Intermediate languages are defined as stack based because they don't want to impose a register count, because that is relatively easy for an interpreter, and because eventually the code can be further optimized and compiled to whatever target anyway.
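The "compiled to whatever" step can be sketched as follows: simulate the stack at translation time and hand out virtual register names, which gives three-address code for a later register allocator. This is a minimal illustration with hypothetical opcode names and an unbounded supply of temporaries (t0, t1, ...); real JITs do considerably more.

```python
def stack_to_registers(code):
    """Translate stack code into three-address code by tracking, at
    translation time, which name currently sits in each stack slot."""
    stack, out, next_reg = [], [], 0
    for ins in code:
        if ins[0] == "push":
            stack.append(ins[1])             # a variable is already a named operand
        else:
            rhs, lhs = stack.pop(), stack.pop()
            dst = f"t{next_reg}"             # fresh virtual register
            next_reg += 1
            out.append((ins[0], dst, lhs, rhs))
            stack.append(dst)
    return out

# Stack code for (a + b) * (c - d)
code = [("push", "a"), ("push", "b"), ("add",),
        ("push", "c"), ("push", "d"), ("sub",), ("mul",)]
three_address = stack_to_registers(code)
for ins in three_address:
    print(*ins)
```

Here the stack is only a serialization format; it disappears before any machine code is produced.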
> > It's also quite clear that you will be burning power resolving a layer of indirection.
>
> And burning less power by loading less code, it is not clear to me which would be lower.
Compare that to stack housekeeping and needing to resolve some of the semantics of the instructions before you are even able to evaluate dependencies properly. My bet is that you would need to predecode just to get on par at the renaming stage.
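A small sketch of what I mean by resolving semantics first. With a register encoding the source and destination fields are explicit in each instruction; with stack code, finding the producer of an operand requires knowing the stack effect (pops and pushes) of every earlier instruction. The opcode table below is an assumption made up for the example.

```python
# (values popped, values pushed) per opcode; hypothetical instruction set
STACK_EFFECT = {"push": (0, 1), "add": (2, 1), "sub": (2, 1), "mul": (2, 1)}

def producers(code):
    """For each instruction, list the indices of the earlier instructions
    that produced its operands. This requires replaying stack depth."""
    stack = []   # index of the instruction that produced each live slot
    deps = []
    for i, ins in enumerate(code):
        pops, pushes = STACK_EFFECT[ins[0]]
        srcs = [stack.pop() for _ in range(pops)][::-1]   # restore operand order
        deps.append(srcs)
        stack.extend([i] * pushes)
    return deps

# Stack code for (a + b) * (c - d)
code = [("push", "a"), ("push", "b"), ("add",),
        ("push", "c"), ("push", "d"), ("sub",), ("mul",)]
deps = producers(code)
for i, (ins, srcs) in enumerate(zip(code, deps)):
    print(i, ins[0], "depends on", srcs)
```

The loop is inherently sequential in the stack depth, so hardware would have to do this bookkeeping (or a predecoded equivalent of it) before renaming can start.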
> > It's also quite clear that you will be wasting time shuffling stack order at many block edges.
>
> As this would be done in parallel to actual execution there is no wasted time.
>
> I don't know if it is possible to design a high performance stack based
> architecture, I think it is, but I also know nobody has tried.
It might be possible, but I certainly would not recommend it to someone designing a new general-purpose architecture.