By: Megol (golem960.delete@this.gmail.com), September 4, 2013 7:04 am
Room: Moderated Discussions
Brett (ggtgp.delete@this.yahoo.com) on August 30, 2013 6:42 pm wrote:
> Linus Torvalds (torvalds.delete@this.linux-foundation.org) on August 30, 2013 4:13 pm wrote:
> > Max (max.delete@this.a.com) on August 30, 2013 1:12 pm wrote:
> > >
> > > You allocate pointers to address registers and everything else to data registers.
> >
> > That's BS - pointers *are* data, and often a large portion of arithmetic is for
> > addresses. That's kind of the problem with the whole stupid Ax/Dx separation.
> >
> > Because quite often address calculations are done just for the *address*, and the pointer in never
> > actually dereferenced by the code in question. Yes, it's an "address", but at the same time, it's
> > purely about integer arithmetic, and it may well be better to use a data register for it.
> >
> > See? There's a fundamental core ambiguity about addresses. They really may be just plain arithmetic
> > data, and most compilers tend to actually treat them that way except for the final dereference.
> >
> > So separating addresses vs integer data is a bad idea, in a way that few other separations are not.
>
> The primary reason for separate instruction and data registers is better code density
> for the number of registers used. The m68k was used in new computers where 64k of RAM
> was not enough and so you needed 32 bit addressing, but you often had barely more than
> that. minor hassles for occasional copies between A/D sets are irrelevant whining.
>
> High end (all?) m68k designs used a unified register set internally, and like
> the x86 register copies can be taken care of by the front end and are "free".
Are they really unified? By high-end I guess you mean the 68060, which is a Pentium-class processor.
Register copies can of course be "free", but doing it for the 68k is much harder than on x86: MOVE instructions set flags. To remove copies the hardware either has to look ahead and verify that the flags will be overwritten before any use _or_ do a partial removal, letting the MOVE be executed on a dedicated, very simple unit. A combination can of course be used too.
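To make the look-ahead option concrete, here is a rough sketch in Python. It is only an illustration under simplifying assumptions, not how any real 68k front end works: the condition codes are treated as one unit (real 68k instructions update N/Z/V/C/X individually), the look-ahead window is a plain list, and the names `Insn`, `move_flags_dead` and `classify_move` are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insn:
    """Hypothetical decoded instruction; flags are modelled as a single unit."""
    name: str
    reads_flags: bool = False
    writes_flags: bool = False


def move_flags_dead(window, i):
    """True if the flags set by the MOVE at window[i] are overwritten
    before any later instruction in the look-ahead window reads them."""
    for insn in window[i + 1:]:
        if insn.reads_flags:
            return False  # e.g. a Bcc consumes the MOVE's condition codes
        if insn.writes_flags:
            return True   # flags redefined first, so the MOVE's flags are dead
    return False          # window exhausted: stay conservative


def classify_move(window, i):
    """Full elimination if the flags are dead, otherwise the partial
    removal where a simple unit still produces the flags."""
    return "eliminate" if move_flags_dead(window, i) else "flags-only unit"


move = Insn("move.l a0,d1", writes_flags=True)
add = Insn("add.l d2,d1", writes_flags=True)
beq = Insn("beq target", reads_flags=True)

print(classify_move([move, add, beq], 0))
print(classify_move([move, beq], 0))
```

In the first sequence the ADD redefines the flags before the branch looks at them, so the copy could be dropped entirely. In the second the branch depends on the MOVE's flags, so only the partial removal applies.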
> For a truly high end system the split register files are a bonus, the Alpha CPU tried splitting
> odd and even registers to separate register file ALU blocks for higher performance. Something
> we will see again, probably in the Mill CPU since they want to issue 33 operations a cycle.
Eh... The main idea behind the Mill is not to have registers. And it hasn't got any.
> Anyone designing a fresh from scratch high end CPU today would be thinking of adding multiple bits
> to instructions to signify register bank use. Take ARM like instructions and reuse 2 bits of the
> useless predication part of the instruction to give you 4 register banks of 16 registers.
Predication isn't useless.
> The compiler might only to be able to use addressing from one or two banks,
> which would make hardware simpler. But false fear of m68k concerns might force
> the hardware guys to support loads and stores from all 4 register sets.
It would be simpler in an FPGA design, yes. I doubt it would help much in a modern ASIC design.
> To make things really regular you can add vector Int/FPU units to each of the four register sets. Then if you
> are IBM add an option to split the register files and run separate code on each of the four register files.
>
> A BIG-little approach that gets sales into more markets. Similar to IBM allowing you to down
> configure the 8 POWER hardware threads all the way down to 1 to run bloated code that does
> not like sharing resources, verses lots of weak threads that block and use few resources.
>
> > So the whole A/D separation was just nasty. I don't think it had anything to do with the
> > failure of the architecture, but I can easily see it being annoying to a compiler writer,
> > where you have two classes of registers that are so similar yet so different.
> >
> > Linus
>
> The ultimate failure after decades of success for the m68k series in embedded systems had to do
> with all the address modes and legacy code that used them. Which made cheap high end embedded m68k
> CPU's too hard and expensive to implement for the performance gained. Not A/D separation.
>
The failure isn't that strange. The market for 68k processors wasn't large enough to justify designing a 68080, and it wasn't large enough to continue the 88k RISC series either. Motorola then simplified the 68k ISA into the ColdFire ISA for embedded designs (keeping near source compatibility and, for a subset, even binary compatibility) and began working on PowerPC, which at that time seemed to have a bright future and looked likely to replace the industry standard architecture (= IBM PC compatibles).