By: Paul A. Clayton (paaronclayton.delete@this.gmail.com), August 31, 2013 10:33 am
Room: Moderated Discussions
anon (no.delete@this.thanks.com) on August 30, 2013 7:29 pm wrote:
[snip]
> Partitioned register files are not a bonus - They're something you do when you can't meet your issue width
> (# of RF ports) and frequency targets any other way. Just ask anybody who's ever developed for TI C6X.
If one ignores trade-offs in hardware implementation (including code density), the software people would probably prefer a memory-memory ISA (no need for register allocation decisions). Some have suggested that not having separate FP registers provides a noticeable advantage (e.g., the ABI for some variable argument function calls and balancing the number of registers [when the register count limit is physical rather than from encoding constraints]).
Splitting registers between addresses and data does allow a modest increase in the number of registers available for a given instruction size.
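The arithmetic behind that claim can be sketched as follows. This is only an illustration under assumed numbers: the 3-bit register field and the helper name `addressable_registers` are hypothetical and not taken from any particular ISA. The point is that when the opcode or operand position implies which register file a field refers to, that implied choice acts as an extra name bit that costs no encoding space.

```python
def addressable_registers(field_bits, classes=1):
    """Count the registers an instruction can name.

    field_bits: width of each register field in the encoding (assumed).
    classes:    number of register files selected implicitly by the
                opcode or operand position (1 = unified file).
    """
    return classes * (1 << field_bits)


# Unified file with 3-bit fields: 8 general-purpose registers.
unified = addressable_registers(3)

# Address/data split with the same 3-bit fields: 8 data + 8 address.
split = addressable_registers(3, classes=2)

# Field width a unified file would need to name 16 registers.
bits_for_unified_16 = (16 - 1).bit_length()

# Encoding bits a unified 16-register file would cost in a
# three-operand instruction compared with the split design.
extra_bits_three_operand = 3 * (bits_for_unified_16 - 3)
```

Under these assumptions the split names twice as many registers for the same instruction size, at the cost of restricting which register a given operand slot can use.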
Providing information about intended use in the register name can also facilitate some microarchitectural optimizations. Even such hint-like information as the standard ABI could provide optimization opportunities.
An address/data split would also seem to be relatively easy to abandon later if there is sufficient opcode space. Adding something like a REX prefix that adds a bit per register name would effectively eliminate the distinction.
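A minimal sketch of how such a prefix could erase the distinction, assuming a hypothetical encoding with 3-bit register fields where data registers occupy unified numbers 0-7 and address registers 8-15. The names `DATA`, `ADDR`, and `unified_index` are made up for illustration; this is not how x86-64 REX is defined (REX widens fields of an already unified file), only the same general idea of supplying one extra name bit per register field.

```python
DATA, ADDR = 0, 1  # hypothetical class codes, doubling as the high name bit


def unified_index(field, implied_class, ext_bit=None):
    """Map a 3-bit register field to a 4-bit unified register number.

    Without a prefix (ext_bit is None), the register class implied by
    the operand position supplies the high bit. With a prefix, the
    explicit ext_bit overrides the implied class, so any operand slot
    can name any of the 16 registers.
    """
    if not 0 <= field < 8:
        raise ValueError("register field must fit in 3 bits")
    if ext_bit not in (None, 0, 1):
        raise ValueError("extension bit must be 0 or 1")
    high = implied_class if ext_bit is None else ext_bit
    return (high << 3) | field


# Legacy encoding: an address-operand slot with field 5 names register 13.
legacy_addr = unified_index(5, ADDR)

# Prefixed encoding: the same slot can now name a former data register.
prefixed = unified_index(5, ADDR, ext_bit=0)
```

In this sketch unprefixed code keeps its original meaning, which is what would make abandoning the split compatible with existing binaries.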
Whether such partitioning makes sense would depend on the implementation technology, the targeted workloads (including performance and efficiency goals), the cost/benefit trade-offs in compiler development (including delay and schedule risk), and the cost/benefit trade-offs in hardware development. (Evaluating such trade-offs even over a five-year period would seem to be challenging, but ISAs are often expected to last decades--e.g., Alpha had "a 15- to 25-year design horizon". The evaluation is made more difficult when the initial implementations determine whether later implementations are even attempted. Such is one justification for early RISC's use of delayed branches.)
If the choice were between 8 GPRs and 8 Data Registers plus 8 Address Registers, it is likely that partitioning would provide better performance even with mediocre register allocation. Trading off a little extra compiler effort (to provide mediocre register allocation) for a little extra performance might be a reasonable choice, especially if substantial compiler effort could provide a substantial benefit.
Throwing complexity over the fence (e.g., from hardware to software) is a common problem in system design, but that danger justifies neither ignoring optimization opportunities (which is distinct from recognizing that an optimization is not currently worth pursuing) nor failing to recognize that the trade-offs differ between software and hardware. (Deciding what information software can and should communicate to hardware is not easy. The benefits of stable and minimal/clean interfaces conflict with the changing value and ease of discovery of information.)
(I like register partitioning, but I am more ignorant of compiler issues than I am of hardware issues.)