By: Wilco (Wilco.dijkstra.delete@this.ntlworld.com), December 22, 2018 10:24 am
Room: Moderated Discussions
Travis Downs (travis.downs.delete@this.gmail.com) on December 22, 2018 7:47 am wrote:
> Wilco (Wilco.dijkstra.delete@this.ntlworld.com) on December 22, 2018 7:38 am wrote:
>
> > I don't see why you think it's any different: load-op and load-op-store have 2 destinations
> > too. It doesn't matter whether you crack before or after rename: they need a renamed
> > register for the result of the load and one for the final result.
>
> The result of the load doesn't need to participate in renaming in the same way because it is not architecturally
> visible and doesn't persist beyond the instruction. It just needs to get from the load to the single op
> that will ever consume it, which is a different problem and likely off the critical path.
You could special case it since no other instruction needs to read the renamed register.
But other than that it's like any other destination that needs to be renamed. As is the flags register.
> If this were correct though, then your original hypothesis of "saves rename bandwidth" holds in even fewer cases:
> the renamer still has to deal with the full complexity of the instruction whether it is cracked before or after
> since it has to do just as much work (rename just as many destination registers) in either case.
All instructions must be renamed either way. Early cracking means more micro ops to rename, so throughput is simply lower. Now you could widen rename, but that has a high cost. A cheaper approach is to add support for a 2nd destination register.
> Note that I agree with you that rename bandwidth is saved in the case of micro-fused/uncracked-until-after
> rename ops like load-op on x86 or call/store on ARM. It's only for 2-output
> ops like auto-increment this doesn't apply.
A modern Arm core can execute 2 loads with auto-increment every cycle using just 2 rename slots. The other slots are still free for other instructions, so yes it saves rename bandwidth.
Wilco
> Wilco (Wilco.dijkstra.delete@this.ntlworld.com) on December 22, 2018 7:38 am wrote:
>
> > I don't see why you think it's any different: load-op and load-op-store have 2 destinations
> > too. It doesn't matter whether you crack before or after rename: they need a renamed
> > register for the result of the load and one for the final result.
>
> The result of the load doesn't need to participate in renaming in the same way because it is not architecturally
> visible and doesn't persist beyond the instruction. It just needs to get from the load to the single op
> that will ever consume it, which is a different problem and likely off the critical path.
You could special case it since no other instruction needs to read the renamed register.
But other than that it's like any other destination that needs to be renamed. As is the flags register.
> If this were correct though, then your original hypothesis of "saves rename bandwidth" holds in even fewer cases:
> the renamer still has to deal with the full complexity of the instruction whether it is cracked before or after
> since it has to do just as much work (rename just as many destination registers) in either case.
All instructions must be renamed either way. Early cracking means more micro ops to rename, so throughput is simply lower. Now you could widen rename, but that has a high cost. A cheaper approach is to add support for a 2nd destination register.
> Note that I agree with you that rename bandwidth is saved in the case of micro-fused/uncracked-until-after
> rename ops like load-op on x86 or call/store on ARM. It's only for 2-output
> ops like auto-increment this doesn't apply.
A modern Arm core can execute 2 loads with auto-increment every cycle using just 2 rename slots. The other slots are still free for other instructions, so yes it saves rename bandwidth.
Wilco
Topic | Posted By | Date |
---|---|---|
RISC-V Summit Proceedings | Gabriele Svelto | 2018/12/19 08:36 AM |
RISC-V gut feelings | Konrad Schwarz | 2018/12/20 04:30 AM |
RISC-V inferior to ARMv8 | Heikki Kultala | 2018/12/20 07:36 AM |
RISC-V inferior to ARMv8 | Wilco | 2018/12/20 01:31 PM |
RISC-V inferior to ARMv8 | Travis Downs | 2018/12/20 02:18 PM |
RISC-V inferior to ARMv8 | Wilco | 2018/12/21 03:43 AM |
RISC-V inferior to ARMv8 | Ronald Maas | 2018/12/21 09:35 AM |
RISC-V inferior to ARMv8 | juanrga | 2018/12/21 10:28 AM |
RISC-V inferior to ARMv8 | Maynard Handley | 2018/12/21 02:39 PM |
RISC-V inferior to ARMv8 | anon | 2018/12/21 03:38 PM |
RISC-V inferior to ARMv8 | juanrga | 2018/12/23 04:39 AM |
With similar logic nor do frequency (NT) | Megol | 2018/12/23 09:45 AM |
RISC-V inferior to ARMv8 | juanrga | 2018/12/23 04:44 AM |
RISC-V inferior to ARMv8 | Wilco | 2018/12/23 06:21 AM |
RISC-V inferior to ARMv8 | Michael S | 2018/12/20 03:24 PM |
RISC-V inferior to ARMv8 | anon | 2018/12/20 04:22 PM |
RISC-V inferior to ARMv8 | Travis Downs | 2018/12/21 06:16 PM |
RISC-V inferior to ARMv8 | anon | 2018/12/22 03:53 AM |
Execution runtimes and Spectre | Foo_ | 2018/12/22 06:02 AM |
RISC-V inferior to ARMv8 | Adrian | 2018/12/20 08:51 PM |
RISC-V inferior to ARMv8 | Doug S | 2018/12/20 11:10 PM |
RISC-V inferior to ARMv8 | Adrian | 2018/12/20 11:38 PM |
RISC-V inferior to ARMv8 | Michael S | 2018/12/21 02:31 AM |
RISC-V inferior to ARMv8 | Adrian | 2018/12/21 03:23 AM |
RISC-V inferior to ARMv8 | random person | 2018/12/21 02:04 AM |
RISC-V inferior to ARMv8 | dmcq | 2018/12/21 04:27 AM |
RISC-V inferior to ARMv8 | juanrga | 2018/12/21 10:36 AM |
RISC-V inferior to ARMv8 | Doug S | 2018/12/21 12:02 PM |
RISC-V inferior to ARMv8 | juanrga | 2018/12/21 10:23 AM |
RISC-V inferior to ARMv8 | Adrian | 2018/12/20 11:21 PM |
RISC-V inferior to ARMv8 | anon | 2018/12/21 01:48 AM |
RISC-V inferior to ARMv8 | Adrian | 2018/12/21 03:44 AM |
RISC-V inferior to ARMv8 | anon | 2018/12/21 05:24 AM |
RISC-V inferior to ARMv8 | Adrian | 2018/12/21 04:09 AM |
RISC-V inferior to ARMv8 | Wilco | 2018/12/21 04:28 AM |
RISC-V inferior to ARMv8 | Michael S | 2018/12/21 02:27 AM |
RISC-V inferior to ARMv8 | Gabriele Svelto | 2018/12/21 01:09 PM |
RISC-V inferior to ARMv8 | Maynard Handley | 2018/12/21 02:58 PM |
RISC-V inferior to ARMv8 | Wilco | 2018/12/21 03:43 PM |
RISC-V inferior to ARMv8 | Travis Downs | 2018/12/21 05:45 PM |
RISC-V inferior to ARMv8 | Wilco | 2018/12/22 04:37 AM |
RISC-V inferior to ARMv8 | Travis Downs | 2018/12/22 06:54 AM |
RISC-V inferior to ARMv8 | Wilco | 2018/12/22 10:32 AM |
Cracking is not free | Gabriele Svelto | 2018/12/22 02:09 AM |
Cracking is not free | Wilco | 2018/12/22 04:32 AM |
Cracking is not free | Travis Downs | 2018/12/22 07:07 AM |
Cracking is not free | Wilco | 2018/12/22 07:38 AM |
Cracking is not free | Travis Downs | 2018/12/22 07:47 AM |
Cracking is not free | Wilco | 2018/12/22 10:24 AM |
Cracking is not free | Travis Downs | 2018/12/25 03:41 PM |
Cracking is not free | anon.1 | 2018/12/25 08:14 PM |
multi-instruction decode and rename | Paul A. Clayton | 2018/12/22 06:45 PM |
Cracking is not free | Gabriele Svelto | 2018/12/22 12:30 PM |
Cracking is not free | Wilco | 2018/12/23 06:48 AM |
Cracking is not free | Michael S | 2018/12/23 08:09 AM |
Cracking is not free | Gabriele Svelto | 2018/12/26 02:53 PM |
RISC-V inferior to ARMv8 | rwessel | 2018/12/21 01:13 PM |
RISC-V inferior to ARMv8 | Seni | 2018/12/21 02:33 PM |
RISC-V inferior to ARMv8 | Wilco | 2018/12/21 03:33 PM |
RISC-V inferior to ARMv8 | Travis Downs | 2018/12/21 05:49 PM |
RISC-V inferior to ARMv8 | Wilco | 2018/12/22 04:58 AM |
RISC-V inferior to ARMv8 | Travis Downs | 2018/12/22 07:03 AM |
RISC-V inferior to ARMv8 | Wilco | 2018/12/22 07:22 AM |
RISC-V inferior to ARMv8 | Travis Downs | 2018/12/22 07:40 AM |
RISC-V inferior to ARMv8 | dmcq | 2018/12/21 03:57 AM |
RISC-V inferior to ARMv8 | Konrad Schwarz | 2018/12/21 02:25 AM |
RISC-V inferior to ARMv8 | j | 2018/12/21 10:46 AM |
RISC-V inferior to ARMv8 | Travis Downs | 2018/12/21 06:08 PM |
RISC-V inferior to ARMv8 | dmcq | 2018/12/22 07:45 AM |
RISC-V inferior to ARMv8 | Travis Downs | 2018/12/22 07:50 AM |
RISC-V inferior to ARMv8 | Michael S | 2018/12/22 08:15 AM |
RISC-V inferior to ARMv8 | dmcq | 2018/12/22 10:41 AM |
RISC-V inferior to ARMv8 | AnonQ | 2018/12/22 08:13 AM |
RISC-V gut feelings | dmcq | 2018/12/20 07:41 AM |
RISC-V initial take | Konrad Schwarz | 2018/12/21 02:17 AM |
RISC-V initial take | dmcq | 2018/12/21 03:23 AM |
RISC-V gut feelings | Montaray Jack | 2018/12/22 02:56 PM |
RISC-V gut feelings | dmcq | 2018/12/23 04:38 AM |
RISC-V Summit Proceedings | juanrga | 2018/12/21 10:47 AM |
RISC-V Summit Proceedings | dmcq | 2018/12/22 06:21 AM |
RISC-V Summit Proceedings | Montaray Jack | 2018/12/22 02:03 PM |
RISC-V Summit Proceedings | dmcq | 2018/12/23 04:39 AM |
RISC-V Summit Proceedings | anon2 | 2018/12/21 10:57 AM |
RISC-V Summit Proceedings | Michael S | 2018/12/22 08:36 AM |
RISC-V Summit Proceedings | Anon | 2018/12/22 05:51 PM |
Not Stanford MIPS but commercial MIPS | Paul A. Clayton | 2018/12/23 03:05 AM |
Not Stanford MIPS but commercial MIPS | Michael S | 2018/12/23 03:49 AM |
Not Stanford MIPS but commercial MIPS | dmcq | 2018/12/23 04:52 AM |