By: Maynard Handley (name99.delete@this.name99.org), January 16, 2019 5:01 pm

Room: Moderated Discussions

My legions of fans (hah!) will know that one of my constant themes for future IPC boosts is more aggressive instruction fusion. With that in mind, the following check-in to LLVM is rather interesting, since it captures what I think is the largest set of easy patterns that has been missing from these discussions:

https://reviews.llvm.org/D56572

The author is on team Samsung, which suggests that the feature is either in M3 or scheduled for M4, and that in turn (I expect) suggests it's likely already in Apple's cores. (Apple have long maintained their own private LLVM branch, which stays in sync with the public branch but allows them to make these sorts of changes without the rest of us being able to learn anything :-( )

The immediate payoff from this sort of fusion is that you avoid allocating a register for the intermediate result, which is already nice; but the even bigger win comes if you exploit the fact that ALU logic is so shallow (compared to the harder steps in your pipeline) to double-pump the ALU and do both fused ops in a single cycle. It will be interesting to see if Samsung reveal anything around that at Hot Chips. (And, as usual, my guess is that if Samsung reveal they're doing it, that means it's both feasible and that Apple are likely already doing it.)
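
To make the "no intermediate register" point concrete, here is a toy sketch in Python of how a front end might pair up adjacent ops. Everything in it is an assumption for illustration: the opcode sets, the `Insn` type, and the rule that the second op must overwrite the first op's destination are all hypothetical, and none of it is taken from the LLVM patch or from any real core.

```python
from dataclasses import dataclass

# Hypothetical opcode classes, chosen only for illustration.
ARITH = {"add", "sub"}
LOGIC = {"and", "orr", "eor"}


@dataclass(frozen=True)
class Insn:
    op: str
    dst: str
    srcs: tuple


def can_fuse(first: Insn, second: Insn) -> bool:
    """Assumed rule: one arithmetic op and one logic op (either order),
    the second reads the first's result and overwrites the same register,
    so the intermediate value never needs a register of its own."""
    mixed = (first.op in ARITH and second.op in LOGIC) or (
        first.op in LOGIC and second.op in ARITH
    )
    return mixed and first.dst in second.srcs and second.dst == first.dst


def fuse_pairs(stream):
    """Greedy left-to-right pairing of adjacent instructions.
    Returns a list of tuples: length 2 for a fused pair, length 1 otherwise."""
    groups = []
    i = 0
    while i < len(stream):
        if i + 1 < len(stream) and can_fuse(stream[i], stream[i + 1]):
            groups.append((stream[i], stream[i + 1]))
            i += 2
        else:
            groups.append((stream[i],))
            i += 1
    return groups


if __name__ == "__main__":
    stream = [
        Insn("add", "x0", ("x1", "x2")),
        Insn("and", "x0", ("x0", "x3")),  # reads and overwrites x0: fusible
        Insn("sub", "x4", ("x5", "x6")),
        Insn("eor", "x7", ("x4", "x8")),  # x4 stays live: not fusible here
    ]
    groups = fuse_pairs(stream)
    fused = sum(1 for g in groups if len(g) == 2)
    print(f"{len(stream)} instructions -> {len(groups)} issue slots, {fused} fused")
```

The sketch only models the pairing decision. Whether a fused pair actually completes in one cycle depends on the double-pumped ALU speculated about above, which this code does not attempt to model.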


Topic | Posted By | Date |
---|---|---|
arithmetic/logic op fusion | Maynard Handley | 2019/01/16 05:01 PM |
arithmetic/logic op fusion | Foo_ | 2019/01/17 08:03 AM |
arithmetic/logic op fusion | Ricardo B | 2019/01/17 09:00 AM |
You're right, I had misread (NT) | Foo_ | 2019/01/17 09:43 AM |
arithmetic/logic op fusion | dmcq | 2019/01/17 09:51 AM |
arithmetic/logic op fusion | Maynard Handley | 2019/01/17 11:24 AM |
arithmetic/logic op fusion | dmcq | 2019/01/17 11:58 AM |
arithmetic/logic op fusion | Maynard Handley | 2019/01/17 12:37 PM |
arithmetic/logic op fusion | dmcq | 2019/01/17 06:00 PM |
arithmetic/logic op fusion | Maynard Handley | 2019/01/17 08:42 PM |
arithmetic/logic op fusion | anon | 2019/01/17 01:53 PM |
arithmetic/logic op fusion | Maynard Handley | 2019/01/17 03:16 PM |
arithmetic/logic op fusion | Wilco | 2019/01/17 04:08 PM |
arithmetic/logic op fusion | j | 2019/01/18 02:59 AM |
arithmetic/logic op fusion | anon | 2019/01/18 08:34 AM |
arithmetic/logic op fusion | dmcq | 2019/01/18 08:55 AM |
arithmetic/logic op fusion | Anon | 2019/01/18 10:25 AM |
arithmetic/logic op fusion | Maynard Handley | 2019/01/18 11:29 AM |
arithmetic/logic op fusion | dmcq | 2019/01/18 12:42 PM |
arithmetic/logic op fusion | Maynard Handley | 2019/01/18 01:44 PM |
arithmetic/logic op fusion | dmcq | 2019/01/18 03:08 PM |
arithmetic/logic op fusion | Paul A. Clayton | 2019/01/18 07:15 PM |
arithmetic/logic op fusion | Maynard Handley | 2019/01/19 11:08 AM |
arithmetic/logic op fusion | dmcq | 2019/01/19 12:14 PM |
arithmetic/logic op fusion | Maynard Handley | 2019/01/19 12:53 PM |
Zero register as single use temporary | Paul A. Clayton | 2019/01/19 06:25 PM |
Zero register as single use temporary | Maynard Handley | 2019/01/19 06:56 PM |
AArch64 (and MY6600, I think) hide SP behind the zero register (NT) | Paul A. Clayton | 2019/01/20 03:33 PM |
arithmetic/logic op fusion | j | 2019/01/18 01:00 PM |
arithmetic/logic op fusion | Maynard Handley | 2019/01/18 01:47 PM |
arithmetic/logic op fusion | Wilco | 2019/01/18 02:04 PM |