By: Maynard Handley (name99.delete@this.name99.org), July 5, 2015 6:01 pm
Room: Moderated Discussions
Some time ago we were wondering what macro-op fusion exists in Cyclone.
An interesting recent LLVM checkin answers the question to some extent.
http://reviews.llvm.org/D10746
Compare followed by branch is fused (no surprise).
Also, in "Cyclone B0" (I'm assuming that means one of the two clusters; it could conceivably mean Typhoon, but you'd figure that would be "Cyclone B1" or "Cyclone B2" if it were anything), an arithmetic instruction followed by a CBZ or CBNZ is fused, so essentially the full arithmetic, test, and jump happen in one go.
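To make the two fusion cases above concrete, here is a toy sketch (mine, not code from the LLVM patch) that scans a list of mnemonics and flags adjacent pairs matching "compare followed by branch" or "arithmetic followed by CBZ/CBNZ". The mnemonic sets and the function name are assumptions for illustration, and the sketch ignores whether the branch actually tests the register or flags the first instruction produced.

```python
# Hypothetical mnemonic sets, chosen for illustration only.
# They are NOT copied from the LLVM patch or from any Apple documentation.
FLAG_SETTING = {"cmp", "cmn", "tst", "adds", "subs", "ands"}
ARITHMETIC = {"add", "sub", "and", "orr", "eor"}
COMPARE_BRANCH = {"cbz", "cbnz"}


def fusion_candidates(mnemonics):
    """Return (index, kind) for each adjacent pair that looks fusible.

    Only mnemonics are inspected; register dependencies are ignored,
    so this over-approximates what real hardware could fuse.
    """
    pairs = []
    i = 0
    while i < len(mnemonics) - 1:
        first = mnemonics[i].lower()
        second = mnemonics[i + 1].lower()
        if first in FLAG_SETTING and second.startswith("b."):
            pairs.append((i, "compare+branch"))
            i += 2  # both instructions are consumed by the fused pair
        elif first in ARITHMETIC and second in COMPARE_BRANCH:
            pairs.append((i, "arith+cbz"))
            i += 2
        else:
            i += 1
    return pairs


if __name__ == "__main__":
    stream = ["cmp", "b.eq", "add", "sub", "cbnz", "ldr"]
    for index, kind in fusion_candidates(stream):
        print(index, kind, stream[index], stream[index + 1])
```

A scheduler that knows about such pairs would want to keep the two instructions adjacent, which is presumably what the checkin is for.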
Interestingly, I did not see any OTHER fusion possibilities in the code. In particular, the possibility IBM selected (fusing instructions to create a large immediate) is not used. That might, of course, reflect not having enough time to add it to the design, but it may also reflect something about ARM's constant generation, and so a less frequent need to generate immediates through successive instructions.
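As a rough illustration of the constant-generation point: AArch64 can build a 64-bit constant 16 bits at a time with MOVZ followed by MOVK instructions, so a naive sequence needs one instruction per non-zero 16-bit chunk. The sketch below counts that. The function name is my own, and it deliberately ignores MOVN, logical (bitmask) immediates, and literal-pool loads, all of which can do better, so treat the result as an upper bound and not as a claim about what compilers emit or what Cyclone does.

```python
def movz_movk_count(value):
    """Instructions in a naive MOVZ + MOVK sequence for a 64-bit constant.

    Assumption: one MOVZ for the first non-zero 16-bit chunk and one MOVK
    for each further non-zero chunk. Zero itself still needs one MOVZ.
    MOVN and bitmask immediates are ignored, so this is an upper bound.
    """
    value &= 0xFFFFFFFFFFFFFFFF  # treat the input as an unsigned 64-bit value
    chunks = [(value >> shift) & 0xFFFF for shift in (0, 16, 32, 48)]
    nonzero = sum(1 for chunk in chunks if chunk != 0)
    return max(nonzero, 1)


if __name__ == "__main__":
    for constant in (0x0, 0x1234, 0x12345678, 0x1234000000005678):
        print(hex(constant), movz_movk_count(constant))
```

Small constants and constants with mostly zero chunks need only one or two instructions under this scheme, which is one way the need for multi-instruction immediates could be less pressing.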