By: none (none.delete@this.none.com), January 12, 2021 3:53 am
Room: Moderated Discussions
Adrian (a.delete@this.acm.org) on January 12, 2021 3:37 am wrote:
[...]
> That means that in other programs, which do not use bignums, but which contain integer multiplications,
> Apple M1 is able to do twice more multiplications per cycle than Intel or AMD or ARM Cortex-X1.
>
> So this feature, among many others, adds to the ability of M1 to have a higher IPC.
As long as you don't need a 128-bit result, yes. I wonder what workload beyond bignum
requires 2 integer multiplications per 8 instructions.
IMHO bignum alone is enough to justify having these 2 multipliers, but that's because I'm into
computational number theory. OTOH, for bignum, as you said before, a single fused multiply
producing both lo and hi halves would have been enough for most uses.
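
To make the lo+hi point concrete, here is a minimal C sketch of what a bignum inner loop needs from the multiplier. It assumes GCC or Clang on a 64-bit target (for `unsigned __int128`); the names `mul_wide` and `mul_add_limb` are made up for illustration. On AArch64 the two halves of the product normally come from two separate instructions (MUL and UMULH), while on x86-64 a single MUL yields both.

```c
#include <stddef.h>
#include <stdint.h>

/* Low and high 64-bit halves of a full 64x64 -> 128-bit product. */
typedef struct { uint64_t lo, hi; } u128_parts;

/* Full-width multiply; both halves are needed for bignum arithmetic. */
static u128_parts mul_wide(uint64_t a, uint64_t b) {
    unsigned __int128 p = (unsigned __int128)a * b;
    u128_parts r = { (uint64_t)p, (uint64_t)(p >> 64) };
    return r;
}

/* Schoolbook inner loop: acc[0..n) += a[0..n) * b, returns the final carry.
   a[i]*b + acc[i] + carry is at most 2^128 - 1, so it cannot overflow. */
static uint64_t mul_add_limb(uint64_t *acc, const uint64_t *a, size_t n, uint64_t b) {
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned __int128 t = (unsigned __int128)a[i] * b + acc[i] + carry;
        acc[i] = (uint64_t)t;          /* low half stays in this limb */
        carry  = (uint64_t)(t >> 64);  /* high half feeds the next limb */
    }
    return carry;
}
```

Every limb step consumes both halves, which is why a single fused lo+hi multiply would already cover the bignum case.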