By: Anon (no.delete@this.spam.com), August 11, 2022 5:20 pm
Room: Moderated Discussions
--- (---.delete@this.redheron.com) on August 11, 2022 5:43 pm wrote:
> M1 integer load latency is 4 cycles, pointer chasing latency is 3 cycles.
4 cycles at 3GHz is high latency in my book.
> There's
> basically zero scope to shave this if you want both a TLB and a write queue.
Already done.
> Apple has implemented a large number of techniques to reduce instruction latency, from aggressive
> fusion to zero cycle moves and immediates to a variety of zero cycle loads.
Good, but everybody does instruction fusion and move elimination these days. Apple does not implement aggressive forms of latency reduction like 0.5-cycle ALU instructions.
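To put the cycle counts above in wall-clock terms, here is a quick back-of-the-envelope sketch. The 3 GHz clock is just the round figure used in this post, not a measured frequency, and `cycles_to_ns` is a hypothetical helper name for illustration.

```python
def cycles_to_ns(cycles: float, freq_ghz: float) -> float:
    """Convert a latency in clock cycles to nanoseconds at a given clock in GHz."""
    # One cycle at f GHz lasts 1/f ns, so latency in ns = cycles / f.
    return cycles / freq_ghz


# Assumed clock: the round 3 GHz figure from the post.
FREQ_GHZ = 3.0

for label, cycles in [
    ("integer load", 4),
    ("pointer chase", 3),
    ("0.5-cycle ALU op", 0.5),
]:
    print(f"{label}: {cycles} cycles = {cycles_to_ns(cycles, FREQ_GHZ):.2f} ns")
```

At that assumed clock, a 4-cycle load is about 1.33 ns, a 3-cycle pointer chase is 1.00 ns, and a 0.5-cycle ALU operation would be about 0.17 ns.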