By: Poindexter (cherullo.delete@this.gmail.com), October 31, 2015 2:47 pm
Room: Moderated Discussions
> I am sure the reason for 4ALU+2AGU and 128bit FP pipes is not because it is the best possible
> configuration. The real reason? We only can speculate at this time. Maybe a cache bottleneck did
> make adding a third AGU useless, maybe the fourth ALU is here for symmetry reasons, maybe...
I find it funny that you like to tout pipe counts, but you never discuss other architectural features that have a direct impact on this discussion:
- MOV elimination
- Store-to-load forwarding
- Memory reordering and memory disambiguation
- Instruction fusing
We have absolutely no idea how Zen will fare in this regard. And it's not just whether Zen implements those things; how they are implemented matters just as much. You also like to tout other server architectures' pipeline ratios, but you:
- Never showed any connection between Haswell's IPC gain over Ivy Bridge and its third AGU.
- Didn't realize that Jaguar can schedule HALF as many loads as Bulldozer can, and still enjoys comparable IPC (as you have already stated in other forums).
- Never discussed whether Zen implements a dedicated memory scheduler, even more aggressive than Jaguar's. It could schedule non-aliasing loads ahead of older, ready-to-issue stores, reducing the need for a third AGU.
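To make that last point concrete, here is a toy sketch of the aliasing check such a scheduler relies on. It is not a model of Zen, Jaguar, or any real core: the `MemOp` type, the `loads_that_can_bypass` helper, and the assumption that every address in the window is already known are all mine, for illustration only. Real hardware usually has to predict aliasing before the store addresses are computed.

```python
from dataclasses import dataclass


@dataclass
class MemOp:
    """Hypothetical memory operation sitting in a scheduler window."""
    kind: str      # "load" or "store"
    addr: int      # byte address, assumed already computed
    size: int = 8  # access width in bytes


def overlaps(a: MemOp, b: MemOp) -> bool:
    """True if the two accesses touch at least one common byte."""
    return a.addr < b.addr + b.size and b.addr < a.addr + a.size


def loads_that_can_bypass(window: list[MemOp]) -> list[int]:
    """Indices of loads that alias no older store in program order."""
    hoistable = []
    for i, op in enumerate(window):
        if op.kind != "load":
            continue
        older_stores = (w for w in window[:i] if w.kind == "store")
        if not any(overlaps(op, s) for s in older_stores):
            hoistable.append(i)
    return hoistable


# Program order, oldest first.
window = [
    MemOp("store", 0x100),
    MemOp("load", 0x200),   # touches no older store
    MemOp("load", 0x104),   # overlaps the store to 0x100..0x107
    MemOp("store", 0x200),
    MemOp("load", 0x300),   # touches neither older store
]
print(loads_that_can_bypass(window))  # [1, 4]
```

The loads at positions 1 and 4 can issue without waiting for the older stores, while the load at position 2 has to wait (or take its data through store-to-load forwarding). The more loads that get out of the way early like this, the less the AGU count alone tells you.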
Regarding the FPU, you never mention that Zen's FPU, unlike Haswell's, doesn't share ports with the integer ALUs. You never mention that Zen's FPU has more ports and units than Haswell's. You only seem to care about maximum throughput (in the e-penis sense), which, frankly, is not that interesting.
Really, there are so many other important factors in this whole discussion that sealing Zen's fate based on Haswell 4+3 > Zen 4+2 is not realistic; it's just simplistic and boring.