By: anon (anon.delete@this.ymous.com), August 19, 2016 7:29 am
Room: Moderated Discussions
> http://www.realworldtech.com/haswell-cpu/3/ says that Haswell's OOO scheduling window is around 300 operations,
> based on a ROB of 192 µops, but Haswell being able to fuse µops before they enter the OOO engine.
>
> The exact window depends on your instruction mix, but hundreds of instructions
> is the scale you're looking at on a modern desktop or server processor.
>
> I'm not entirely clear on what code constructs to avoid with OOO, beyond the obvious of keeping
> your dependency chains under control - OOO can't help you if, within the OOO window, it can't find
> enough µops to execute. OTOH, compilers are now pretty good at helping you out here, so...
Haswell's scheduler holds only 60 uops, so at any given moment the core picks uops to execute from a pool of 60, not 200-300. (I'm not clear whether micro-fused uops still occupy a single scheduler entry, but I think macro-fused ops do.)
Skylake's scheduler is around 90 uops.
That's still a large number, but the gap between the scheduler size and the ROB-based window is a big difference imho.
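
To make the "keep your dependency chains under control" point concrete, here is a minimal sketch in C. The function names and the choice of four accumulators are mine, purely for illustration. In the first loop every add waits on the previous one, so however many uops are sitting in the scheduler, only one add from that chain is ready at a time. The second loop splits the work into four independent chains the scheduler can overlap.

```c
#include <stddef.h>

/* One accumulator: each add depends on the result of the previous add,
 * so the loop forms a single long dependency chain. */
double sum_single_chain(const double *a, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four accumulators (an arbitrary choice for this sketch): the adds into
 * s0..s3 do not depend on each other, giving four independent chains. */
double sum_four_chains(const double *a, size_t n) {
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    /* Leftover elements when n is not a multiple of 4. */
    for (; i < n; i++)
        s0 += a[i];
    return (s0 + s1) + (s2 + s3);
}
```

Note that the two versions add in a different order, so for general floating-point input the results can differ in the last bits. That is also why a compiler normally won't do this transformation for you on doubles unless you relax the FP rules. Whether the second version is actually faster depends on the core and the compiler, so measure it.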