By: juanrga (nospam.delete@this.juanrga.com), October 5, 2015 9:46 am
Room: Moderated Discussions
Wouter Tinus (wouter.tinus.delete@this.gmail.com) on October 4, 2015 4:18 pm wrote:
> juanrga (nospam.delete@this.juanrga.com) on October 4, 2015 6:30 am wrote:
> > Are you kidding?
>
> No I'm not. Either your reasoning is inconsistent or you're missing some of the facts.
> You started by claiming: "Skylake is 8-wide (unfused uops) like Haswell," yet the former
> can in fact retire twice as many µops per cycle as the latter [1]. So regardless of whether
> you want to count unfused or fused µops, Skylake is wider than Haswell.
>
> If you want to multiply both numbers by two to get unfused µops [2], then you should
> rightly call Skylake a 16-wide core (2 threads x 4 µops x 2). If you don't want to
> do that, then perhaps you should reconsider your definition of wideness :)
>
> [1] http://www.anandtech.com/show/9582/intel-skylake-mobile-desktop-launch-architecture-analysis/5
> [2] I'm not sure if that is fair or accurate, but not judging here
>
You said that Skylake is:
> > > - 5 wide decode
> > > - 6 wide allocation/decoder queue
> > > - 6 wide ROB
> > > - 8 wide issue
> > > - 8 wide retire (4/thread)
Using retire width as the metric, Skylake is then 8-wide. Haswell/Broadwell can also retire up to 8 uops per cycle, so both are 8-wide.
There is no way that Skylake can issue and retire 16 ops per cycle, and AnandTech doesn't claim otherwise.