By: Maynard Handley (name99.delete@this.name99.org), October 1, 2015 8:57 am
Room: Moderated Discussions
Alberto (git.delete@this.git.it) on October 1, 2015 1:13 am wrote:
> Maynard Handley (name99.delete@this.name99.org) on September 30, 2015 4:30 pm wrote:
> > Wouter Tinus (wouter.tinus.delete@this.gmail.com) on September 30, 2015 3:14 pm wrote:
> > > It seems easy to argue that Skylake is a 5-wide or even 6-wide machine.
> > >
> > > - 5 wide decode
> > > - 6 wide allocation/decoder queue
> > > - 6 wide ROB
> > > - 8 wide issue
> > > - 8 wide retire (4/thread)
> > >
> > > Though Haswell already added two extra issue ports, this is the first real increase in width
> > > since the introduction of Merom back in 2006. Yet they didn't even bother to mention it at IDF :(
> >
> > I agree it's weird, but it doesn't seem to have bought them very much in performance
> > (so maybe that's why they kept it quiet, to avoid unrealistic expectations?)
> >
> > http://www.anandtech.com/show/9483/intel-skylake-review-6700k-6600k-ddr4-ddr3-ipc-6th-generation/9
> >
> > Is it possible that they switched to something like two
> > 3-wide execution clusters, and they're losing whatever
> > they should have gained in cluster communication? But clustering seems a very un-Intel direction...
> >
> > Another possibility is what I suggested when Skylake first came out: that for Skylake Intel deliberately
> > made choices that are sub-optimal for IPC, but allow higher frequency to be sustained for longer. So it's
> > somewhat unfair, say, to compare 3GHz Broadwell with 3GHz Skylake, the real comparison ought to be something
> > like "amount of work done per second at equal power for the same sort of level of chip".
> > Those numbers are all over the place:
> > http://www.anandtech.com/show/9483/intel-skylake-review-6700k-6600k-ddr4-ddr3-ipc-6th-generation/17
> > with 91W Skylake against 88W Broadwell sometimes behind (WinRAR,
> > Sunspider, WebXPRT) sometimes 25% ahead (Octane).
> >
> > The spread in results seems to tell us that
> > - there has been a substantial change in the micro-architecture BUT
> > - that change seems rather "fragile", in that it may be parameterized
> > to maximize a weighted basket of benchmarks,
> > but the changes are no longer unequivocally a good idea
> > for all (or even for 95%) of workloads; we're getting
> > close to the territory of "improves 60% of workloads by 3% and harms the other 40% by 2%".
>
> Come on :). Every new arch. has upsides and downsides, differences of +3% or -2% are pretty standard.
> Even the very crap GeekB....!!! shows a solid 8% boost in IPC on Ars Technica over Haswell. I pretty
> much believe in the +10% claimed by Intel in a session of SPEC 2006 without libquantum obviously.
> Don't try to find nonexistent defects in a cpu, this is not respectful of the cpu team.
Well, if there is one person who knows all about according respect to CPU teams, it is Alberto: "About your loved A9 SOC, IMO is mediocre (it's only a stockpiling of cache banks)"...
The comparison of interest was to Broadwell, not to Haswell, not least to try to strip out the effects of process.
> Kudos to Intel for having achieved this without raising the L2 size to 1MB or 3MB, allowing easy(!)
> and efficient multicore SKUs for server with a smaller footprint and power consumption.
>
>
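For what it's worth, the "60% of workloads by 3%, other 40% by 2%" scenario quoted above is easy to put rough numbers on. Here is a minimal sketch; the function name, the basket, and the assumption that every workload counts equally are mine and purely illustrative, not measured data from any review.

```python
import math

def net_speedup(groups):
    """groups: (share_of_workloads, speedup_ratio) pairs; shares must sum to 1.

    Returns the share-weighted arithmetic and geometric mean speedup.
    """
    total = sum(share for share, _ in groups)
    if not math.isclose(total, 1.0):
        raise ValueError("workload shares must sum to 1")
    arithmetic = sum(share * ratio for share, ratio in groups)
    geometric = math.exp(sum(share * math.log(ratio) for share, ratio in groups))
    return arithmetic, geometric

# Hypothetical basket: 60% of workloads gain 3%, 40% lose 2%.
basket = [(0.60, 1.03), (0.40, 0.98)]
arith, geo = net_speedup(basket)
print(f"arithmetic mean: {(arith - 1) * 100:+.2f}%")
print(f"geometric mean:  {(geo - 1) * 100:+.2f}%")
```

Either way the net gain across such a basket comes out at roughly 1%, which is why a handful of individual benchmarks can land on either side of zero while the average still looks like a modest improvement.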