By: none (none.delete@this.none.com), February 4, 2013 4:28 pm
Room: Moderated Discussions
Michael S (already5chosen.delete@this.yahoo.com) on February 4, 2013 1:39 pm wrote:
> Patrick Chase (patrickjchase.delete@this.gmail.com) on February 4, 2013 11:55 am wrote:
> > Patrick Chase (patrickjchase.delete@this.gmail.com) on February 4, 2013 11:47 am wrote:
> > > Michael S (already5chosen.delete@this.yahoo.com) on February 4, 2013 2:27 am wrote:
> > > > Patrick Chase (patrickjchase.delete@this.gmail.com) on February 3, 2013 2:27 pm wrote:
> > > > >
> > > > > With that said, you gave me plenty of evidence yourself. R10K was >50% bigger for a 10% advantage
> > > > > in integer and a 50% advantage in FP.
> > > >
> > > > 50% in FP is for Spec95, I assume.
> > > > In Spec2k the difference was a lot bigger - 2.4x.
> > > > I wonder why Spec2k and Spec95 produce such different pictures?
> > >
> > > Two words: Cache footprint.
> > >
> > > SpecFP95 had a notoriously small working set. SpecFP2k was better in that respect. The R14K you cite had
> > > an 8 MB external last level cache, vs. 512 KB for the PIII. That big cache gave R14K quite a significant
> > > benefit in SpecFP2k and similar technical workloads, which is precisely why SGI put it there :-).
> >
> > There is also an issue of external DRAM bandwidth. The PIII-500's chipsets used a single 64-bit SDR SDRAM
> > channel if I recall correctly. Peak STREAM bandwidth was on the order of a couple hundred MiB/sec.
> >
>
> Slightly more:
> Intel_440BX_600, ncpus=1 - 342.2/340.2/412.0/409.2
> http://www.cs.virginia.edu/stream/stream_mail/1999/0035.html
>
> > The Origin used 128-bit DDR per node (it's a NUMA), so it would have had ~4X the bandwidth
> > to memory on even a single node. Peak STREAM bandwidth was close to 1 GiB/sec.
>
> Unfortunately, I can't find single-CPU STREAM result for Origin 3200.
> The previous Origin generation is not very good in single or dual CPU mode:
> SGI_Origin2000-300, ncpus=1 - 336.0/334.0/387.0/388.0
> SGI_Origin2000-300, ncpus=2 - 383.0/373.0/414.0/422.0
> SGI_Origin2000-300, ncpus=4 - 759.0/754.0/852.0/854.0
>
> The smallest Origin3k on official site is a quad:
> SGI_Origin3800-400, ncpus=4 - 1400.6/1403.1/1551.5/1574.3
>
> The 4-cpu score is about twice that of the Origin2000-300, but I am not sure
> that we can conclude that the same ratio applies to a single CPU.
See page 9 of http://www.sgi.co.jp/origin/ODP/documents/products/performance/o3k600/3000_600perfrep3.pdf
The single-CPU (1 core) STREAM result there is 733 / 673 / 766 / 773.
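
To put that next to the single-CPU numbers quoted above, here is a rough sketch of the per-kernel ratios. The four figures per system are assumed to be in the usual STREAM order (Copy, Scale, Add, Triad); the dictionary keys and the ratios() helper are just illustrative names, and the values are copied from this thread as posted.

```python
# STREAM figures quoted in this thread, as posted.
# Assumption: each tuple is in the usual STREAM order (Copy, Scale, Add, Triad).
KERNELS = ("Copy", "Scale", "Add", "Triad")

results = {
    "Intel_440BX_600, ncpus=1": (342.2, 340.2, 412.0, 409.2),
    "SGI_Origin2000-300, ncpus=1": (336.0, 334.0, 387.0, 388.0),
    "Origin 3000 report p.9, 1 CPU": (733.0, 673.0, 766.0, 773.0),
}


def ratios(numerator, denominator):
    """Per-kernel ratio of two STREAM result tuples, rounded to 2 places."""
    return tuple(round(n / d, 2) for n, d in zip(numerator, denominator))


if __name__ == "__main__":
    o3k = results["Origin 3000 report p.9, 1 CPU"]
    for name in ("Intel_440BX_600, ncpus=1", "SGI_Origin2000-300, ncpus=1"):
        per_kernel = ratios(o3k, results[name])
        line = ", ".join(f"{k} {r}x" for k, r in zip(KERNELS, per_kernel))
        print(f"Origin 3000 1 CPU vs {name}: {line}")
```

So on these figures a single CPU comes out at roughly 1.9x to 2.2x both the 440BX and the Origin2000-300, i.e. about the same 2x ratio as the 4-cpu comparison, not the ~4X estimated from bus width.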