By: anon.1 (abc.delete@this.def.com), August 9, 2019 4:30 pm
Room: Moderated Discussions
Alberto (git.delete@this.git.it) on August 9, 2019 2:12 am wrote:
> Maynard Handley (name99.delete@this.name99.org) on August 8, 2019 8:58 pm wrote:
> > Alberto (git.delete@this.git.it) on August 8, 2019 3:42 pm wrote:
> > > Michael S (already5chosen.delete@this.yahoo.com) on August 8, 2019 8:29 am wrote:
> > > > SPECpower_ssj2008 Results:
> > > >
> > > > Lenovo Think System SR655, AMD EPYC 7742 2.25Ghz - 19,149 Overall ssj_ops/watt
> > > >
> > > > And several more results in the same range.
> > > > https://www.spec.org/power_ssj2008/results/res2019q3
> > > >
> > > > The best non-EPYC2 result:
> > > > ASUSTeK RS720-E9-RS8, Intel Xeon Pt 8280L, 2.7 GHz - 14,274 Overall ssj_ops/watt
> > > >
> > > >
> > > > The best EPYC1 result:
> > > > Dell PowerEdge R7425, AMD EPYC 7601, 2.20 GHz - 11,867 Overall ssj_ops/watt
> > > >
> > >
> > >
> > > Impressive results.
> > > Look like they chosen the best 7nm silicon available at TSMC to assemble these SKUs. Basically they
> > > have two times the cores (same clock) with only 40W more power consumption over Naples, not to mention
> > > the Zen 2 core is stronger so it draw more power than plain Zen on the same process and the
> > > level of power hungry off die interconnection is INSANE.
> > >
> > > A great showcase, now AMD has to prove to be able to supply the channels with Epyc. One thing
> > > is to supply limited quantities, another one millions and millions of SKUs.
> > >
> > > Bet an hypothetical very popular cpu ala 7742 could be rated between 280-300W
> > > to have enough silicon for customers scoring an acceptable profit.
> > >
> > > Great show :). Now i want to see the manufacturing/financial side of the story.
> > >
> >
> > Where do you imagine the limitations will be?
> > (a) We know that TSMC 7nm low power process can supply Apple volumes (~200M/year). Of course 7nm high
> > performance is not EXACTLY the same process, but what reason is there to imagine that a machine tuned
> > to deliver 100s of millions per year, and with a year of experience, can't make millions?
> >
> > (b) Why do you imagine yield to be an issue? Obviously Apple yields are acceptable.
> > And sure, AMD could, in some crazy fantasy, be cherry picking the 5% best performing
> > chiplets and tossing the rest. But how does that make ANY business sense?
> > We've seen the spread of AMD's offerings (eg here:
> > https://www.anandtech.com/show/14694/amd-rome-epyc-2nd-gen/4
> > )
> > and one feature that is remarkable is how TIGHTLY clustered the peak frequencies are,
> > and the regular pattern of the base frequencies relative to core count. It looks like
> > the process is delivering what are actually extremely NON-VARIABLE chiplets...
> >
> > If your theory made any sense, at the very least AMD would
> > be selling the golden, 5% chiplets at a substantial
> > premium, while offering a second tier of average chiplets at rather lower frequency and cost.
> >
> > (c) The third gating factor COULD be assembly of the final product. I've seen nothing yet
> > about the packaging of Rome -- the technology or who does the work. But it doesn't seem to
> > be any particular stretch in terms of grossly higher frequencies, pin/trace densities, or
> > any other important metric. Which suggests that it's business as usual for the packaging,
> > and that there's no reason to believe that packaging will limit the numbers AMD can ship.
> >
> > Which leads me to ask, once again, exactly what it is that you imagine will prevent AMD from being
> > able to ship at the very least the same sort of volumes they've shipped over the past year?
> > (Of course if they were too pessimistic in their forecasts, they may have initial
> > temporary glitches meeting demand, simply because, as they said, they didn't expect
> > that their competition over the next year would be so inadequate...)
>
> No they will ship, and very likely even larger quantities than past years.
> My point is that this SKU is in class of its own from manufacturing point of view.
> It is thinked for low/medium volume, not certain for destroy Intel.
>
> After all 7nm process is not a marvel and even on phone SOCs has clearly showed that the
> TSMC claim of half power versus old generation is only a dream for low clocked FPGAs.
>
> Magically here we have nearly half power, a clear indication that this line is
> a showcase, and definitively not a frontal attack to Intel market share. A more
> relaxed TDP could have signified a declaration of war against the Blue Team.
>
> Good for Intel, likely AMD do not want a bloody price war against a so bigger competitor.
>
>
Naples has the same interconnect. It's true that you don't need to traverse the interconnect if you are NUMA-aware, but that may not be the case all the time. A higher-IPC core burns more power but also gets more work done. The statistic presented is effectively ops/watt, and as long as the ratio of performance gain to power gain (process neutral) is larger than 1 (which architects should ensure), this should not be a surprise at all. Moreover, a larger LLC means lower activity on the interconnect (apart from the IPC boost), so in AMD's case, doubling the L3 cache likely helps power a lot more than it would for, say, Intel.
Also, afaik, the IOD is clocked at a lower rate than the cores. I think what you have a hard time believing is how this architecture, with all its "obvious" flaws, could be more efficient than Intel's.
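
To put rough numbers on the ops/watt argument, here is a minimal sketch in Python. The three scores are the ssj_ops/watt figures quoted at the top of the thread; the function names and the 1.15x / 1.10x inputs are hypothetical, chosen only to illustrate the ratio, and are not measured Zen 2 data.

```python
# Overall ssj_ops/watt figures as quoted earlier in this thread.
SSJ_OPS_PER_WATT = {
    "EPYC 7742": 19149,
    "Xeon Pt 8280L": 14274,
    "EPYC 7601": 11867,
}


def relative_efficiency(system: str, baseline: str) -> float:
    """How many times more ops/watt `system` delivers than `baseline`."""
    return SSJ_OPS_PER_WATT[system] / SSJ_OPS_PER_WATT[baseline]


def ops_per_watt_change(perf_gain: float, power_gain: float) -> float:
    """Change in ops/watt when performance and power both scale.

    A result above 1.0 means efficiency improved even though the
    core draws more power in absolute terms.
    """
    return perf_gain / power_gain


if __name__ == "__main__":
    for baseline in ("Xeon Pt 8280L", "EPYC 7601"):
        ratio = relative_efficiency("EPYC 7742", baseline)
        print(f"EPYC 7742 vs {baseline}: {ratio:.2f}x ops/watt")

    # Hypothetical inputs, for illustration only: a core that does
    # 15% more work while drawing 10% more power.
    print(f"ops/watt change: {ops_per_watt_change(1.15, 1.10):.3f}")
```

The point of the second function is just that a stronger core drawing more power is not in itself a contradiction: what matters for this benchmark is whether the work done grows faster than the power does.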