By: juanrga (nomail.delete@this.juanrga.com), April 11, 2021 4:28 am
Room: Moderated Discussions
Chester (lamchester.delete@this.gmail.com) on April 10, 2021 12:57 pm wrote:
> juanrga (nomail.delete@this.juanrga.com) on April 10, 2021 7:58 am wrote:
> > Anon (no.delete@this.spam.com) on April 9, 2021 10:06 am wrote:
> > > juanrga (nomail.delete@this.juanrga.com) on April 9, 2021 3:44 am wrote:
> > > > Cortex A72 achieving 4.2GHz on 7HPC:
> > > >
> > > > https://fuse.wikichip.org/news/2446/tsmc-demonstrates-a-7nm-arm-based-chiplet-design-for-hpc/
> > >
> > > Thanks, very interesting, so:
> > > 1) 4 A72 cores and 6MB L3 taking 27.28mm², so high performance transistors does indeed are
> > > a lot less dense, all those claims about small high perf ARM cores are just bullshit;
> > > 2) 1.375V to reach 4.2GHz, Zen 3 reaches 4.9GHz at that voltage, and the last 200MHz
> > > needed 0.175 extra volts, no way for a CPU like M1 or N1 going much futher than 3GHz.
> >
> > Zen3 uses a more mature process node. Zen2, which was released after the TSMC demo, achieves
> > about 4.35GHz at 1.375V, which is about 4% higher frequency than the demoed A72.
>
> Zen 2 is also a wider/deeper core that targets higher performance.
I wasn't comparing µarchs. I was explaining the effect of process nodes and goals on clocks.
The A72 cores in Graviton run at 2.3GHz. The same cores on a 7HPC node, pushed to the voltage limit, run at 4.2GHz.
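As a rough check on those numbers, the clock gains can be worked out directly. This is only a sketch: the function name clock_uplift is made up for illustration, and the inputs are simply the frequencies quoted in this thread.

```python
def clock_uplift(base_ghz, target_ghz):
    """Fractional clock increase going from base_ghz to target_ghz."""
    return target_ghz / base_ghz - 1.0

# Frequencies as quoted in this thread (not independently measured).
graviton_a72_ghz = 2.3   # A72 in Graviton
tsmc_demo_a72_ghz = 4.2  # A72 in the TSMC 7HPC demo at 1.375V
zen2_ghz = 4.35          # Zen 2 at roughly the same voltage

print(f"A72, 7HPC demo vs Graviton: {clock_uplift(graviton_a72_ghz, tsmc_demo_a72_ghz):.0%}")
print(f"Zen 2 vs A72 demo:          {clock_uplift(tsmc_demo_a72_ghz, zen2_ghz):.0%}")
```

With these inputs, the same core gains about 83% in clock from the node and design target alone, while the gap to Zen 2 at similar voltage is about 4%.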
>
> > M1 and N1 don't need to be clocked much higher than 3GHz. Zen3 needs to hit 4.9GHz to match the performance
> > of an M1 at 3.2GHz. And the N1 at 3.3GHz in Ampere beats (integer) the Zen2 core in the EPYC 7742.
>
> Ampere pulls a slight lead over Zen 2 Epyc 7742 in SPEC2017 ST. It loses slightly in LLVM
> compile and SPECJbb (by a lot if qos is concerned), which isn't a good sign for per-core
> perf since Ampere has a core count advantage. Not sure it's safe to say Ampere beats Zen
> 2 here. Also Zen 2 can clock much higher when in a lower core count desktop platform.
And on the same page where you got the LLVM results, you can find a NAMD benchmark where Ampere is 30% faster than Rome.
Zen 2 can clock much higher in a lower core count desktop platform because efficiency is sacrificed for performance. N1 cores implemented on the 7HPC node, in a lower core count chip aimed at high-performance desktops, could achieve high clocks as well (maybe around 3.8GHz).
And I remark again that N1 cores don't need to be clocked at the same frequency as x86 cores because they have a higher IPC.
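To put a number on the IPC argument: if performance is modelled as IPC × clock (a first-order assumption that ignores memory latency, boost behaviour, and workload differences), the M1 and Zen 3 figures quoted earlier in this thread imply the ratio below. The helper names are hypothetical.

```python
def implied_ipc_ratio(clock_a_ghz, clock_b_ghz):
    """IPC of core A relative to core B when both deliver equal performance,
    assuming perf = IPC * clock."""
    return clock_b_ghz / clock_a_ghz

def clock_needed(ipc_ratio, reference_clock_ghz):
    """Clock core A needs to match core B running at reference_clock_ghz,
    given A's IPC advantage over B."""
    return reference_clock_ghz / ipc_ratio

# Figures as quoted in this thread: M1 at 3.2GHz matched by Zen 3 at 4.9GHz.
ratio = implied_ipc_ratio(3.2, 4.9)
print(f"Implied IPC ratio: {ratio:.2f}x")
print(f"Clock needed to match a 4.9GHz core: {clock_needed(ratio, 4.9):.1f}GHz")
```

Under that simple model, a core with roughly 1.5x the IPC matches a 4.9GHz core at about 3.2GHz, which is the point about not needing the same frequency.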