By: --- (---.delete@this.redheron.com), December 5, 2021 11:45 am
Room: Moderated Discussions
Adrian (a.delete@this.acm.org) on December 5, 2021 2:54 am wrote:
> dmcq (dmcq.delete@this.fano.co.uk) on December 5, 2021 1:43 am wrote:
> > Adrian (a.delete@this.acm.org) on December 5, 2021 12:15 am wrote:
> > > Rayla (rayla.delete@this.example.com) on December 4, 2021 1:20 pm wrote:
> > > >
> > > > What slide shows that it only has one 256b SVE pipe? All they say is that they have
> > > > 256b SVE - which is in line with what the V1 has. It's not different from Intel saying
> > > > that ICL and SKL-SP have 512b AVX, despite having multiple 512b datapaths.
> > >
> > >
> > > I agree that the slide is ambiguous, but that area in the diagram matched
> > > the same area in the ARM diagram, except that the "2x" was removed.
> > >
> > > Also, this was not my own interpretation, but of Timothy Prickett Morgan from
> > > NextPlatform, who asked Amazon about this, but did not receive any reply yet.
> > >
> > > If Graviton 3 would have indeed a complete V1, running 2 x 256-bit FMA at 2.6 GHz
> > > with only a power consumption of 1.0 ... 1.25 W per core, that would be a huge improvement
> > > in energy efficiency over the existing CPUs, which does not seem likely.
> >
> > It has four SIMD units so I think it is practically definitely a V1 2x256 bit
> > SVE chip. There just wouldn't be any point in doing anything else. The cores
> > would be run at 2.6GHz and with 5nm that would cut the power down greatly.
> >
> >
>
>
> It would be nice if the 5-nm TSMC process would allow such a great reduction in power consumption,
> because if Graviton 3 uses full V1 cores, that means that a V1 in 5 nm can match the performance
> of an AMD Milan in 7 nm at less than half of the per core power consumption.
>
> When doing SVE/AVX 2 x 256 bit FMA, more than half of the power consumption is just in the
> FPU (about 60% for Intel/AMD, while for ARM the proportion should be greater, since the rest
> of the core is simpler), so the architecture of the CPU should matter much less for this limit
> case than the transistor characteristics determined by the manufacturing process.
Not necessarily.
G3 is designed to run at 3GHz, and the transistors are appropriately specced.
AMD and Intel both insist on their cores being able to execute single threaded at much higher GHz.
(a) Obviously, the cost of any particular operation is a lot higher at higher GHz. No surprise there. Is that 60% number at 5GHz or at 3GHz? And is it 60% of "core" power (ignoring L2/L3), of "SoC" power (ignoring DRAM), or of total power?
(b) Something that I cannot justify with any simple explanation, but which seems to be a practical reality, is that designs built to stretch to high GHz end up burning a lot of power even when they're not running at that stretch frequency. I assume this is some combination of the circuit techniques required to hit the frequencies, the particular tuning of the transistors, and the actual digital paths chosen (fewer levels of logic, even if the result is more paths that burn current).
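To put rough numbers on point (a): a minimal sketch using the textbook CMOS dynamic-power relation (switching energy per operation scales with C*V^2, power with C*V^2*f). The voltages below are purely hypothetical placeholders, not measured values for any Intel, AMD, ARM or Amazon part; the only point is that if a higher clock also needs a higher supply voltage, the cost per operation rises faster than the clock does. Leakage and point (b) are not modeled.

```python
def relative_cost(v_low, f_low, v_high, f_high):
    """Compare two operating points of the same circuit.

    Uses the textbook dynamic-power model:
      energy per operation ~ C * V^2
      power                ~ C * V^2 * f
    The capacitance C cancels out in the ratios.
    Returns (energy_ratio, power_ratio) of the high point vs the low point.
    """
    energy_ratio = (v_high / v_low) ** 2
    power_ratio = energy_ratio * (f_high / f_low)
    return energy_ratio, power_ratio


# HYPOTHETICAL operating points, chosen only for illustration:
# 0.8 V at 3 GHz versus 1.2 V at 5 GHz.
energy_ratio, power_ratio = relative_cost(0.8, 3e9, 1.2, 5e9)
print(f"energy per op: {energy_ratio:.2f}x, power: {power_ratio:.2f}x")
```

With those assumed voltages the same FMA costs about 2.25x the energy at the higher operating point, which is why it matters at which frequency the 60% figure was measured.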
There's also the fact that AMD and Intel have chosen to have the back-to-back latency of a fair number of their SIMD ops be one cycle. ARM (even Apple) have chosen to make that latency a minimum of two cycles. One can argue about why each made that decision and the circumstances for which it is optimal, but such a choice further exacerbates all the issues I raised about high frequency, and further ties into ARM/Apple being able to provide wide SIMD at reasonable power as opposed to x86's choice of a fireball SIMD unit.
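The latency point is just arithmetic on cycle time, sketched below. The 5GHz and 3GHz figures are the ones used earlier in this post; pairing them with one-cycle and two-cycle latency is my illustration, not a measured comparison of specific cores, and latch/clocking overhead is ignored.

```python
def time_budget_ps(freq_ghz, latency_cycles):
    """Wall-clock time available for a result to be produced and
    forwarded to a dependent op, in picoseconds."""
    cycle_ps = 1000.0 / freq_ghz  # one cycle at 1 GHz is 1000 ps
    return latency_cycles * cycle_ps


# Illustrative pairing: one-cycle SIMD op on a 5 GHz stretch design
# versus a two-cycle SIMD op on a 3 GHz design.
fast = time_budget_ps(5.0, 1)
slow = time_budget_ps(3.0, 2)
print(f"{fast:.0f} ps vs {slow:.0f} ps ({slow / fast:.2f}x more time)")
```

Under those assumptions the two-cycle design has over three times as long to get a result from one SIMD op to the next, which leaves room for slower, lower-power circuits.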
Topic | Posted By | Date |
---|---|---|
Some info about the Amazon Graviton 3 | Adrian | 2021/12/03 06:51 AM |
Some info about the Amazon Graviton 3 | Kara | 2021/12/03 07:01 AM |
Some info about the Amazon Graviton 3 | --- | 2021/12/03 10:03 AM |
Some info about the Amazon Graviton 3 | Kara | 2021/12/03 10:45 AM |
Some info about the Amazon Graviton 3 | Kara | 2021/12/03 07:05 AM |
Some info about the Amazon Graviton 3 | none | 2021/12/03 07:19 AM |
Some info about the Amazon Graviton 3 | Kara | 2021/12/03 07:36 AM |
N2, or V1? | Anon | 2021/12/03 07:52 AM |
N2, or V1? | Adrian | 2021/12/03 09:47 AM |
N2, or V1? | Adrian | 2021/12/03 09:52 AM |
N2, or V1? | G | 2021/12/03 10:25 AM |
N2, or V1? | Adrian | 2021/12/03 11:51 AM |
N2, or V1? | Wilco | 2021/12/03 02:58 PM |
N2, or V1? | Adrian | 2021/12/04 03:33 AM |
N2, or V1? | -.- | 2021/12/04 04:37 AM |
N2, or V1? | Rayla | 2021/12/04 02:20 PM |
N2, or V1? | Adrian | 2021/12/05 01:15 AM |
N2, or V1? | dmcq | 2021/12/05 02:43 AM |
N2, or V1? | Adrian | 2021/12/05 03:54 AM |
N2, or V1? | --- | 2021/12/05 11:45 AM |
N2, or V1? | Adrian | 2021/12/05 01:07 PM |
Other (minor) power factors? | Paul A. Clayton | 2021/12/06 07:37 AM |
N2, or V1? | Anon | 2021/12/04 10:53 PM |
N2, or V1? | Andrei F | 2021/12/05 04:22 AM |
Only 4 ALUs | Jörn Engel | 2021/12/03 07:37 PM |
Only 4 ALUs | Wilco | 2021/12/04 09:54 AM |
N2, or V1? | -.- | 2022/05/24 06:34 AM |
Graviton3 on Chip &Cheese | Per Hesselgren | 2022/06/17 06:19 AM |