By: Wilco (wilco.dijkstra.delete@this.ntlworld.com), July 18, 2022 3:01 am
Room: Moderated Discussions
Anon (no.delete@this.spam.com) on July 17, 2022 8:46 pm wrote:
> Wilco (wilco.dijkstra.delete@this.ntlworld.com) on July 17, 2022 3:50 pm wrote:
> > Having a lower frequency and hardly any L3 cache. If you look at the subscores, you
> > can tell that it does well on anything that fits in L2, but actually scores far worse
> > than Altra in memory intensive benchmarks. So is your conclusion that the memory bandwidth
> > was reduced in Altra Max? Or is it simply lacking a properly sized L3?
>
> So, you think 0.4MB L3 per core vs 0.25 MB L3 per core made a huge difference?
>
> Let's me give you another hypothesis: all subsets Altra Max performed worse was
> because it was bandwidth starved and, as usual, when you add threads to an already
> bandwidth starved workload you reduce performance by worsening locality.
>
> If you disagree, how you explain so a little (really litle, 0.15 MB per core
> with each core already having 1MB L2) L3 giving such performance delta?
Ever heard of tipping points or disk thrashing? If you understand that you should be able to get that the same happens when your working set does not fit in your caches, resulting in memory latency going through the roof. This has absolutely nothing to do with bandwidth.
Altra Max achieves far higher bandwidth than EPYC for any number of cores. EPYC is ~2.3 times faster on MCF despite the far lower bandwidth. And Yitian is ~3.6 faster than Altra Max on MCF with a 50% increase in memory bandwidth.
Could it be that big L3 caches actually help, or do you think AMD was stupid to add a huge L3 to their design?
Wilco
> Wilco (wilco.dijkstra.delete@this.ntlworld.com) on July 17, 2022 3:50 pm wrote:
> > Having a lower frequency and hardly any L3 cache. If you look at the subscores, you
> > can tell that it does well on anything that fits in L2, but actually scores far worse
> > than Altra in memory intensive benchmarks. So is your conclusion that the memory bandwidth
> > was reduced in Altra Max? Or is it simply lacking a properly sized L3?
>
> So, you think 0.4MB L3 per core vs 0.25 MB L3 per core made a huge difference?
>
> Let's me give you another hypothesis: all subsets Altra Max performed worse was
> because it was bandwidth starved and, as usual, when you add threads to an already
> bandwidth starved workload you reduce performance by worsening locality.
>
> If you disagree, how you explain so a little (really litle, 0.15 MB per core
> with each core already having 1MB L2) L3 giving such performance delta?
Ever heard of tipping points or disk thrashing? If you understand that you should be able to get that the same happens when your working set does not fit in your caches, resulting in memory latency going through the roof. This has absolutely nothing to do with bandwidth.
Altra Max achieves far higher bandwidth than EPYC for any number of cores. EPYC is ~2.3 times faster on MCF despite the far lower bandwidth. And Yitian is ~3.6 faster than Altra Max on MCF with a 50% increase in memory bandwidth.
Could it be that big L3 caches actually help, or do you think AMD was stupid to add a huge L3 to their design?
Wilco
Topic | Posted By | Date |
---|---|---|
Yitian 710 | anonymous2 | 2021/10/20 08:57 PM |
Yitian 710 | Adrian | 2021/10/21 12:20 AM |
Yitian 710 | Wilco | 2021/10/21 03:47 AM |
Yitian 710 | Rayla | 2021/10/21 05:52 AM |
Yitian 710 | Wilco | 2021/10/21 11:59 AM |
Yitian 710 | anon2 | 2021/10/21 05:16 PM |
Yitian 710 | Wilco | 2022/07/16 12:21 PM |
Yitian 710 | Anon | 2022/07/16 08:22 PM |
Yitian 710 | Rayla | 2022/07/17 09:10 AM |
Yitian 710 | Anon | 2022/07/17 12:04 PM |
Yitian 710 | Rayla | 2022/07/17 12:08 PM |
Yitian 710 | Wilco | 2022/07/17 01:16 PM |
Yitian 710 | Anon | 2022/07/17 01:32 PM |
Yitian 710 | Wilco | 2022/07/17 02:22 PM |
Yitian 710 | Anon | 2022/07/17 02:47 PM |
Yitian 710 | Wilco | 2022/07/17 03:50 PM |
Yitian 710 | Anon | 2022/07/17 08:46 PM |
Yitian 710 | Wilco | 2022/07/18 03:01 AM |
Yitian 710 | Anon | 2022/07/19 11:21 AM |
Yitian 710 | Wilco | 2022/07/19 06:15 PM |
Yitian 710 | Anon | 2022/07/21 01:25 AM |
Yitian 710 | none | 2022/07/21 01:49 AM |
Yitian 710 | Anon | 2022/07/21 03:03 AM |
Yitian 710 | none | 2022/07/21 04:34 AM |
Yitian 710 | James | 2022/07/21 02:29 AM |
Yitian 710 | Anon | 2022/07/21 03:05 AM |
Yitian 710 | Wilco | 2022/07/21 04:31 AM |
Yitian 710 | Anon | 2022/07/21 05:17 AM |
Yitian 710 | Wilco | 2022/07/21 05:33 AM |
Yitian 710 | Anon | 2022/07/21 05:50 AM |
Yitian 710 | Wilco | 2022/07/21 06:07 AM |
Yitian 710 | Anon | 2022/07/21 06:20 AM |
Yitian 710 | Wilco | 2022/07/21 10:02 AM |
Yitian 710 | Anon | 2022/07/21 10:22 AM |
Yitian 710 | Adrian | 2022/07/17 11:09 PM |
Yitian 710 | Wilco | 2022/07/18 01:15 AM |
Yitian 710 | Adrian | 2022/07/18 02:35 AM |
Yitian 710 | Adrian | 2022/07/16 11:19 PM |
Computations on Big Integers | Bill G | 2022/07/25 10:06 PM |
Computations on Big Integers | none | 2022/07/25 11:35 PM |
x86 MUL 64x64 | Eric Fink | 2022/07/26 01:06 AM |
x86 MUL 64x64 | Adrian | 2022/07/26 02:27 AM |
x86 MUL 64x64 | none | 2022/07/26 02:38 AM |
x86 MUL 64x64 | Jörn Engel | 2022/07/26 10:17 AM |
x86 MUL 64x64 | Linus Torvalds | 2022/07/27 10:13 AM |
x86 MUL 64x64 | ⚛ | 2022/07/28 09:40 AM |
x86 MUL 64x64 | Jörn Engel | 2022/07/28 10:18 AM |
More than 3 registers per instruction | -.- | 2022/07/28 07:01 PM |
More than 3 registers per instruction | Anon | 2022/07/28 10:39 PM |
More than 3 registers per instruction | Jörn Engel | 2022/07/28 10:42 PM |
More than 3 registers per instruction | -.- | 2022/07/29 04:31 AM |
Computations on Big Integers | Bill G | 2022/07/26 01:40 AM |
Computations on Big Integers | none | 2022/07/26 02:17 AM |
Computations on Big Integers | Bill G | 2022/07/26 03:52 AM |
Computations on Big Integers | --- | 2022/07/26 09:57 AM |
Computations on Big Integers | Adrian | 2022/07/26 02:53 AM |
Computations on Big Integers | Bill G | 2022/07/26 03:39 AM |
Computations on Big Integers | Adrian | 2022/07/26 04:21 AM |
Computations on Big Integers in Apple AMX Units | Bill G | 2022/07/26 04:28 AM |
Computations on Big Integers in Apple AMX Units | Adrian | 2022/07/26 05:13 AM |
Typo | Adrian | 2022/07/26 05:20 AM |
IEEE binary64 is 53 bits rather than 52. (NT) | Michael S | 2022/07/26 05:34 AM |
IEEE binary64 is 53 bits rather than 52. | Adrian | 2022/07/26 07:32 AM |
IEEE binary64 is 53 bits rather than 52. | Michael S | 2022/07/26 10:02 AM |
IEEE binary64 is 53 bits rather than 52. | Adrian | 2022/07/27 06:58 AM |
IEEE binary64 is 53 bits rather than 52. | none | 2022/07/27 07:14 AM |
IEEE binary64 is 53 bits rather than 52. | Adrian | 2022/07/27 07:55 AM |
Thanks a lot for the link to the article! (NT) | none | 2022/07/27 08:09 AM |
Typo | zArchJon | 2022/07/26 09:51 AM |
Typo | Michael S | 2022/07/26 10:25 AM |
Typo | zArchJon | 2022/07/26 11:52 AM |
Typo | Michael S | 2022/07/26 01:02 PM |
Computations on Big Integers | Michael S | 2022/07/26 05:55 AM |
Computations on Big Integers | Adrian | 2022/07/26 07:59 AM |
IFMA and Division | Bill G | 2022/07/26 04:25 PM |
IFMA and Division | rwessel | 2022/07/26 08:16 PM |
IFMA and Division | Adrian | 2022/07/27 07:25 AM |
Computations on Big Integers | none | 2022/07/27 01:22 AM |
Big integer multiplication with vector IFMA | Bill G | 2022/07/29 01:06 AM |
Big integer multiplication with vector IFMA | Adrian | 2022/07/29 01:35 AM |
Big integer multiplication with vector IFMA | -.- | 2022/07/29 04:32 AM |
Big integer multiplication with vector IFMA | Adrian | 2022/07/29 09:47 PM |
Big integer multiplication with vector IFMA | Anon | 2022/07/30 08:12 AM |
Big integer multiplication with vector IFMA | Adrian | 2022/07/30 09:27 AM |
AVX-512 unfriendly to heter-performance cores | Paul A. Clayton | 2022/07/31 03:20 PM |
AVX-512 unfriendly to heter-performance cores | Anon | 2022/07/31 03:33 PM |
AVX-512 unfriendly to heter-performance cores | anonymou5 | 2022/07/31 05:03 PM |
AVX-512 unfriendly to heter-performance cores | Brett | 2022/07/31 07:26 PM |
AVX-512 unfriendly to heter-performance cores | Adrian | 2022/08/01 01:45 AM |
Why can't E-cores have narrow/slow AVX-512? (NT) | anonymous2 | 2022/08/01 03:37 PM |
Why can't E-cores have narrow/slow AVX-512? | Ivan | 2022/08/02 12:09 AM |
Why can't E-cores have narrow/slow AVX-512? | anonymou5 | 2022/08/02 10:13 AM |
Why can't E-cores have narrow/slow AVX-512? | Dummond D. Slow | 2022/08/02 03:02 PM |
AVX-512 unfriendly to heter-performance cores | Paul A. Clayton | 2022/08/02 01:19 PM |
AVX-512 unfriendly to heter-performance cores | Anon | 2022/08/02 09:09 PM |
AVX-512 unfriendly to heter-performance cores | Adrian | 2022/08/03 12:50 AM |
AVX-512 unfriendly to heter-performance cores | Anon | 2022/08/03 09:15 AM |
AVX-512 unfriendly to heter-performance cores | -.- | 2022/08/03 08:17 PM |
AVX-512 unfriendly to heter-performance cores | Anon | 2022/08/03 09:02 PM |
IFMA: empty promises from Intel as usual | Kent R | 2022/07/29 07:15 PM |
No hype lasts forever | Anon | 2022/07/30 08:06 AM |
Big integer multiplication with vector IFMA | me | 2022/07/30 09:15 AM |
Computations on Big Integers | --- | 2022/07/26 09:48 AM |
Computations on Big Integers | none | 2022/07/27 01:10 AM |
Computations on Big Integers | --- | 2022/07/28 11:43 AM |
Computations on Big Integers | --- | 2022/07/28 06:44 PM |
Computations on Big Integers | dmcq | 2022/07/26 02:27 PM |
Computations on Big Integers | Adrian | 2022/07/27 08:15 AM |
Computations on Big Integers | Brett | 2022/07/27 11:07 AM |
Yitian 710 | Wes Felter | 2021/10/21 12:51 PM |
Yitian 710 | Adrian | 2021/10/21 01:25 PM |
Yitian 710 | Anon | 2021/10/21 06:08 AM |
Strange definition of the word single. (NT) | anon2 | 2021/10/21 05:00 PM |
AMD Epyc uses chiplets. This is why "strange"? | Mark Roulo | 2021/10/21 05:08 PM |
AMD Epyc uses chiplets. This is why "strange"? | anon2 | 2021/10/21 05:34 PM |
Yeah. Blame spec.org, too, though! | Mark Roulo | 2021/10/21 05:58 PM |
Yeah. Blame spec.org, too, though! | anon2 | 2021/10/21 08:07 PM |
Yeah. Blame spec.org, too, though! | Björn Ragnar Björnsson | 2022/07/17 06:23 AM |
Yeah. Blame spec.org, too, though! | Rayla | 2022/07/17 09:13 AM |
Yeah. Blame spec.org, too, though! | Anon | 2022/07/17 12:01 PM |