By: Dummond D. Slow (mental.delete@this.protozoa.us), January 4, 2021 12:11 pm
Room: Moderated Discussions
Adrian (a.delete@this.acm.org) on January 2, 2021 2:00 pm wrote:
> Dummond D. Slow (mental.delete@this.protozoa.us) on January 2, 2021 12:50 pm wrote:
> > Adrian (a.delete@this.acm.org) on January 2, 2021 2:45 am wrote:
> > > Adrian (a.delete@this.acm.org) on January 1, 2021 1:28 pm wrote:
> > > >
> > > > Now I have just replaced the 3700X with a 5900X, and ECC
> > > > seems to work OK starting with the Linux kernel 5.10.
> > > >
> > > > However, I have not yet repeated the memory overclocking test with the new CPU to see whether the
> > > > errors are really reported, but I intend to do so when I have some spare time.
> > >
> > >
> > >
> > > I think that it is interesting to mention that after finally having direct access
> > > to a Zen 3 CPU, I have verified that it is faster in single-thread than Apple M1.
> > >
> > > The Apple M1 was advantaged at launch, because the available benchmarks that could be compared with it
> > > at that time were done poorly on Tiger Lake and Zen 3, making Apple M1 appear better than it is.
> > >
> > > For example, Tiger Lake @ 4.8 GHz should reach a Geekbench 5 single-thread score
> > > of around 1750, exactly the same as Apple M1 @ 3.2 GHz (up to 1752, many close to
> > > 1750), if we extrapolate from the good scores recorded for Tiger Lake @ 4.7 GHz.
> > >
> > > However, all the i7-1185G7 scores in the GB5 database are much lower, lower even than the
> > > good scores of the slower i7-1165G7, so I believe that the few existing laptop models that
> > > have the i7-1185G7 suck badly, and we will see the actual speed of Tiger Lake @ 4.8 GHz only
> > > when the Intel NUC with it becomes available, in a few months (even Intel is expected to
> > > launch an Intel NUC with the i7-1165G7 first, unlike in the past, when they used only the top
> > > speed for themselves; so they must still have serious yield problems with the top SKU).
> > >
> > >
> > > So for Intel Tiger Lake there is no direct proof yet of exactly how it is positioned against Apple M1.
> > >
> > > On the other hand, at the Apple M1 launch there were only poor benchmarks
> > > done on Zen 3, so it seemed slower in single-thread than M1.
> > >
> > > Meanwhile many better benchmarks have accumulated for Zen 3 (a lot of GB5 ST
> > > scores over 1800 at the nominal clock frequencies) and I now have my own sample,
> > > so I could verify that the high Zen 3 scores are the correct scores.
> > >
> >
> > Did you run on Linux? (I assume yes).
> > Also, how good was your RAM? I wonder how much that influences the GB5
> > result, given how absurdly variable they are for the same CPUs.
> >
>
>
> Yes, I run Linux and most GB5 Linux scores are indeed typically higher than
> almost all Windows scores, especially for single-thread, even if, for most CPUs,
> there are also a few Windows scores in the range of the Linux scores.
>
> While the real cause for this is unknown, my opinion is that this is due to the lack of
> control in Windows of which programs run on the computer. There are a myriad of Windows
> services that wake up and run whenever they desire, without control from the computer owner
> and also various other extra junk services, e.g. anti-viruses or telemetry programs.
>
> I believe that these Windows services interfere randomly with the single-threaded benchmarks,
> while on a well-configured Linux you may be certain enough that you will not have undesirable
> daemons running when you do not want them to run. This is consistent with the fact that I have
> also seen a few high Windows scores, with values similar to the Linux scores.
>
> My memory is very slow, DDR4-2667, because the upgraded computer was built in 2019, when there were no
> faster ECC UDIMMs. Now DDR4-3200 ECC UDIMMs have appeared, but I have not upgraded the memory yet.
>
> As is well known, Ryzen is particularly sensitive to memory speed, because
> the communication links between chiplets run at the memory speed.
>
> I assume that due to the very slow memory, my own GB5 score was only 1790 @ 4.8 GHz (frequency
> from the .gb5 file), while I have seen a lot of other Linux GB5 scores above 1800 and up
> to 1876. Moreover, when comparing with the other higher Zen 3 GB5 scores, many subtests were
> the same, just a few tests were lower on my sample, e.g. the encryption test, and that is
> consistent with the supposition that my score was lowered by the slow memory.
>
> Nevertheless even my score, which is low for a Ryzen 9 5900X under Linux, due to the
> slow memory, was significantly higher than the best Apple M1 score, which is 1752.
>
Interesting, so that was actually with fairly slow RAM (I assume the timings/subtimings were not tweaked to death either, then).
>
> On Ryzen, an extra cause for variability is the core on which GB5 happens to run the tests. I do not know
> how the core is selected, but I do not believe that the best core is chosen, at least on Linux. On my CPU,
> the best 2 or 3 cores have single-thread frequencies between 4.85 and 4.95 GHz, depending on core temperature,
> while the slowest cores have frequencies between 4.75 and 4.85 GHz, so I believe that the GB5 test happened
> to run in my case on one of the slower cores of my CPU, while the fastest GB5 results from the database
> have been run on fast cores, mostly at 4.9 or 4.95 GHz, or even up to 5.05 GHz on a 5950X.
>
> I believe that GB5 pins the tests to a single core, because the clock frequency is constant during the tests,
> but I am not aware whether there is any method on Linux to determine automatically which are the fastest cores.
> I know that there is some support for Intel Turbo Boost Max 3.0, which works similarly for Intel, but even for Intel
> I do not know whether the existing Linux support provides a way to discover which are the fastest cores.
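
On the question of finding the fastest cores: one thing that might be worth trying is the ACPI CPPC data that the kernel exposes in sysfs. A rough Python sketch follows. It assumes the kernel was built with ACPI CPPC support and that the firmware actually reports distinct per-core values in `acpi_cppc/highest_perf`; on many systems the attribute is missing or identical for every core, in which case the list comes back empty or flat. The function name `rank_cores_by_cppc` is made up for this example.

```python
from pathlib import Path


def rank_cores_by_cppc(sysfs_cpu_dir="/sys/devices/system/cpu"):
    """Return [(cpu_index, highest_perf), ...] sorted best core first.

    Assumption: each cpuN directory may contain acpi_cppc/highest_perf,
    a unitless ranking value where higher means a faster core.
    """
    ranking = []
    for cpu_dir in Path(sysfs_cpu_dir).glob("cpu[0-9]*"):
        try:
            index = int(cpu_dir.name[3:])  # "cpu12" -> 12
            text = (cpu_dir / "acpi_cppc" / "highest_perf").read_text()
            perf = int(text.strip())
        except (OSError, ValueError):
            continue  # attribute missing or unreadable for this CPU
        ranking.append((index, perf))
    # Highest value first; ties are broken by the lower CPU index.
    ranking.sort(key=lambda item: (-item[1], item[0]))
    return ranking


if __name__ == "__main__":
    for cpu, perf in rank_cores_by_cppc():
        print(f"cpu{cpu}: highest_perf={perf}")
```

Note that the indices are logical CPUs, so with SMT enabled two entries belong to each physical core.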
>
> On Linux, with the default scheduler, if you do not explicitly pin a process to a core, it is
> migrated randomly across all cores, so on Ryzen the clock frequency would change whenever passing
> to a new core, because each core might have a different maximum turbo clock frequency.
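
For anyone who wants to take migration out of the picture, explicit pinning on Linux can be done from Python with the standard `os.sched_setaffinity` call. This is only a minimal sketch: the helper name `run_pinned` is hypothetical, and it says nothing about how GB5 itself selects a core.

```python
import os


def run_pinned(core, func, *args):
    """Run func(*args) while the caller is restricted to one logical CPU."""
    previous = os.sched_getaffinity(0)  # pid 0 refers to the caller
    os.sched_setaffinity(0, {core})     # allow only the chosen CPU
    try:
        return func(*args)
    finally:
        os.sched_setaffinity(0, previous)  # restore the original mask


if __name__ == "__main__":
    # Pin to the lowest-numbered CPU we are allowed to use and show the mask.
    core = min(os.sched_getaffinity(0))
    print(run_pinned(core, os.sched_getaffinity, 0))
```

From a shell, `taskset -c N command` from util-linux has the same effect for a whole program.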
>
>
>
>
> > IIRC GB5 scores way higher on macOS, and it also scores higher on the same hardware under
> > Linux compared to Windows. The reviews of M1 generally compared macOS results to AMD/Intel
> > results under Windows, arguably a mistake, making M1 look faster than it really is.
> >
> >
> > > Therefore now it is clear that a desktop Zen 3 is faster in single-thread
> > > than Apple M1 (obviously at a much greater power per core, of over 20 W).
> > >
> > >
> > > For example, in GB5 ST, Zen 3 is faster by about 2% @ 4.8 GHz (e.g. 1790
> > > vs. 1752) and up to about 7% @ 5.05 GHz (over 1850, up to 1876).
> > >
> > > In computational benchmarks where the number and speed of the available execution resources matter
> > > most, unlike in GB5 or SPEC, where the higher *average* IPC of Apple shines, the advantage of Zen
> > > 3 over Apple M1 increases, e.g. to over 14% @ 4.9 GHz for gmpbench (7337 vs. 6422).
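
The percentages quoted above are easy to check. The small Python sketch below uses only the scores and clock frequencies given in the quoted post; the helper names are made up, and points per GHz is just a crude stand-in for average IPC.

```python
def percent_faster(score, baseline):
    """How much higher score is than baseline, in percent."""
    return (score / baseline - 1.0) * 100.0


def points_per_ghz(score, ghz):
    """Score divided by clock frequency, a rough per-clock measure."""
    return score / ghz


print(round(percent_faster(1790, 1752), 1))  # GB5 ST, Zen 3 @ 4.8 GHz vs. M1: 2.2
print(round(percent_faster(1876, 1752), 1))  # GB5 ST, best Zen 3 vs. M1: 7.1
print(round(percent_faster(7337, 6422), 1))  # gmpbench, Zen 3 vs. M1: 14.2

print(round(points_per_ghz(1752, 3.2), 1))   # M1 @ 3.2 GHz: 547.5
print(round(points_per_ghz(1790, 4.8), 1))   # Zen 3 @ 4.8 GHz: 372.9
```

So the Zen 3 lead in GB5 ST comes entirely from clock frequency, while per clock the M1 is well ahead.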
> > >
> >
>
>