who is the champ right now?

By: Adrian (a.delete@this.acm.org), May 26, 2022 7:10 am
Andrei F on May 26, 2022 6:16 am wrote:
Adrian on May 26, 2022 6:09 am wrote:
-.- on May 26, 2022 5:23 am wrote:
Michael S on May 26, 2022 3:36 am wrote:
> > > > Sorry for may be naive question, in recent couple of years I was not following.
> > > > How the ST field at the fastest stock frequency (no overcloking) looks by now?
> > > > Say, Apple M1 (Pro/Max/Ultra) vs Zen3 with big cache vs regular
> > > > Zen3 vs Alder Lake vs Rocket/Tiger Lake vs Comet Lake?
> > > > Non-FP scores are of primary interest.
> > >
> > > Not sure why you bring up Intel's older processors, but again, depends on what you measure exactly.
> > >
> > > SPEC2006 is somewhat an industry standard, and you can find a ranking here:
> > > https://www.anandtech.com/bench/CPU-2020/2797 - ranking does include SPECfp,
> > > so you'd have to go to individual scores if you just want SPECInt results.
> > >
> > > The AMD 5800X3D and Intel 12900KS are missing from the list, though
> > > I'd still expect Alder Lake to come out on top for SPEC.
> >
> >
> > SPEC2006 might be an industry standard, but I do not trust at all that list of results from Anandtech.
> >
> > For benchmarks written in high-level languages, the compiler used and the compilation options
> > can be the cause of much higher differences in benchmark results, than the differences in
> > actual achievable CPU performance, which are very small for the top competitors.
> >
> > Only for a few of the CPUs listed by Anandtech it is possible to discover, by diving into the corresponding
> > Anandtech articles, the compiler versions and the compiler options. Even in those cases, the values used do
> > not appear to be optimal or uniform across the reviews separated by long times, of many months or years.
> >
> > It is pretty certain that Apple M1 has a performance that is overestimated on that list, by
> > using a clang version appropriate for it, while the x86 CPUs use also clang, with less appropriate
> > versions and options, instead of using a new enough gcc with the best options.
> >
> >
> >
> >
> >
> This complete utter nonsense. The compiler flags were set as optimally as possible on
> the x86 binaries, and they're all as close as possible to apples to apples. Result sets
> within given articles or database are all with the same versions and settings.

For meaningful results, I would hope that the same compiler versions had been used.

However, before writing the previous message, I tried to follow the links from the Anandtech graph and search for the compiler versions and options. More often than not I could not find them, so I could not verify whether this assertion is true.

The links pointed to the reviews published by Anandtech at the launch of those CPUs.

At the time of the older reviews, no compiler version yet existed that could tune its optimization for the CPUs covered by the newest reviews.

So if one old clang version has been used for all x86 CPUs, the newer CPUs must have been disadvantaged.

I am pretty certain that the Apple clang version used for M1 cannot have been the same as the one used for the old x86 CPUs, as it was not yet available at that time.

Maybe the flags have been set "as optimally as possible on the x86 binaries", but I am not convinced. In one example that I found by following the links, the code was compiled for Haswell even though the CPU under test was an Ice Lake. I do not have much experience with optimization using clang, so maybe optimizing for another CPU does not make any difference in this case, but with gcc, specifying the correct CPU would have had an effect.

Moreover, the Apple M1 clang cannot be said to be more similar to the x86 clang than to the x86 gcc. For the fairest comparison between x86 and Apple, the best available compiler should have been chosen on each platform, and on x86 I doubt that clang is that.

I understand very well that producing such benchmark results is a lot of work. It would not be worthwhile to retest all the older systems together with each new system, just so that all of them use the newest compiler appropriate for the new CPU and all the old results are updated to be more accurately comparable.

The Anandtech database serves a useful purpose as it is, but it must be understood that when results are separated by only a few percent, retesting with a different compiler or different options is very likely to change the order of the CPUs. So the Anandtech order cannot be said to represent with any certainty the SPEC CPU2006 order that would be obtained when applying the benchmark rules, i.e. using the best compiler with the best options for each CPU.
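
To make the last point concrete, here is a minimal Python sketch of the reasoning. Everything in it is an assumption for illustration: the CPU names and scores are made up (they are not Anandtech or SPEC results), and the 3% figure stands in for "a few percent" of compiler-induced variation.

```python
def ambiguous_pairs(scores, rel_uncertainty):
    """Return (faster, slower) pairs whose order could flip.

    scores: dict mapping a CPU name to its measured score (higher is better).
    rel_uncertainty: assumed relative swing that a different compiler or
    different options could cause, e.g. 0.03 for 3%.
    """
    # Sort names from highest to lowest measured score.
    names = sorted(scores, key=scores.get, reverse=True)
    pairs = []
    for i, faster in enumerate(names):
        for slower in names[i + 1:]:
            worst_case_faster = scores[faster] * (1 - rel_uncertainty)
            best_case_slower = scores[slower] * (1 + rel_uncertainty)
            # If the ranges overlap, the measured order is not reliable.
            if worst_case_faster < best_case_slower:
                pairs.append((faster, slower))
    return pairs


# Hypothetical scores, not real measurements.
example_scores = {"cpu_a": 100.0, "cpu_b": 97.0, "cpu_c": 80.0}
print(ambiguous_pairs(example_scores, 0.03))
```

With these made-up numbers only the cpu_a/cpu_b pair is reported: a 3% gap is inside the assumed compiler swing, while the gap to cpu_c is large enough that its place in the order would survive a retest.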

