Amazon's special Skylake Sauce

By: Travis Downs (travis.downs.delete@this.gmail.com), November 25, 2018 2:50 pm
Room: Moderated Discussions
The SKX chip Amazon is using for their EC2 c5 instances is a bit mysterious: the Xeon 8124M. You can't find much information about it.

I decided to measure the turbo frequencies for all active core counts from 1 to 18. (Actually from 1 to 36, because I rented the full thing, which is a two-socket system, but I wouldn't expect cores on the other socket to affect the frequencies.)

I found something pretty interesting:

The chip exhibits zero turbo frequency scaling with active core count!

That is, the single-core turbo frequencies are exactly the same as the turbo frequencies when all 18 cores are running flat out.

In particular, the chip always runs at 3.4/3.3/2.9 GHz for the L0, L1 and L2 licenses, which correspond respectively to scalar/128-bit/light 256-bit code (L0), heavy 256-bit/light 512-bit code (L1), and heavy 512-bit code (L2).
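To make the license mapping concrete, here is a minimal sketch of it as a lookup table. The frequencies and instruction classes are the ones given above; the names `LICENSE_FOR_CLASS`, `TURBO_GHZ_8124M` and `turbo_ghz` are just made up for illustration and are not part of my test code.

```python
# License level for each instruction class, as described in the post.
LICENSE_FOR_CLASS = {
    "scalar": "L0",
    "128-bit": "L0",
    "light 256-bit": "L0",
    "heavy 256-bit": "L1",
    "light 512-bit": "L1",
    "heavy 512-bit": "L2",
}

# Measured turbo frequency in GHz for each license on the Xeon 8124M.
TURBO_GHZ_8124M = {"L0": 3.4, "L1": 3.3, "L2": 2.9}


def turbo_ghz(instr_class: str, active_cores: int) -> float:
    """Return the observed turbo frequency (GHz) for one socket of the 8124M.

    active_cores is validated but otherwise ignored: that is the whole
    point of the measurement, the frequency does not depend on it.
    """
    if not 1 <= active_cores <= 18:
        raise ValueError("active_cores must be between 1 and 18")
    return TURBO_GHZ_8124M[LICENSE_FOR_CLASS[instr_class]]


if __name__ == "__main__":
    for cls in LICENSE_FOR_CLASS:
        print(f"{cls:>14}: {turbo_ghz(cls, 1)} GHz at 1 core, "
              f"{turbo_ghz(cls, 18)} GHz at 18 cores")
```

On a retail chip the second argument would matter, and the table would need one row per active core count.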

That's interesting because you won't find any retail chip like that - they all exhibit some lower turbo frequencies once you go past two active cores, at least for some license levels, and the scaling is usually significant at higher core counts, especially for heavy 512-bit code.

In fact, the 2.9 GHz L2/AVX-512 speed makes this chip the speed champion for heavy AVX-512 code at 18 cores: no chip runs 18 cores faster. The closest competitor is the Gold 6154 which runs at 2.7 GHz.

Overall, despite the "8" prefix indicating it might be in the Platinum family, this chip most resembles that Gold 6154 chip. That chip has the same number of cores and shows no core-count scaling for scalar code, running at 3.7 GHz constantly. It scales a bit for L1 (flatlining at 3.3 GHz for 5+ cores, very similar to Amazon's chip), and goes down to 2.7 GHz for L2/heavy AVX-512. The TDP of that chip is 205W while Amazon's is reported to be 240W (but I don't know how reliable that figure is). Amazon's virtualization doesn't seem to export the various power-related MSRs for these c5 instances (although it does for others), so I can't read those.
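As a rough worked comparison, the sketch below puts the two chips side by side with all 18 cores active, using only the frequencies quoted in this post. The names `AT_18_CORES_GHZ` and `relative_speed` are invented for the example, and a frequency ratio is only a proxy for actual throughput.

```python
# Frequencies in GHz with all 18 cores active, per license level.
# 8124M values are my measurements; Gold 6154 values are the ones
# quoted above (3.7 scalar, 3.3 for L1 at 5+ cores, 2.7 for L2).
AT_18_CORES_GHZ = {
    "Xeon 8124M": {"L0": 3.4, "L1": 3.3, "L2": 2.9},
    "Gold 6154": {"L0": 3.7, "L1": 3.3, "L2": 2.7},
}


def relative_speed(license_level: str,
                   chip: str = "Xeon 8124M",
                   baseline: str = "Gold 6154") -> float:
    """Ratio of chip frequency to baseline frequency at 18 active cores."""
    table = AT_18_CORES_GHZ
    return table[chip][license_level] / table[baseline][license_level]


if __name__ == "__main__":
    for level in ("L0", "L1", "L2"):
        print(f"{level}: 8124M runs at {relative_speed(level):.3f}x "
              f"the Gold 6154 frequency")
```

So, by clock speed alone, Amazon's chip gives up some scalar frequency, ties at L1, and comes out ahead for heavy AVX-512.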

Using a chip like this makes a lot of sense: it means that the noisy neighbor problem doesn't apply to core frequencies at least, so you can expect stable CPU frequencies regardless of what your co-tenants are doing. This is definitely not true for earlier instance types, where I observe a lot of fluctuation in the turbo ratios.

If anyone wants to run this on other cloud platforms (or anywhere), the code is on github. Results for non-retail chips are of course especially interesting. The output is a bit voluminous for high core counts, but if you paste it somewhere I can interpret it.