Tiger Lake performance profile

By: Andrei F (andrei.delete@this.anandtech.com), September 21, 2020 5:50 am
Room: Moderated Discussions
Michael S (already5chosen.delete@this.yahoo.com) on September 21, 2020 1:38 am wrote:
> Travis Downs (travis.downs.delete@this.gmail.com) on September 20, 2020 5:34 pm wrote:
> > Michael S (already5chosen.delete@this.yahoo.com) on September 20, 2020 10:02 am wrote:
> > > Travis Downs (travis.downs.delete@this.gmail.com) on September 19, 2020 8:26 pm wrote:
> > > > Andrei F (andrei.delete@this.anandtech.com) on September 18, 2020 1:04 am wrote:
> > > > > anon (anon.delete@this.anon.com) on September 17, 2020 7:10 pm wrote:
> > > > > > AnandTech's (SPEC ST performance) review is here: anandtech.com/show/16084/intel-tiger-lake-review-deep-dive-core-11th-gen/8
> > > > > > However not all is good: TigerLake
> > > > > > experiences a noticeable IPC regression compared to IceLake. The memory subsystem is unable
> > > > > > to keep up with the higher clocks, and the reworked cache is not enough.
> > > > > >
> > > > >
> > > > > I just want to add on that sentence as that's not what I wrote
> > > > > in the piece: I don't think the memory subsystem is to blame.
> > > > >
> > > > > It's significantly stronger than ICL and showcases *much* better DRAM latency and significant
> > > > > single core bandwidth uplift. 429.mcf showcases great scaling well beyond clocks, showing
> > > > > that latency for example is not to blame. In my opinion it's a regression *because* of the
> > > > > reworked cache, as essentially the L3 is now 20% slower per clock versus ICL.
> > > >
> > > > You mean L3 latency, right? It might be a part of it, but the regression in libquantum
> > > > and lbm are too large to be explained by this few cycle change, I think. You'd pretty much
> > > > have to write a dedicated L3 latency test to get that big of a drop and IIRC neither of
> > > > those are known to be very dependent on L3 latency (they are more bandwidth heavy).
> > > >
> > > > So I think there's something else more interesting going on there.
> > > >
> > > >
> > >
> > > TGL uncore appears to be inspired SKX, except, hopefully, better latency of LLC misses under light load.
> > > So, may be, it suffers from similarly low single-core bandwidth?
> > >
> >
> > Well Andrei has some detailed bandwidth benchmarks on this page and performance looks
> > better across the board: there's actually a significant bump in L3 and RAM regions.
> >
>
> Yes, it's better than ICL.
> But probably quite a lot worse than desktop SKL. Out of memory ( :-) ), my E-2176G achieves 33-35 GB/s on long
> sequential reads, supposedly similar to Andrei's Vec128 LD test. If I am not mistaken, even i7-6920HQ with DDR4-2133
> that I was playing with couple of years ago, was capable to do 30 GB/s. From raw bandwidth perspective LPDDR4X-4266
> in TGL rig should be equal to DDR4-2133, right? But the end result is somehow 1.5x lower.
> I have no idea what "flip" tests do, so can't compare.
>
> > So I feel like it has to be something more complicated than just worse peak BW: maybe a different
> > way of splitting power between core, uncore and memory? Paul Alcorn from Tomshardware suggested
> > that memory frequency itself can be varied on this part, not sure if that's correct. I don't
> > think any previous Intel part had frequency scaling for the memory bus?
> >
>
>

The flip test is a memory copy test that sits inside a fixed memory region, moving cachelines from one end to the other end, essentially flipping the memory region around on a cacheline block basis.

It's basically the same bandwidth as a traditional memory copy, just with different locality in virtual memory.
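To make the access pattern concrete, here is a minimal Python sketch of the flip idea described above. It only illustrates which blocks move where; it is not the actual benchmark code, which would be native and vectorised. The 64-byte line size, the `flip_region` name and the swap-based implementation are my assumptions for illustration.

```python
CACHELINE = 64  # bytes; assumed cacheline size


def flip_region(buf: bytearray, line: int = CACHELINE) -> None:
    """Reverse the order of line-sized blocks in buf, in place.

    Bytes inside each block keep their order; only the blocks trade
    places (first with last, second with second-to-last, and so on).
    """
    if len(buf) % line != 0:
        raise ValueError("region size must be a multiple of the line size")
    lo = 0
    hi = len(buf) // line - 1
    while lo < hi:
        a = slice(lo * line, (lo + 1) * line)
        b = slice(hi * line, (hi + 1) * line)
        # Slicing a bytearray copies, so this swap is safe.
        buf[a], buf[b] = buf[b], buf[a]
        lo += 1
        hi -= 1


region = bytearray(range(256))  # 4 blocks of 64 bytes
flip_region(region)
```

Every block is read once and written once, the same traffic as a plain copy of the region, but source and destination live in the same fixed range of addresses.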

---

I did some more characterisation via performance counters on a 9900K to see where the stress points are. Essentially, the Willow Cove improvements and regressions follow this pattern:

- If the workload has a high HPKI (hits per kilo-instruction) for loads and stores in the L3 but a low MPKI (misses per kilo-instruction), then it sees a large performance improvement from the much bigger L2 cache, because it previously had a very high L2 miss rate.

xalanc and astar follow this behaviour, with high L3 hits but very high L2 misses.

- If the workload has both a high HPKI and a high MPKI for L3 loads and stores, with a large percentage of misses relative to hits, then it is among the biggest losers for Willow Cove.

https://pbs.twimg.com/media/EiIBUUHWsAMH5Dl?format=png&name=orig

This is essentially all the red workloads.

- The only exception to the above seems to be workloads that are primarily DRAM latency limited and have extremely high memory stall cycles. MCF and omnetpp fit this characterisation, and on my 9900K they show 55.3% and 61.1% stall cycles respectively.

These workloads seem to have very low MLP (memory-level parallelism) and behave more like pointer chasers; here Tiger Lake's much better DRAM latency counteracts any slowdown from the L3.
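The three rules above can be written down as a rough classifier. This is only a sketch of the reasoning, not the method used for the measurements: the `L3Profile` fields, the function names and all three thresholds are hypothetical values picked for illustration, and the counter values in the usage lines are made up.

```python
from dataclasses import dataclass


@dataclass
class L3Profile:
    instructions: int            # retired instructions
    l3_hits: int                 # loads + stores that hit the L3
    l3_misses: int               # loads + stores that miss the L3
    stall_cycle_fraction: float  # memory stall cycles / total cycles


def per_kilo_instruction(events: int, instructions: int) -> float:
    """Events per 1000 retired instructions (the 'PKI' in HPKI/MPKI)."""
    return 1000.0 * events / instructions


def classify(p: L3Profile,
             high_pki: float = 5.0,         # assumed cut-off, not from the post
             high_miss_ratio: float = 0.5,  # assumed cut-off, not from the post
             high_stall: float = 0.5) -> str:  # assumed cut-off
    hpki = per_kilo_instruction(p.l3_hits, p.instructions)
    mpki = per_kilo_instruction(p.l3_misses, p.instructions)
    accesses = p.l3_hits + p.l3_misses
    miss_ratio = p.l3_misses / accesses if accesses else 0.0

    if hpki >= high_pki and mpki < high_pki:
        return "gain"  # L3-resident working set, helped by the bigger L2
    if hpki >= high_pki and mpki >= high_pki and miss_ratio >= high_miss_ratio:
        if p.stall_cycle_fraction >= high_stall:
            return "latency-bound exception"  # better DRAM latency offsets the L3
        return "loss"
    return "unclassified"


# Made-up counter values, one per rule.
print(classify(L3Profile(1_000_000, 20_000, 1_000, 0.10)))
print(classify(L3Profile(1_000_000, 10_000, 15_000, 0.20)))
print(classify(L3Profile(1_000_000, 10_000, 15_000, 0.60)))
```

The stall check is nested inside the high-miss branch because the exception only applies to workloads that would otherwise land among the losers.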