Alternatives Implementations

By: Travis Downs, July 13, 2020 8:41 pm
Room: Moderated Discussions
Kyle Siefring on July 13, 2020 6:02 pm wrote:
> We all have got to share the expense for those masks. My guess is that AVX-512 will become cheaper
> as time goes on. That being said, maybe there should be a little less sharing. This is already
> happening with AMD being competitive. You could make the case for gpus being part of this.
> It would be interesting to see a cpu with low latencies for the 128-bit path with
> higher latencies for 256-bit and 512-bit. Same throughput, different latencies.
> On a smaller core, this could be like knights landing with fewer threads.
> For reference.
> format: reciprocal throughput/latency
> knightslanding addps .5/6 mulps .5/6
> haswell addps 1/3 mulps .5/5
> broadwell addps 1/3 mulps .5/3
> skylake addps .5/4 mulps .5/4
> You can see that skylake regressed latencies compared to broadwell. Intel
> clearly didn't do this for giggles. These latencies aren't free.

Well, you left out FMA, which went from 5 to 4, so I'd say it's a wash: now all the latencies for the core FP ops are 4, rather than a 3|5 split. If you had to pick, FMA latency is probably more important than add + mul latency, as most code optimized enough to care about 1 cycle is probably using FMAs.
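To put rough numbers on that, here is a minimal sketch of the arithmetic, assuming a loop-carried dependency chain of the form acc = acc * x + y. The add and mul latencies are the ones quoted above; the FMA latencies are 5 for Broadwell and 4 for Skylake as stated in this post, with Haswell taken to be 5 as well. The function name `chain_latency` and the 100-step chain are just illustrative.

```python
# Latencies in cycles. add/mul come from the table quoted in this thread;
# fma is 5 -> 4 going from Broadwell to Skylake (Haswell assumed to match Broadwell).
LATENCY = {
    "haswell":   {"add": 3, "mul": 5, "fma": 5},
    "broadwell": {"add": 3, "mul": 3, "fma": 5},
    "skylake":   {"add": 4, "mul": 4, "fma": 4},
}


def chain_latency(uarch, steps, use_fma):
    """Cycles to retire `steps` dependent multiply-add steps (acc = acc * x + y).

    Models latency only: each step waits for the previous one, so throughput
    and out-of-order overlap with other work are ignored.
    """
    lat = LATENCY[uarch]
    # Without FMA, each step is a mul followed by a dependent add.
    per_step = lat["fma"] if use_fma else lat["mul"] + lat["add"]
    return steps * per_step


for uarch in LATENCY:
    with_fma = chain_latency(uarch, 100, use_fma=True)
    without_fma = chain_latency(uarch, 100, use_fma=False)
    print(f"{uarch:10s} fma chain: {with_fma:4d}  mul+add chain: {without_fma:4d}")
```

Under this model, code that already uses FMA gets a shorter chain on Skylake, while code using a separate mul and add is slower than on Broadwell and no worse than on Haswell.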

There are a few cases where larger vectors have longer latency: load latency is 1 cycle higher for ymm and zmm vs xmm (this mostly reflects the layout of the vector lanes: the first 128-bit lane is closer to the load/store ports than the second). 512-bit FP ops that go to port 5 take an extra cycle (again, due to EU positioning, I believe). 128-bit or 256-bit FP ops never go to port 5, so they never get this extra latency.
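Those two width-dependent penalties can be written down as a tiny lookup. This is only a sketch of the rules as described in this post, not a model of the actual scheduler; the function name `extra_latency` and its parameters are made up for illustration, and it returns the extra cycles relative to the baseline case rather than absolute latencies.

```python
def extra_latency(width_bits, kind, port=None):
    """Extra cycles relative to the baseline case, per the rules described above.

    kind="load": baseline is a 128-bit (xmm) load.
    kind="fp":   baseline is the same FP op issued to a port other than port 5.
    """
    if width_bits not in (128, 256, 512):
        raise ValueError("width_bits must be 128, 256 or 512")
    if kind == "load":
        # ymm and zmm loads take one cycle more than xmm loads.
        return 1 if width_bits > 128 else 0
    if kind == "fp":
        # Only 512-bit FP ops can land on port 5, and those pay one extra cycle.
        if port == 5 and width_bits != 512:
            raise ValueError("128-bit and 256-bit FP ops never go to port 5")
        return 1 if port == 5 else 0
    raise ValueError("kind must be 'load' or 'fp'")


print(extra_latency(128, "load"))        # xmm load
print(extra_latency(512, "load"))        # zmm load
print(extra_latency(512, "fp", port=0))  # 512-bit FP op, not on port 5
print(extra_latency(512, "fp", port=5))  # 512-bit FP op on port 5
```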