Alternatives Implementations

By: Maynard Handley, July 13, 2020 10:37 pm
Room: Moderated Discussions
Kyle Siefring on July 13, 2020 6:02 pm wrote:
> David Kanter on July 12, 2020 6:13 pm wrote:
> > I did some analysis a while back that might be useful to share here.
> >
> > 8.0 mm² SKL core
> > 0.9 mm² AVX512
> > 2.0 mm² 1MB L2$
> > 2.4 mm² 1.375MB L3$
> > 0.4 mm² Snoop filter
> > 2.0 mm² caching and home agent
> > 2.2 mm² FIVR, PLLs
> > 0.4 mm² Mystery block
> >
> > 18.0 mm² Total SKL-SP tile
> >
> > So AVX512 is about 5% of the tile area, and the tiles are 72% of the total area of SKL-SP.
> >
> > If you removed AVX512 you'd save 28 mm² for the whole chip, which would let you add at most 2 tiles.
> >
> > In that vein, it seems like a pretty reasonable trade-off.
> >
> > David
> This mask is your mask, this mask is my mask, from databases to dsp, this mask belongs to you and me...
> We all have got to share the expense for those masks. My guess is that AVX-512 will become cheaper
> as time goes on. That being said, maybe there should be a little less sharing. This is already
> happening with AMD being competitive. You could make the case for GPUs being part of this.

I used to think this (mask costs) was a big deal. I’m no longer convinced.
A 5nm mask set supposedly costs $15M. For comparison, Apple sells ~20M Macs per year, so a single mask set amortized over one year of Macs comes to under a dollar per machine.
In other words, you don’t need outrageous volumes before the mask set costs are lost in the noise.
There may be other costs (e.g. design, verification) that are much higher for each subsequent substantially different design. But blaming masks is, I think, premature.

Corrections welcome if someone disagrees with the numbers.
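
As a rough check on that arithmetic, here is a minimal Python sketch. It assumes one mask set is amortized over a single year of sales and that each Mac uses exactly one chip from that mask set; both are simplifying assumptions, not claims about how Apple actually accounts for it. The function name `mask_cost_per_unit` and the lower-volume figures in the loop are hypothetical, chosen only for illustration.

```python
# Figures quoted in the post: ~$15M for a 5nm mask set, ~20M Macs per year.
MASK_SET_COST_USD = 15_000_000
MACS_PER_YEAR = 20_000_000


def mask_cost_per_unit(mask_cost_usd: float, units: int) -> float:
    """Mask-set cost spread evenly over `units` chips (assumed amortization model)."""
    if units <= 0:
        raise ValueError("units must be positive")
    return mask_cost_usd / units


per_mac = mask_cost_per_unit(MASK_SET_COST_USD, MACS_PER_YEAR)
print(f"${per_mac:.2f} per unit at 20M units")  # $0.75 per unit at 20M units

# Hypothetical lower volumes, to see where the mask set starts to matter.
for units in (1_000_000, 5_000_000):
    cost = mask_cost_per_unit(MASK_SET_COST_USD, units)
    print(f"${cost:.2f} per unit at {units:,} units")
```

Even at a twentieth of that volume the mask set adds about $15 per chip under this model, which is why design and verification look like the more plausible place to hunt for the real cost.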

> It would be interesting to see a CPU with low latencies for the 128-bit path and
> higher latencies for 256-bit and 512-bit. Same throughput, different latencies.
> On a smaller core, this could be like Knights Landing with fewer threads.
> For reference (format: reciprocal throughput/latency, in cycles):
> Knights Landing  addps 0.5/6  mulps 0.5/6
> Haswell          addps 1/3    mulps 0.5/5
> Broadwell        addps 1/3    mulps 0.5/3
> Skylake          addps 0.5/4  mulps 0.5/4
> You can see that Skylake regressed latencies compared to Broadwell. Intel
> clearly didn't do this for giggles. These latencies aren't free.
> I work with video codecs. Higher latency would hurt for smaller vectors, but if you
> are working with larger vectors you tend to have more rows to work with. Latency might
> hurt a bit with maddwd and maddubs since they are used for horizontal adds and subtracts.
> Personally, I think Intel should add drop-in replacements for those.
> I have no idea how much actually supporting wide registers costs
> and whether that cost can be reduced with a slower implementation.
> As things slow down, another option could be to alternate what each generation is good
> at. I don't think this is possible for now with the constant waves of vulnerabilities.
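
One way to read the quoted table: to keep a pipelined unit busy, code needs roughly latency divided by reciprocal throughput independent operations in flight. The sketch below applies that rule of thumb to the figures quoted above. The function name `chains_to_saturate` is hypothetical, and the model deliberately ignores everything else (register pressure, memory, port contention).

```python
import math

# (reciprocal throughput, latency) in cycles, copied from the quoted table.
TABLE = {
    "Knights Landing": {"addps": (0.5, 6), "mulps": (0.5, 6)},
    "Haswell": {"addps": (1, 3), "mulps": (0.5, 5)},
    "Broadwell": {"addps": (1, 3), "mulps": (0.5, 3)},
    "Skylake": {"addps": (0.5, 4), "mulps": (0.5, 4)},
}


def chains_to_saturate(recip_throughput: float, latency: float) -> int:
    """Independent dependency chains needed to hide `latency` at full issue rate."""
    return math.ceil(latency / recip_throughput)


needed = {
    core: {op: chains_to_saturate(*figures) for op, figures in ops.items()}
    for core, ops in TABLE.items()
}

for core, ops in needed.items():
    print(f"{core}: addps needs {ops['addps']} chains, mulps needs {ops['mulps']}")
```

On these numbers Skylake needs 8 independent addps chains to saturate, against 3 on Broadwell. That is one way to see why code with "more rows to work with" would tolerate the extra latency better than code working on a few small vectors.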
