Are you kidding?

By: anon (anon.delete@this.anon.com), October 7, 2015 6:21 pm
Room: Moderated Discussions
juanrga (nospam.delete@this.juanrga.com) on October 7, 2015 11:46 am wrote:
> Linus Torvalds (torvalds.delete@this.linux-foundation.org) on October 7, 2015 5:49 am wrote:
> > David Kanter (dkanter.delete@this.realworldtech.com) on October 5, 2015 12:24 pm wrote:
> > >
> > > With Skylake, it's a little complex - I can see an argument for 6-wide (assuming
> > > hit in the uop cache) or 5-wide. Both of those are sustainable, although in the
> > > case of I$ hits, I'm not sure there is really enough fetch bandwidth.
> >
> > I'd be very surprised if Skylake is "5-wide decode" in the sense
> > that most people think of it and it seems to be talked about here.
> >
> > In fact, Intel doesn't even say it's 5-wide. Intel says it's "5 uops max". It's possible
> > that that never means "five instructions" - it might be that you only get five uops
> > when one or more of the decoders end up decoding an instruction into multiple uops.
> >
>
> Haswell decoder can decode up to 16 bytes per cycle of complex instructions into up to 4 fused-uops,
> which are supplied down the pipeline. Since fused-uops are split into uops (Haswell can execute
> up to 8 uops per cycle), the decoder peak is higher than 4 when we are counting uops.

You don't measure decoder width in uops though. You wouldn't say Haswell has a 4-wide decoder for complex instructions, for example. It has a single decoder for those which can emit up to 4 uops per cycle. And on the other hand, you could say Haswell has a 5 wide decoder because of instruction fusion, even if those peak cases still can only result in 4 uops per cycle being emitted. So it's not tied to uops one way or the other.

Of course that's all academic, but still important to try keeping to existing terminology as much as possible.

>
> I am not following Skylake details because I am awaiting for a more detailed description of the
> arch. But "5-wide decode" for Skylake sounds that can supply up to 5 fused-uops per cycle.

In the Intel manual they were careful to say that it could emit 5 uops per cycle up from 4. They did not say "5-wide decode" (unless I missed it), which would mean something else.

< Previous Post in ThreadNext Post in Thread >
TopicPosted ByDate
Update to Intel Optimization ManualSHK2015/09/29 05:38 AM
  gather speedEric Bron2015/09/29 09:43 AM
    gather speedGabriele Svelto2015/09/29 12:00 PM
  Update to Intel Optimization ManualTim McCaffrey2015/09/29 11:18 AM
    Update to Intel Optimization ManualSHK2015/09/29 12:04 PM
      Update to Intel Optimization ManualAnon2015/09/29 02:23 PM
    Update to Intel Optimization Manualnone2015/09/29 10:31 PM
      Update to Intel Optimization ManualMichael S2015/09/30 04:24 AM
    Update to Intel Optimization ManualMichael S2015/09/30 04:30 AM
      Update to Intel Optimization ManualTim McCaffrey2015/09/30 10:01 AM
  5-6 wide core, why no mention from Intel?Wouter Tinus2015/09/30 02:14 PM
    5-6 wide core, why no mention from Intel?Maynard Handley2015/09/30 03:30 PM
      5-6 wide core, why no mention from Intel?Alberto2015/10/01 12:13 AM
        5-6 wide core, why no mention from Intel?anon2015/10/01 02:21 AM
          5-6 wide core, why no mention from Intel?Alberto2015/10/01 04:41 AM
            5-6 wide core, why no mention from Intel?anon2015/10/01 05:27 AM
              5-6 wide core, why no mention from Intel?Alberto2015/10/01 08:33 AM
                5-6 wide core, why no mention from Intel?juanrga2015/10/01 10:24 AM
        5-6 wide core, why no mention from Intel?Maynard Handley2015/10/01 08:57 AM
    5-6 wide core, why no mention from Intel?juanrga2015/10/01 03:59 AM
      5-6 wide core, why no mention from Intel?Wouter Tinus2015/10/01 02:48 PM
        5-6 wide core, why no mention from Intel?juanrga2015/10/03 03:17 AM
          5-6 wide core, why no mention from Intel?Wouter Tinus2015/10/03 11:19 AM
            Are you kidding? (NT)juanrga2015/10/04 05:30 AM
              Are you kidding?Wouter Tinus2015/10/04 03:18 PM
                Are you kidding?juanrga2015/10/05 09:46 AM
                  Are you kidding?David Kanter2015/10/05 11:24 AM
                    Are you kidding?anon2015/10/05 09:26 PM
                    Are you kidding?Linus Torvalds2015/10/07 04:49 AM
                      Are you kidding?juanrga2015/10/07 10:46 AM
                        Are you kidding?anon2015/10/07 06:21 PM
                  Are you kidding?Wouter Tinus2015/10/05 01:25 PM
                    Are you kidding?juanrga2015/10/06 10:17 AM
                      Are you kidding?Stubabe2015/10/07 12:17 AM
                        Are you kidding?juanrga2015/10/07 10:56 AM
                          Amazing...Wouter Tinus2015/10/07 11:31 AM
                            Amazing...juanrga2015/10/07 03:45 PM
                          Are you kidding?Stubabe2015/10/07 11:57 AM
                            Are you kidding?juanrga2015/10/07 03:59 PM
                          Are you kidding?Wilco2015/10/07 02:07 PM
                            Are you kidding?juanrga2015/10/07 04:33 PM
      5-6 wide core, why no mention from Intel?Eric Bron2015/10/04 04:18 AM
    5-6 wide core, why no mention from Intel?David Kanter2015/10/01 09:01 AM
      Optimal number and kind of execution unitsjuanrga2015/10/01 10:50 AM
        Optimal number and kind of execution unitsPatrick Chase2015/10/01 04:38 PM
          Optimal number and kind of execution unitsI.S.T.2015/10/01 05:10 PM
            Optimal number and kind of execution unitsPatrick Chase2015/10/01 11:39 PM
          Optimal number and kind of execution unitsExophase2015/10/01 10:11 PM
          Optimal number and kind of execution unitsjuanrga2015/10/02 05:14 AM
      LD/ST unitsSHK2015/10/01 11:11 AM
        LD/ST unitsDavid Kanter2015/10/01 12:54 PM
          LD/ST unitsSHK2015/10/02 04:55 AM
            LD/ST unitsJukka Larja2015/10/02 09:49 PM
        LD/ST unitsMaynard Handley2015/10/01 01:01 PM
          LD/ST unitsanon2015/10/01 09:54 PM
      5-6 wide core, why no mention from Intel?Maynard Handley2015/10/01 12:57 PM
        5-6 wide core, why no mention from Intel?David Kanter2015/10/01 03:49 PM
          5-6 wide core, why no mention from Intel?Maynard Handley2015/10/01 06:21 PM
          5-6 wide core, why no mention from Intel?Exophase2015/10/01 10:07 PM
            5-6 wide core, why no mention from Intel?Maynard Handley2015/10/02 12:10 AM
              5-6 wide core, why no mention from Intel?Megol2015/10/02 03:39 AM
                5-6 wide core, why no mention from Intel?Michael S2015/10/02 04:27 AM
                5-6 wide core, why no mention from Intel?Maynard Handley2015/10/02 09:37 AM
                  5-6 wide core, why no mention from Intel?noko2015/10/02 05:19 PM
              5-6 wide core, why no mention from Intel?Exophase2015/10/02 06:43 AM
                5-6 wide core, why no mention from Intel?Maynard Handley2015/10/02 09:45 AM
                  5-6 wide core, why no mention from Intel?Exophase2015/10/02 10:23 AM
          5-6 wide core, why no mention from Intel?Wilco2015/10/02 12:48 PM
            5-6 wide core, why no mention from Intel?Maynard Handley2015/10/02 01:25 PM
              5-6 wide core, why no mention from Intel?Wilco2015/10/02 02:26 PM
              5-6 wide core, why no mention from Intel?noko2015/10/02 05:45 PM
                5-6 wide core, why no mention from Intel?Maynard Handley2015/10/02 06:54 PM
            5-6 wide core, why no mention from Intel?David Kanter2015/10/02 01:59 PM
              5-6 wide core, why no mention from Intel?Wilco2015/10/02 02:59 PM
                5-6 wide core, why no mention from Intel?David Kanter2015/10/02 03:15 PM
                  5-6 wide core, why no mention from Intel?Wilco2015/10/02 04:06 PM
                    LDP/STP usage in AArch64 for 403.gccnone2015/10/03 01:04 AM
                      LDP/STP usage in AArch64 for 403.gccWilco2015/10/03 03:02 AM
                        LDP/STP usage in AArch64 for 403.gccnone2015/10/03 03:11 AM
                          LDP/STP usage in AArch64 for 403.gccWilco2015/10/03 03:37 AM
                            LDP/STP usage in AArch64 for 403.gccnone2015/10/03 04:37 AM
                              LDP/STP usage in AArch64 for 403.gccWilco2015/10/03 05:26 AM
                  5-6 wide core, why no mention from Intel?Maynard Handley2015/10/02 04:24 PM
              5-6 wide core, why no mention from Intel?Maynard Handley2015/10/02 03:07 PM
  Update to Intel Optimization Manualanon2015/09/30 04:43 PM
  Update to Intel Optimization ManualPatrick Chase2015/09/30 09:44 PM
    Update to Intel Optimization Manualanon2015/09/30 10:49 PM
    Update to Intel Optimization Manualnone2015/09/30 10:50 PM
    Update to Intel Optimization ManualDavid Kanter2015/10/01 12:52 PM
      Update to Intel Optimization ManualPatrick Chase2015/10/01 04:16 PM
        Update to Intel Optimization Manualanon2015/10/01 10:45 PM
Reply to this Topic
Name:
Email:
Topic:
Body: No Text
How do you spell avocado?