Probable bottleneck

Article: Intel's Haswell CPU Microarchitecture
By: Stubabe (nospam.delete@this.nospam.com), November 25, 2012 3:08 am
Room: Moderated Discussions
Laurent Birtz (seerdecker.delete@this.yahoo.com.au) on November 23, 2012 1:45 pm wrote:
> Do you guys know if the 4 uops frontend/retirement limit is the probable bottleneck
> in Haswell for a typical integer work load, e.g. video encoding?
>
> The way I see it, up to four simple arithmetic operations can be done concurrently (assuming 1 cycle per uop),
> and there can be concurrent load/stores, for a total of 6 uops/cycle processed in the execution units. For
> Nehalem, I used to consider the "mov" operations as mostly free since they could be done in another port while
> the two SSE ALUs were busy. Am I correct in believing that this assumption no longer holds true?
>

There are only 4x Integer ALU for scalar (GPR) code, for SSE/AVX it's still only 3x issue. Also I am not sure if that's 3 SSE ALU or its still split into 2xALU + 1x Shift/Mul/Logical as in current designs - one IDF slide suggested it is still 2xALU another 3x...

> My reasoning is that although Haswell processes the "mov" uops in the renamer, those uops still need
> to be issued from the frontend (up to 4 per cycle), so every "mov" instruction has the potential
> of starving an execution unit of uops to execute. If that analysis is correct, then the "mov" instructions
> should generally be avoided in favor of 3-operands instructions and memory operands.
>
> Thanks for any insight.
>

I would say that there is the possibility of this in current designs where you are Fetch bandwidth limited and your code does not use the uop cache (or it's not present), not to mention the reduction of 1 clock latency. In some of my own code 3 operand Integer AVX is faster than SSE2/3/4 with MOVs.
< Previous Post in ThreadNext Post in Thread >
TopicPosted ByDate
Haswell CPU article onlineDavid Kanter11/13/12 03:43 PM
  Haswell CPU article onlineEric11/13/12 04:10 PM
    Haswell CPU article onlinehobold11/13/12 05:13 PM
      Haswell CPU article onlineRicardo B11/13/12 06:09 PM
    Haswell CPU article onlineanonymou511/13/12 05:44 PM
      Haswell CPU article onlinenone11/14/12 03:40 AM
  Haswell CPU article onlinetarlinian11/13/12 04:56 PM
    Fixed (NT)David Kanter11/13/12 06:06 PM
      Haswell CPU article onlineJacob Marley11/14/12 02:18 AM
  Haswell CPU article onlinerandomshinichi11/14/12 02:53 AM
    LLC == Last Level Cache (usually L3) (NT)Paul A. Clayton11/14/12 05:50 AM
    Haswell CPU article onlineJoe11/14/12 10:38 AM
      LLC vs. L3 vs. L4David Kanter11/14/12 11:09 AM
        LLC vs. L3 vs. L4; LLC = Link Layer ControllerRay11/14/12 10:08 PM
          A pit there are only 17000 TLAs... (NT)EduardoS11/15/12 03:14 AM
  Haswell CPU article onlineanon11/14/12 05:10 AM
    Move elimination can be a µop fusionPaul A. Clayton11/14/12 06:41 AM
      That should be "mov R10 <- R9"! (NT)Paul A. Clayton11/14/12 06:43 AM
      Move elimination can be a µop fusionanon11/14/12 07:25 AM
        It does avoid the scheduler (NT)Paul A. Clayton11/14/12 08:47 AM
      Move elimination can be a µop fusionStubabe11/14/12 01:43 PM
        Move elimination can be a µop fusionanon11/14/12 09:33 PM
          Move elimination can be a µop fusionFelid11/15/12 12:49 AM
            Move elimination can be a µop fusionanon11/15/12 01:23 AM
              Move elimination can be a µop fusionStuart11/15/12 05:04 AM
                Move elimination can be a µop fusionStubabe11/15/12 05:14 AM
                  Move elimination can be a µop fusionanon11/15/12 05:48 AM
                    Move elimination can be a µop fusionEduardoS11/15/12 06:00 AM
                      Move elimination can be a µop fusionanon11/15/12 06:14 AM
                        Move elimination can be a µop fusionEduardoS11/15/12 06:21 AM
                          Move elimination can be a µop fusionanon11/15/12 06:31 AM
                    Move elimination can be a µop fusionStubabe11/15/12 11:38 AM
                      There can be only one dependencePaul A. Clayton11/15/12 12:50 PM
                    Move elimination can be a µop fusionFelid11/15/12 03:19 PM
                      Move elimination can be a µop fusionanon11/16/12 04:07 AM
                        Move elimination can be a µop fusionFelid11/16/12 07:43 PM
                  Move elimination can be a µop fusionFelid11/15/12 02:50 PM
                    Move elimination can be a µop fusionFelid11/15/12 03:03 PM
                      Correction!Felid11/19/12 01:23 AM
                    Thanks, I wasn't aware of the change in SB. Good to know... (NT)Stubabe11/15/12 03:43 PM
            Move fusion assumes adjacencyPaul A. Clayton11/15/12 07:15 AM
              Move fusion assumes adjacencyFelid11/15/12 02:40 PM
        Move elimination can be a µop fusionPatrick Chase11/21/12 11:52 AM
          Move elimination can be a µop fusionPatrick Chase11/21/12 12:12 PM
    Haswell CPU article onlineRicardo B11/14/12 09:12 AM
  Haswell CPU article onlinegmb11/14/12 08:28 AM
  Haswell CPU article onlineFelid11/14/12 11:58 PM
    Haswell CPU article onlineDavid Kanter11/15/12 09:59 AM
      Haswell CPU article onlineFelid11/15/12 02:15 PM
        Instruction queueDavid Kanter11/16/12 12:23 PM
          Instruction queueFelid11/16/12 01:05 PM
  128-bit division unit?Eric Bron11/16/12 04:57 AM
    128-bit division unit?David Kanter11/16/12 08:59 AM
      128-bit division unit?Eric Bron11/16/12 09:47 AM
        128-bit division unit?Felid11/16/12 12:46 PM
          128-bit division unit?Eric Bron11/16/12 01:24 PM
            128-bit division unit?Felid11/16/12 07:19 PM
              128-bit division unit?Eric Bron11/18/12 08:41 AM
            128-bit division unit?Michael S11/17/12 12:50 PM
              128-bit division unit?Felid11/17/12 01:44 PM
                128-bit division unit?Michael S11/17/12 02:45 PM
                  128-bit division unit?Felid11/17/12 05:49 PM
                    128-bit division unit?Michael S11/17/12 06:56 PM
              128-bit division unit?Eric Bron11/18/12 08:35 AM
  Haswell CPU article onlineJim F11/18/12 09:45 AM
    Haswell CPU article onlineGabriele Svelto11/18/12 12:52 PM
  Probable bottleneckLaurent Birtz11/23/12 01:45 PM
    Probable bottleneckEduardoS11/23/12 01:58 PM
      Probable bottleneckLaurent Birtz11/24/12 10:10 AM
    Probable bottleneckStubabe11/25/12 03:08 AM
      Probable bottleneckEduardoS11/25/12 08:15 AM
        Probable bottleneckStubabe11/28/12 04:36 PM
          Urgh. Post got mangled by LESS THAN signStubabe11/28/12 04:41 PM
          Probable bottleneckLaurent Birtz11/29/12 08:34 AM
  Haswell CPU article onlineMr. Camel11/28/12 03:47 PM
    Haswell CPU article onlineEduardoS11/28/12 04:06 PM
      Haswell CPU article onlineMr. Camel11/28/12 07:23 PM
        Haswell CPU article onlineEduardoS11/28/12 07:27 PM
          Haswell CPU article onlineMr. Camel12/12/12 01:39 PM
            Much faster iGPU clock ...Mark Roulo12/12/12 03:53 PM
              Much faster iGPU clock ...Exophase12/12/12 11:46 PM
                Much faster iGPU clock ... or not :-)Mark Roulo12/13/12 09:11 AM
                  Much faster iGPU clock ... or not :-)EduardoS12/13/12 10:38 PM
                    Much faster iGPU clock ... or not :-)Michael S12/14/12 05:33 AM
                      Much faster iGPU clock ... or not :-)EduardoS12/14/12 07:06 AM
                        Much faster iGPU clock ... or not :-)Doug S12/14/12 12:13 PM
                          Much faster iGPU clock ... or not :-)EduardoS12/14/12 12:43 PM
                  Much faster iGPU clock ... or not :-)Mr. Camel12/14/12 10:50 AM
              Much faster iGPU clock ...Michael S12/13/12 02:44 AM
                Much faster iGPU clock ...Mark Roulo12/13/12 09:09 AM
  Haswell CPU article onlineYang12/09/12 08:28 PM
    possible spam bot? (NT)I.S.T.12/10/12 03:40 PM
  CPU Crystal Well behavior w/ eGPU?Robert Williams04/17/13 02:16 PM
    CPU Crystal Well behavior w/ eGPU?Nicolas Capens04/17/13 03:30 PM
      CPU Crystal Well behavior w/ eGPU?RecessionCone04/17/13 04:20 PM
        CPU Crystal Well behavior w/ eGPU?Robert Williams04/17/13 07:37 PM
    CPU Crystal Well behavior w/ eGPU?Eric Bron04/17/13 09:10 PM
Reply to this Topic
Name:
Email:
Topic:
Body: No Text
How do you spell blue?