Performance portability

Article: Introduction to OpenCL
By: Wainwright (ian.wainwright.delete@this.hpcsweden.se), December 11, 2010 2:44 pm
Room: Moderated Discussions

>For example, I've seen 3x performance slowdown using vectorized OpenCL code which
>assumes a 4-wide work-item, while running on Nvidia chips.

This does not make much sense to me, unless:
In the initial 1-wide implementation, you use, say, 512 threads per block.
In the 4-wide implementation, you still use 512 threads.
Doing this without changing anything else might very well cause registers to be spilled into GPU RAM, since not all of them would fit on-chip, and so you get lower performance.

If you don't give each thread the same weight, or at least each thread block the same weight, you might very well get lower performance. But that isn't due to the use of 4-wide work-items.
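
To put rough numbers on the register argument, here is a small sketch. Everything in it is an assumption made for illustration: the register file size, the 20 registers per thread for the 1-wide kernel, and the guess that the 4-wide kernel needs four times as many live registers. None of it is measured from a real kernel.

```python
# Illustrative numbers only; none of these come from a real kernel or profiler.
REGISTER_FILE_SIZE = 32 * 1024  # assumed number of 32-bit registers per multiprocessor


def registers_per_block(threads_per_block, registers_per_thread):
    """Registers one thread block would need if nothing were spilled."""
    return threads_per_block * registers_per_thread


def fits_on_chip(threads_per_block, registers_per_thread):
    """True if a single block's registers fit in the assumed register file."""
    needed = registers_per_block(threads_per_block, registers_per_thread)
    return needed <= REGISTER_FILE_SIZE


scalar_regs = 20               # hypothetical 1-wide kernel
vector_regs = 4 * scalar_regs  # hypothetical 4-wide kernel, assuming 4x the live values

print(fits_on_chip(512, scalar_regs))  # 1-wide, 512 threads per block
print(fits_on_chip(512, vector_regs))  # 4-wide, still 512 threads per block
print(fits_on_chip(128, vector_regs))  # 4-wide, block shrunk to keep the same weight
```

Under these assumptions the 4-wide kernel at 512 threads per block no longer fits, while cutting the block to 128 threads brings it back to the same register footprint as the 1-wide version. In practice the compiler would cap registers per thread and spill the rest, which is where the slowdown would come from.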

By the way, 4-wide instructions are good for AMD GPUs because they help the compiler find ILP so that it can fill the 5-wide VLIW units, correct? If that is the case, shouldn't 4-wide elements help the GF104 GPUs find some ILP, so that they can more easily utilize the 3-warps, 2-schedulers architecture?

Wainwright