Performance portability

Article: Introduction to OpenCL
By: Wainwright (ian.wainwright.delete@this.hpcsweden.se), December 11, 2010 3:44 pm
Room: Moderated Discussions

>For example, I've seen 3x performance slowdown using vectorized OpenCL code which
>assumes a 4-wide work-item, while running on Nvidia chips.

This does not make much sense to me, unless:
In the inital 1-wide implementation, you use say 512 threads per block.
In the 4-wide implementation, you still use 512 threads.
Doing this without changing anything else might very will cause issues with registers being spilled into GPU-RAM as not all registers would fit on-chip --> you get lower performance.

If you don't make each thread the same weight, or at least each thread-block the same weight, you might very well get lower perfromance. But that isn't due to the use of 4-wide work items.

By the way, 4-wide instructions are good for AMD GPUs as that helps the compiler find ILP so that it can use the 5-vliw units, correct? If that is the case, should not 4-wide elements helt the GF104 GPUs find some ILP so that it can easier utilize the 3-warps 2-schedulers architecture?

Wainwright
< Previous Post in ThreadNext Post in Thread >
TopicPosted ByDate
OpenCL article onlineDavid Kanter2010/12/09 02:44 AM
  OpenCL article onlineXN2010/12/09 06:33 AM
    OpenCL article onlineDavid Kanter2010/12/09 01:54 PM
  OpenCL article onlineanon2010/12/09 02:33 PM
    OpenCL article onlineDavid Kanter2010/12/09 02:38 PM
      OpenCL article onlineIan Ameline2010/12/09 03:47 PM
      OpenCL article onlineAnon2010/12/09 08:27 PM
        OpenCL article onlineDavid Kanter2010/12/09 10:58 PM
  Performance portabilityBryan Catanzaro2010/12/10 12:43 PM
    Performance portabilityltcommander.data2010/12/10 07:11 PM
      It is difficult to runtime optimize away the difference between a CPU and GPUMark Roulo2010/12/10 07:50 PM
        It is difficult to runtime optimize away the difference between a CPU and GPUhobold2010/12/11 03:35 AM
          It is difficult to runtime optimize away the difference between a CPU and GPUMark Roulo2010/12/12 01:20 PM
            It is difficult to runtime optimize away the difference between a CPU and GPUhobold2010/12/12 03:31 PM
              It is difficult to runtime optimize away the difference between a CPU and GPUanon2010/12/12 04:24 PM
                It is difficult to runtime optimize away the difference between a CPU and GPUhobold2010/12/13 03:44 AM
        Specially when the language provides almost no hardware abstraction (NT)EduardoS2010/12/11 10:53 AM
    Performance portabilityWainwright2010/12/11 03:44 PM
      Performance portabilityEduardoS2010/12/11 03:57 PM
        Performance portabilityWainwright2010/12/11 04:02 PM
          Performance portabilityEduardoS2010/12/11 08:20 PM
            Performance portabilityWainwright2010/12/12 02:22 AM
      Performance portabilityDavid Kanter2010/12/11 05:53 PM
        Performance portabilityEduardoS2010/12/11 08:23 PM
          Performance portabilityDavid Kanter2010/12/11 09:06 PM
            Performance portabilityWainwright2010/12/12 02:26 AM
            Performance portabilityEduardoS2010/12/12 09:04 AM
  OpenCL article onlineAlan Commike2010/12/14 01:01 PM
  OpenCL - why are there any pointers at all?Rob Thorpe2010/12/16 03:45 AM
    OpenCL - why are there any pointers at all?EduardoS2010/12/16 01:51 PM
      OpenCL - why are there any pointers at all?Rob Thorpe2010/12/17 03:19 AM
        OpenCL - why are there any pointers at all?Richard Cownie2010/12/17 07:02 AM
          OpenCL - why are there any pointers at all?Rob Thorpe2010/12/17 08:29 AM
            OpenCL - why are there any pointers at all?Richard Cownie2010/12/17 09:13 AM
              OpenCL - why are there any pointers at all?Rob Thorpe2010/12/17 10:03 AM
                OpenCL - why are there any pointers at all?Richard Cownie2010/12/17 10:53 AM
                  OpenCL - why are there any pointers at all?Rob Thorpe2010/12/17 11:19 AM
                    OpenCL - why are there any pointers at all?Richard Cownie2010/12/17 11:51 AM
                OpenCL - why are there any pointers at all?hobold2010/12/17 11:06 AM
          OpenCL - why are there any pointers at all?EduardoS2010/12/18 07:58 AM
            OpenCL - why are there any pointers at all?anon2010/12/18 10:27 AM
            OpenCL - why are there any pointers at all?BorisG2010/12/18 10:33 AM
              OpenCL - why are there any pointers at all?Richard Cownie2010/12/18 02:39 PM
  OpenCL article onlineEmil Briggs2010/12/19 06:40 AM
Reply to this Topic
Name:
Email:
Topic:
Body: No Text
How do you spell green?