AMD Compute and Texture Fetch

Article: Predicting AMD and Nvidia GPU Performance
By: Vincent Diepeveen (diep.delete@this.xs4all.nl), April 22, 2011 2:14 am
Room: Moderated Discussions
David Kanter (dkanter@realworldtech.com) on 4/21/11 wrote:
---------------------------
>EduardoS (no@spam.com) on 4/20/11 wrote:
>---------------------------
>>Heikki Kultala (hkultala@iki.NOSPAM.fi) on 4/20/11 wrote:
>>---------------------------
>>>Texture fetches consume the execution slots of the instruction word, ALU operations
>>>cannot be started at same cycle.
>>
>>No they don't, they are even on different clauses, the only memory-like operation
>>that consumes ALU slots is the LDS load/store, starting with Evergreen, on R700 it was executed on TMUs too.
>
>My understanding (http://www.realworldtech.com/page.cfm?ArticleID=RWT121410213827&p=7)
>is that on Cayman and Cypress, the ALUs are used for address calculations. So you
>cannot simultaneously execute ALU clauses and initiate a texture fetch. However,
>initiating a texture fetch is fairly quick - most of the time is spent waiting for
>data. While you are waiting for data, the ALUs are free for independent computations.
>
>David

Texture memory is not the advised way to get things done, as texture memory is not that fast (though it is faster than main memory).

In the first place you need to split your software into wavefronts that realistically compute only within the compute units and don't use any resources outside of them.

Please note your article is one of the few on the internet that describes the Cayman architecture a bit.

The interesting thing to know now, obviously, is when AMD will have improved the OpenCL compiler enough to support this new architecture well.

Nvidia also seems to struggle to support OpenCL well. This is where OpenCL really seems like an interesting thing.

As for AMD GPUs, OpenCL is the only thing AMD will keep supporting for them, so there is not really a choice there.

For now OpenCL doesn't really seem like the perfect language yet. A big drawback for some will be that the current OpenCL 1.1 spec only guarantees a maximum object size of 25% of the RAM, which implies a single allocation can use only 25% of it, whereas several applications will of course want everything out of this tiny amount of RAM.
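To put numbers on that limit, here is a minimal sketch in plain C (no OpenCL calls). It assumes, as far as I read the 1.1 spec, that the smallest value a device may report for CL_DEVICE_MAX_MEM_ALLOC_SIZE is the larger of a quarter of CL_DEVICE_GLOBAL_MEM_SIZE and 128 MB; a given driver may report more. The names min_guaranteed_alloc and buffers_needed are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MIB (1024ull * 1024ull)

/* Smallest per-object limit a device may report under the assumption above:
   max(global memory / 4, 128 MiB). A real device can report a larger value. */
uint64_t min_guaranteed_alloc(uint64_t global_mem_bytes)
{
    uint64_t quarter = global_mem_bytes / 4;
    uint64_t floor_bytes = 128ull * MIB;
    return quarter > floor_bytes ? quarter : floor_bytes;
}

/* Number of separate buffers needed to hold total_bytes when no single
   buffer may exceed max_alloc_bytes (rounds up). */
uint64_t buffers_needed(uint64_t total_bytes, uint64_t max_alloc_bytes)
{
    assert(max_alloc_bytes > 0);
    return total_bytes / max_alloc_bytes + (total_bytes % max_alloc_bytes != 0);
}
```

So on a card with 2 GB, an application that wants one 1.5 GB table has to assume 512 MB per object and split the table over 3 buffers, with all the extra index arithmetic that brings into the kernel.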

Yet it is a big step forward if you think about how the entire HPC world will be supporting OpenCL, and how the other manufacturers will in the end be forced to produce hardware that uses the manycore concept, as this is seemingly (to hardware laymen like me) the only concept that will give enough crunching power at a cheap price in the near future.

Of course the compiler quality will be very important then.

There is a lot to gain there, for example having logic in the compiler to recognize when the programmer is trying to use the actual carry, caused for example by an overflow when adding 2 (unsigned) integers.
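To make the carry case concrete, this is roughly the kind of source pattern I mean, sketched in plain C rather than OpenCL C. Unsigned addition wraps around, so the sum ends up smaller than an operand exactly when a carry occurred. Whether a given compiler turns this comparison into the hardware carry is exactly the open question; the function names add_with_carry and add_multiword are hypothetical and only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add a + b + carry_in (carry_in must be 0 or 1) and report the carry out
   of bit 31. Each wrapped addition is detected by comparing the result
   with one of its inputs. */
uint32_t add_with_carry(uint32_t a, uint32_t b, uint32_t carry_in,
                        uint32_t *carry_out)
{
    uint32_t partial = a + b;
    uint32_t c1 = partial < a;        /* 1 if a + b wrapped */
    uint32_t sum = partial + carry_in;
    uint32_t c2 = sum < partial;      /* 1 if adding carry_in wrapped */
    *carry_out = c1 | c2;             /* both cannot be 1 at once */
    return sum;
}

/* Multiword addition over n little-endian 32-bit limbs, the typical place
   where the carry is wanted. Returns the carry out of the top limb. */
uint32_t add_multiword(uint32_t *dst, const uint32_t *a, const uint32_t *b,
                       size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++)
        dst[i] = add_with_carry(a[i], b[i], carry, &carry);
    return carry;
}
```

A compiler that recognizes the comparison can emit one add-with-carry per limb; one that does not needs a few extra compare and select instructions per limb, which adds up quickly in big-number code.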

Vincent