By: David Kanter (dkanter.delete@this.realworldtech.com), April 22, 2011 10:53 am
Room: Moderated Discussions
Vincent Diepeveen (diep@xs4all.nl) on 4/22/11 wrote:
---------------------------
>David Kanter (dkanter@realworldtech.com) on 4/21/11 wrote:
>---------------------------
>>EduardoS (no@spam.com) on 4/20/11 wrote:
>>---------------------------
>>>Heikki Kultala (hkultala@iki.NOSPAM.fi) on 4/20/11 wrote:
>>>---------------------------
>>>>Texture fetches consume the execution slots of the instruction word; ALU operations
>>>>cannot be started in the same cycle.
>>>
>>>No they don't; they are even in different clauses. The only memory-like operation
>>>that consumes ALU slots is the LDS load/store, starting with Evergreen; on R700 it was executed on the TMUs too.
>>
>>My understanding (http://www.realworldtech.com/page.cfm?ArticleID=RWT121410213827&p=7)
>>is that on Cayman and Cypress, the ALUs are used for address calculations. So you
>>cannot simultaneously execute ALU clauses and initiate a texture fetch. However,
>>initiating a texture fetch is fairly quick - most of the time is spent waiting for
>>data. While you are waiting for data, the ALUs are free for independent computations.
>>
>>David
>
>Texture memory is not the advised method to get things done, as texture memory is
>not so fast (though faster than main memory).
Well, there aren't many other options besides the register file and local data store.
>In the first place you need to split up your software into wavefronts that realistically
>only calculate within the compute units and don't use any resources outside of them.
>
>Please note your article is one of the few on the internet which describes the Cayman architecture a bit.
Thank you for the compliment!
>The interesting thing to know now, obviously, is when AMD will have improved
>the OpenCL compiler enough to support this new architecture well.
>
>Nvidia also seems to struggle to support OpenCL well. This is where OpenCL really seems like an interesting thing.
I think NV probably has quite good OpenCL support, since they are basically just re-using a lot of their CUDA work.
>As for AMD GPUs, only OpenCL will continue to be supported by AMD for their GPUs, so there are not really any choices there.
AMD also has CPU support.
>For now OpenCL doesn't really seem like the perfect language yet. A big drawback
>for some will be that the current OpenCL 1.1 spec gives 25% of the RAM as the maximum
>object size, which implies that a single object can take only 25% of it, whereas
>several applications will of course want all of this tiny amount of RAM.
What OpenCL restriction are you thinking of? I haven't seen anything like that before (but I wasn't looking either).
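For reference, the limit being described is probably `CL_DEVICE_MAX_MEM_ALLOC_SIZE`. As far as I can tell, the OpenCL 1.1 spec only requires that value to be at least max(1/4 of `CL_DEVICE_GLOBAL_MEM_SIZE`, 128 MB), which is a guaranteed floor on the largest single allocation rather than a cap every implementation must use. A minimal sketch of that floor in C follows; the helper name `min_guaranteed_max_alloc` is made up for illustration, and a real program would query the device with `clGetDeviceInfo` instead.

```c
#include <stdint.h>

/* Hypothetical helper (not part of OpenCL): the smallest value an OpenCL 1.1
 * device may report for CL_DEVICE_MAX_MEM_ALLOC_SIZE, given its
 * CL_DEVICE_GLOBAL_MEM_SIZE in bytes. Assumes the spec's floor of
 * max(1/4 of global memory, 128 MB). */
uint64_t min_guaranteed_max_alloc(uint64_t global_mem_bytes)
{
    const uint64_t floor_bytes = 128ull * 1024 * 1024;  /* 128 MB */
    uint64_t quarter = global_mem_bytes / 4;
    return quarter > floor_bytes ? quarter : floor_bytes;
}
```

For a card with 2 GB of global memory this gives 512 MB per object; a driver is free to report more than that.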
>Yet it is a big step forward if you think about how the entire HPC world will
>be supporting OpenCL, and how the other manufacturers will in the end be forced
>to produce hardware that uses the manycore concept, as this is seemingly (to hardware
>laymen like me) the only concept that will give enough crunching power at a cheap price in the near future.
GPUs don't have that many cores. They are still at around 10-20, but they use vectorization and VLIW to achieve more FLOPs per core than a traditional CPU.
Honestly, Tilera has more cores than any GPU.
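To make the arithmetic concrete, here is a rough sketch of how vector width and VLIW width multiply into peak FLOPs per core. It assumes every VLIW slot in every lane can issue a multiply-add each cycle, counted as 2 FLOP; the function names are illustrative, and the inputs in the note below are example values, not measurements of any particular chip.

```c
/* Peak single-precision FLOP per cycle for one core, assuming every VLIW slot
 * in every vector lane issues a multiply-add (counted as 2 FLOP). */
unsigned flop_per_cycle_per_core(unsigned vector_lanes, unsigned vliw_slots)
{
    return vector_lanes * vliw_slots * 2u;
}

/* Peak GFLOP/s for the whole chip at a given clock in MHz. */
double peak_gflops(unsigned cores, unsigned vector_lanes,
                   unsigned vliw_slots, double clock_mhz)
{
    return cores * (double)flop_per_cycle_per_core(vector_lanes, vliw_slots)
           * clock_mhz / 1000.0;
}
```

With 16 lanes and 4 VLIW slots, one core peaks at 128 FLOP per cycle; the same formula for a hypothetical CPU core with a single 4-wide vector unit (lanes = 4, slots = 1) gives 8.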
>Of course the compiler quality will be very important then.
>
>There is a lot to win there, for example having logic in the compiler to recognize
>whether the programmer is trying to use the actual carry, caused for example by
>an overflow when adding 2 (unsigned) integers.
Compiler quality is always a significant factor!
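As an illustration of the carry case: in C the usual idiom is to compare the wrapped sum against one of the operands, and a compiler that recognizes the pattern can use the hardware carry directly instead of an explicit compare. A minimal sketch; `add_with_carry` is a name chosen here for illustration, and whether any given OpenCL or C compiler actually performs that optimization is not something this example demonstrates.

```c
#include <stdint.h>

/* Adds two unsigned 32-bit integers and reports the carry-out.
 * Unsigned arithmetic wraps modulo 2^32, so the wrapped sum is smaller
 * than an operand exactly when a carry occurred. */
uint32_t add_with_carry(uint32_t a, uint32_t b, unsigned *carry_out)
{
    uint32_t sum = a + b;
    *carry_out = (sum < a);  /* 1 if the addition overflowed, else 0 */
    return sum;
}
```

Chaining this per 32-bit limb (adding the previous carry into the next limb) is how multi-word additions are typically written when the language exposes no carry flag.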
David