By: David Kanter (dkanter.delete@this.realworldtech.com), April 22, 2011 11:53 am
Room: Moderated Discussions
Vincent Diepeveen (diep@xs4all.nl) on 4/22/11 wrote:
---------------------------
>David Kanter (dkanter@realworldtech.com) on 4/21/11 wrote:
>---------------------------
>>EduardoS (no@spam.com) on 4/20/11 wrote:
>>---------------------------
>>>Heikki Kultala (hkultala@iki.NOSPAM.fi) on 4/20/11 wrote:
>>>---------------------------
>>>>Texture fetches consume the execution slots of the instruction word, ALU operations
>>>>cannot be started at same cycle.
>>>
>>>No they don't, they are even on different clauses, the only memory-like operation
>>>that consumes ALU slots is the LDS load/store, starting with Evergreen, on R700 it was executed on TMUs too.
>>
>>My understanding (http://www.realworldtech.com/page.cfm?ArticleID=RWT121410213827&p=7)
>>is that on Cayman and Cypress, the ALUs are used for address calculations. So you
>>cannot simultaneously execute ALU clauses and initiate a texture fetch. However,
>>initiating a texture fetch is fairly quick - most of the time is spent waiting for
>>data. While you are waiting for data, the ALUs are free for independent computations.
>>
>>David
>
>Texture memory is not the advised method to get things done as texture memory is
>not so fast (though faster than main memory).
Well, there aren't many other options besides the register file and local data store.
>In the first place you need to split up your software into wavefronts that realistically
>only calculate within the compute units and don't use any resources outside of them.
>
>Please note your article is one of the few on the internet which describes the Cayman architecture a bit.
Thank you for the compliment!
>The interesting thing to know now, obviously, is when AMD will have managed to improve
>the OpenCL compiler enough to support this new architecture pretty well.
>
>Nvidia also seems to struggle supporting OpenCL well. This is where OpenCL really seems like an interesting thing.
I think NV probably has quite good OpenCL support, since they are basically just re-using a lot of their CUDA work.
>As for AMD GPUs, only OpenCL will be kept supported by AMD for their GPUs, so there are not really choices there.
AMD's OpenCL also has CPU support.
>For now OpenCL doesn't really seem like the perfect language yet, as a big drawback
>for some will be that the current OpenCL 1.1 specs give 25% of the RAM as the maximum
>object size, which has the implication you can allocate only 25% for a single object, whereas
>there will be several applications that of course want everything from this tiny amount of RAM.
What OpenCL restriction are you thinking of? I haven't seen anything like that before (but I wasn't looking either).
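One candidate for the limit described in the quote is the `CL_DEVICE_MAX_MEM_ALLOC_SIZE` device query. As far as I can tell, the OpenCL 1.1 spec only sets a lower bound on it, max(1/4 of `CL_DEVICE_GLOBAL_MEM_SIZE`, 128 MB), so a device may well report more. A minimal sketch of that lower bound in plain C, with no OpenCL headers needed (the function name `min_max_alloc_bytes` is made up for illustration):

```c
#include <stdint.h>

/* Lower bound that the OpenCL 1.1 spec places on CL_DEVICE_MAX_MEM_ALLOC_SIZE:
 * max(global memory size / 4, 128 MiB).
 * Hypothetical helper for illustration; a real device may report a larger value. */
uint64_t min_max_alloc_bytes(uint64_t global_mem_bytes)
{
    const uint64_t floor_bytes = 128ull * 1024ull * 1024ull; /* 128 MiB */
    const uint64_t quarter = global_mem_bytes / 4;

    return quarter > floor_bytes ? quarter : floor_bytes;
}
```

For a card with 2 GB of global memory this gives 512 MB per buffer as the guaranteed minimum. The value a given driver actually reports has to be read with `clGetDeviceInfo`, and an application that needs more can still split its data across several buffers.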
>Yet it is a big step forward if you think about it how the entire HPC world will
>be supporting OpenCL and how also the other manufacturers will in the end be forced
>to produce hardware that uses the manycore concept, as this is seemingly (to hardware
>laymen like me) the only concept that will give enough crunching power at a cheap price in the near future.
GPUs don't have that many cores. They are still at around 10-20, but they use vectorization and VLIW to achieve a higher number of FLOPs per core than a traditional CPU.
Honestly, Tilera has more cores than any GPU.
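To make the cores versus FLOPs-per-core point concrete, here is a rough peak-throughput calculation. It is only a sketch: the function name and the parameter breakdown (cores, SIMD lanes per core, VLIW width, two FLOPs per multiply-add) are my own simplification, and it ignores everything that keeps real code below peak.

```c
/* Rough peak single-precision throughput in GFLOP/s.
 * Assumes every VLIW slot of every SIMD lane issues one multiply-add
 * (counted as 2 FLOPs) per clock. Illustrative model only. */
double peak_gflops(int cores, int lanes_per_core, int vliw_width, double clock_ghz)
{
    const int flops_per_mad = 2; /* one multiply plus one add */

    return (double)cores * lanes_per_core * vliw_width * flops_per_mad * clock_ghz;
}
```

With Cypress-like numbers (20 cores, 16 lanes, 5-wide VLIW, 0.85 GHz) this gives about 2720 GFLOP/s, or roughly 136 GFLOP/s per core. A hypothetical 4-core CPU with 4-wide SIMD at 3 GHz would land at about 96 GFLOP/s in total under the same model.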
>Of course the compiler quality will be very important then.
>
>There is a lot to win there, for example having logic in the compiler to recognize
>whether the programmer is trying to use the actual carry, caused for example by
>an overflow adding 2 (unsigned) integers.
Compiler quality is always a significant factor!
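The carry case mentioned in the quote usually looks something like the following in portable C. This is a sketch of the common idiom, not a statement about what any particular OpenCL compiler recognizes; the function names are made up for illustration.

```c
#include <stdint.h>

/* Adds a and b with wraparound and reports the carry.
 * Unsigned overflow is well defined in C (it wraps modulo 2^32),
 * and the sum wrapped exactly when it is smaller than an operand. */
uint32_t add_with_carry(uint32_t a, uint32_t b, uint32_t *carry_out)
{
    uint32_t sum = a + b;

    *carry_out = (sum < a) ? 1u : 0u;
    return sum;
}

/* 64-bit addition built from 32-bit halves, which is where the carry
 * actually gets used. The carry out of the high half is discarded. */
void add64_from_halves(uint32_t a_lo, uint32_t a_hi,
                       uint32_t b_lo, uint32_t b_hi,
                       uint32_t *out_lo, uint32_t *out_hi)
{
    uint32_t carry;

    *out_lo = add_with_carry(a_lo, b_lo, &carry);
    *out_hi = a_hi + b_hi + carry;
}
```

A compiler that spots the `sum < a` pattern can map it onto the hardware carry flag or an add-with-carry instruction where one exists, instead of emitting a separate compare.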
David