By: Vincent Diepeveen (diep.delete@this.xs4all.nl), April 20, 2011 2:38 am
Room: Moderated Discussions
Moritz (Better@not.tell) on 4/15/11 wrote:
---------------------------
>Ah, then I get it.
>Because of the fewer "shader"-units DK would predict a lower score, but the score
>will not drop as much as predicted because of the better balance between fixed and programable HW-units.
The problem is that David's assumption is simply wrong: across different generations of GPUs you can't fit one linear function that predicts performance.
Most games mainly need a lot of bandwidth, so a prediction based on aggregate bandwidth would be more accurate than David's silly attempt; even then it would still be off, of course.
A newer hardware generation means faster hardware, as simple as that.
What AMD and Nvidia keep far too quiet about is how many resources they actually have to execute a given instruction X.
For example, a conditional move is an instruction that AMD's instruction set has in abundance, but how many of the 4 units can execute it?
The 6900 manual doesn't even contain a diagram showing the 4 execution units, let alone which unit can execute which kind of instruction.
This is crucial for programmers to know. Agner Fog has measured instruction latencies on CPUs; there is nothing like that for GPUs.
All I know there is what NSA-type guys leak to me, and usually that's damn accurate.
Yet what they leak is so fragmentary that it's nowhere near enough to write really fast GPU code from scratch.
Only trial and error gets you there.
A very simple question, for example: how fast is a shift-right?
We know Evergreen could shift right by at most 31 bits, so we can safely guess that's the maximum for the 6000 series as well.
Yet how many units can execute a shift-right instruction?
I hope you realize how simple such questions are, and how much time it takes to test each individual instruction.
Then, in AMD's case, there is an OpenCL compiler in between that can mess things up; I pray it will slowly get better.
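To make the "how many units can execute instruction X" question concrete, here is a minimal Python sketch of the arithmetic behind such a trial-and-error test. It is not a vendor method; the function names, the lane count and the timings are all hypothetical. It assumes you have already timed a kernel that issues a long run of independent copies of one instruction, that the compiler did not remove or merge them, and that the kernel is limited by ALU throughput rather than memory.

```python
def ops_per_lane_per_cycle(ops_executed, elapsed_cycles, lanes):
    """Average number of instructions retired per lane per clock cycle.

    ops_executed   -- total instructions of the tested kind the kernel ran
    elapsed_cycles -- measured kernel time converted to shader-clock cycles
    lanes          -- number of parallel lanes kept busy (hypothetical input)
    """
    if ops_executed < 0 or elapsed_cycles <= 0 or lanes <= 0:
        raise ValueError("need ops >= 0 and positive cycles and lanes")
    return ops_executed / (elapsed_cycles * lanes)


def estimate_units(ops_executed, elapsed_cycles, lanes, slots=4):
    """Guess how many of the `slots` units per lane can run the instruction.

    The measured throughput is rounded to the nearest whole unit and
    clamped to the range 0..slots. slots=4 matches the "4 units" in the post.
    """
    rate = ops_per_lane_per_cycle(ops_executed, elapsed_cycles, lanes)
    return max(0, min(slots, round(rate)))


if __name__ == "__main__":
    # Made-up numbers, purely illustrative: 100 lanes, one million cycles.
    # A rate near 1.0 would suggest a single unit handles the instruction,
    # a rate near 4.0 would suggest all four units do.
    print(estimate_units(97_000_000, 1_000_000, 100))
    print(estimate_units(390_000_000, 1_000_000, 100))
```

A rate well below a whole number usually means the assumptions did not hold (dependent instructions, a memory bottleneck, or the compiler rewriting the code), which is exactly why each instruction takes so long to test.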
Yet closing our eyes to the huge performance GPUs deliver, just because of the vendors' bad documentation, is not the right approach. If you want performance nowadays, GPUs can deliver it; it just takes a bit more effort to get that performance out of them.
What I really wonder about is how few contract jobs I see where companies/organisations ask for capable GPU programmers.
The fundamental problem might be that society isn't used to paying much for software development.
Every HPC guy has to figure it out for himself, which is not a very clever way to do it when you realize how tough GPUs are to program efficiently.