Article: Parallelism at HotPar 2010
By: Ian Ollmann (iano.delete@this.apple.com), August 23, 2010 9:21 pm
Room: Moderated Discussions
Anon (no@thanks.com) on 8/22/10 wrote:
---------------------------
>It does however look to me like Intel is playing the same game people seem incensed
>at NVidia for doing here, unless I have missed something their base C implementation
>is running single threaded on a single core, versus 4/8 SMD units for their OpenCL version..
I don't think that is a fair criticism. The difference here is that Ofer is comparing what is more or less "simple scalar code" written in C vs. OpenCL C. (We'll ignore for a moment the page or three of OpenCL setup code you also have to write to create various objects around your data.) So, basically for "the same code" (heh heh) OpenCL offers a big speedup! ... *on the same hardware*. Whereas the GPU vs. CPU comparisons discussed earlier were single-threaded scalar C code on the CPU vs. a stream programming language on the GPU. *Not the same hardware!* Notice that Ofer controlled for one variable (hardware), whereas the other report did not -- that is, the earlier results could have come about because they were using the GPU, or because they wrote their code in an easily parallelized stream programming language. We can't tell! In actuality, it appears to have been due to both factors. Most curiously, judging from Ofer's results, the larger contribution appears to have come from the stream programming language, not the hardware. (Well-established programming languages seem to have left quite a bit on the table!)
That is, voices here aren't questioning GPU marketing for comparing GPUs and CPUs. This is expected! They would just like the marketing to control for other variables, such as the amount of optimization attention that goes into each side and the choice of programming language, so as to provide realistic apples-to-apples comparisons.
Now, I don't want to diminish what the GPU has achieved here. In almost any other context, 5x faster is a WOW and a fantastic achievement for (more or less) general purpose hardware! If I were a GPU manufacturer I'd be happy to settle for 5x faster because I'd feel like I had a pretty compelling case. "Great! We accept our competitor's findings. We are indeed 5x faster! Case closed. You can buy our product at the booth by the door on the way out." Not to mention that their products are often cheaper per unit, so the price/performance question seems to be a no-brainer -- as long as you can run your program in a stream programming environment. There may be a similar story about power consumption.
Frankly, I don't understand what all the fuss is about. The two devices are not directly in competition. It's not like you are going to forgo a CPU or GPU altogether. Even cell phones have GPUs these days. You can offload work from one processing unit to the other as needed. Using a tool like OpenCL, you can run the same code on both at the same time cooperatively. Their skills are largely complementary, with some overlap for the most expensive workloads. (Which is good!) Due to the need to run both graphics and non-parallelizable code on your machine, there will probably always be a part of some flavor capable of running each kind of workload well (a CPU and a GPU) on the machine, even if they are not entirely separate devices.
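To make the "simple scalar code" vs. OpenCL C comparison a little more concrete, here is a minimal sketch. It is not Ofer's benchmark: the `saxpy` routine and all names below are made up for illustration. The first function is the plain C loop; the string holds the same loop body written as an OpenCL C kernel, in the form a host program would typically hand to the OpenCL runtime as source. The page or three of host setup code is deliberately left out.

```c
#include <stddef.h>

/* Plain scalar C: a single thread walks the whole array. */
void saxpy_scalar(float a, const float *x, const float *y,
                  float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a * x[i] + y[i];
}

/* The same loop body as an OpenCL C kernel (hypothetical example).
 * The explicit loop is gone: the runtime launches one work-item per
 * index, and get_global_id(0) tells each work-item which element is
 * its own. The same source can be built for a CPU or a GPU device. */
const char *saxpy_kernel_src =
    "__kernel void saxpy(float a,\n"
    "                    __global const float *x,\n"
    "                    __global const float *y,\n"
    "                    __global float *out)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    out[i] = a * x[i] + y[i];\n"
    "}\n";
```

Because the kernel says nothing about how many elements run at once, the implementation is free to spread work-items across cores and vector lanes, which is roughly where a speedup "on the same hardware" can come from.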
The noise is really about which company is going to get the majority of your purchasing dollars. I kinda doubt most people care how their money gets divvied up in the back room of the computer store. It's just the companies who get the money that do. It seems clear to me that what most people really want is a large vector unit somewhere on the machine, capable of a high level of computational bandwidth -- even if we all agree they'd never ask for it by name because they don't know what a vector unit is! In the fullness of time, that vector unit might come to reside on the CPU die, the GPU die, neither die, both dies or maybe everything on the same die. This is what the actual battle is about, since it controls who gets the $$$. However, for the rest of us, it is all a lot of sound and fury signifying nothing.
We pay, either way. (Take that, WS!)