Article: Parallelism at HotPar 2010
By: sea (sea.delete@this.sea.com), August 16, 2010 7:55 pm
Room: Moderated Discussions
Let me give one example:
Software name: CUDAMCML
Formal publication:
"Parallel computing with graphics processing units for high-speed Monte Carlo simulation of photon migration"
J. Biomed. Opt., Vol. 13, 060504 (2008); doi:10.1117/1.3041496
Abstract:
... Monte Carlo simulations of photon migration. In a standard simulation of time-resolved photon migration in a semi-infinite geometry, the proposed methodology executed on a low-cost graphics processing unit (GPU) is a factor 1000 faster than simulation performed on a single standard processor. ...
In their later CUDAMCML manual, they gave up the 1000X claim. Now they claim that CUDAMCML is about 50 times faster than the original MCML, a 15-year-old program running on one CPU core.
Later, this year, different researchers published another paper: "Tetrahedron-based inhomogeneous Monte-Carlo optical simulator." Phys. Med. Biol. 55:947-962, 2010. In this publication, the two researchers compared CUDAMCML with a slightly improved multithreaded MCML. Now the speedup is only 2 times.
From 1000X to 50X to 2X, what a difference.
There is another paper that shows 300X: "Monte Carlo simulation of photon migration in 3D turbid media accelerated by graphics processing units."
In this paper, the authors compared a GPU program on an 8800GT with a CPU program on an Intel Xeon 1.86GHz (E5120?). The 8800GT GPU has about 112 cores, and its clock frequency is lower than 1.86GHz. Even if one GPU core is equal to one CPU core, the speedup is at most 112 times. I am very sure their CPU code can be improved 10 times.
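To make that back-of-the-envelope bound explicit, here is a rough sketch in Python. The function name and the "one GPU core does as much work per clock as one CPU core" model are my own simplifications for illustration, not anything taken from the paper. The generous case pretends the 8800GT runs at the Xeon's full 1.86GHz, so any lower GPU clock only tightens the bound.

```python
def speedup_upper_bound(gpu_cores, gpu_clock_ghz, cpu_clock_ghz):
    """Best-case GPU/CPU speedup, assuming one GPU core does as much
    work per clock as one CPU core (a deliberately generous model)."""
    return gpu_cores * (gpu_clock_ghz / cpu_clock_ghz)


# Generous case: give the GPU the same clock as the 1.86 GHz Xeon.
bound = speedup_upper_bound(gpu_cores=112, gpu_clock_ghz=1.86, cpu_clock_ghz=1.86)

# Speedup reported in the paper.
claimed = 300.0

# How much faster than the bound the claim is; under this model the gap
# can only come from slack in the CPU baseline.
baseline_slack = claimed / bound

print(f"upper bound on speedup: {bound:.0f}x")
print(f"claimed speedup exceeds the bound by a factor of {baseline_slack:.2f}")
```

Under this model, a reported 300X against a bound of 112X means the CPU baseline is leaving at least a factor of about 2.7 on the table, before counting the other cores of the Xeon.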