By: RichardC (tich.delete@this.pobox.com), January 25, 2017 4:26 am
Room: Moderated Discussions
Aaron Spink (aaronspink.delete@this.notearthlink.net) on January 24, 2017 7:54 pm wrote:
> RichardC (tich.delete@this.pobox.com) on January 24, 2017 5:08 pm wrote:
> > Rendering high-quality 4K video is probably another reasonable app. Each frame might be
> > 3840*2160*3*10bit ~ 29.7MB. You need enough DRAM on each node for the 3D model (games
> > show that you can get fairly complex in 16GB). Pixar averages 3 hours/frame, but some
> > frames take 8 hours. At that average rate of 3 hours/frame, you need network bandwidth
> > of 29.7MB/(3*3600) = 2884 bytes/sec.
> >
> If we are going to use Pixar as an example, it is probably worth pointing out that frame resource requirements
> can run well into the 10s to 100s of GB of data.
So, as I said, 16 or 32GB gets you into the low end of it. And obviously Pixar is doing some of
the most complex video rendering there is.
> And that their parallelization method is separate
> frames. While the average network utilization is low, the peak requirements are incredibly high.
I don't think that makes any sense. For each frame, you send the (relatively small) code/data
to build the model, and eventually you get back the frame. Those transfers can be fully overlapped
with the rendering of the previous frame and the next frame. Maybe they don't choose to do it that
way because the network transfers take so little time even at 1Gbit (about 0.3 sec per frame) that
they're not worth worrying about, but they could. It's a problem with an extremely low communication/compute
ratio, independent tasks with no dependencies, and no need for low latency.
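
To make that concrete, here's a back-of-envelope sketch of the arithmetic in Python, using the numbers quoted above; the 1Gbit node link and the 3-hour average render time are the assumptions:

    # Back-of-envelope arithmetic for the frame numbers above (a sketch, not Pixar's actual pipeline).
    frame_bytes = 3840 * 2160 * 3 * 10 // 8    # ~31.1 MB (29.7 MiB) per 4K frame, 3 channels x 10 bits
    render_seconds = 3 * 3600                  # 3-hour average render time per frame

    link_bytes_per_sec = 1e9 / 8               # assumed 1Gbit/s node link
    transfer_seconds = frame_bytes / link_bytes_per_sec    # ~0.25 s to ship a finished frame back

    avg_bandwidth = frame_bytes / render_seconds            # ~2880 bytes/sec averaged over the render
    comm_compute_ratio = transfer_seconds / render_seconds  # ~2e-5

    print(f"{transfer_seconds:.2f} s transfer vs {render_seconds} s render; "
          f"average {avg_bandwidth:.0f} B/s; comm/compute ratio {comm_compute_ratio:.1e}")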
And googling around, I found a description of Pixar's render farm network from 2010 which mentioned
300 10Gbit ports and 1500 1Gbit ports, which sounds very much like 1Gbit ports for most of the
rendering boxes. Maybe they have some shared data on fileserver boxes that needs the 10Gbit?
Or maybe those ports are just for the higher-level interconnect between switches. Anyhow, this is
the crème de la crème, and it is (or recently was) predominantly 1Gbit.
> Even Beowulf systems these days are using all the network they can. The days of making
> clusters with 1GbE have long since passed. 1GbE was barely adequate back in the days
> of PPros, and since then per-socket performance has scaled significantly.
Well, a lot of stuff these days happens on clusters in the cloud, which are mostly built from dual-Xeon
boxes with maybe 20-28 CPU cores, 64-256GB of DRAM, and a 10Gbit connection. That gives you a rather low
ratio of network bandwidth to DRAM size, and such a box often ends up being sliced into a bunch of
2-core or 4-core VMs, each of which then gets around 1Gbit/s. So I think you're again ignoring the
fact that low-bandwidth apps can run cheaply and easily on a platform that you're not
counting as a "supercomputer" or even a Beowulf.
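
For what it's worth, the slicing arithmetic looks roughly like this; the core count, DRAM size, and even split across VMs are assumptions within the ranges above, since real cloud instances cap bandwidth per instance type rather than strictly per core:

    # Rough sketch of how a 10Gbit host link and its DRAM divide across small VMs.
    # Host parameters are assumptions within the ranges mentioned above.
    host_cores = 24
    host_link_gbit = 10
    host_dram_gb = 128

    for vm_cores in (2, 4):
        vms = host_cores // vm_cores
        print(f"{vms} x {vm_cores}-core VMs -> ~{host_link_gbit / vms:.1f} Gbit/s "
              f"and ~{host_dram_gb / vms:.0f} GB DRAM each")

    # Network-bandwidth-to-DRAM ratio for the whole box: roughly 10 MB/s per GB of DRAM.
    print(f"{host_link_gbit / 8 * 1000 / host_dram_gb:.0f} MB/s of network per GB of DRAM")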