By: RichardC (tich.delete@this.pobox.com), January 24, 2017 6:08 pm
Room: Moderated Discussions
Aaron Spink (aaronspink.delete@this.notearthlink.net) on January 24, 2017 11:01 am wrote:
> For something like CFD, your comm requirements go up as you go to smaller work per node because you need
> to exchange more boundaries more often. The reality is that most supers are already network limited for a
> large part of the application space. Linpack tends to be pretty much an absolute best case because its
> communication needs are lower than almost all applications.
Rendering high-quality 4K video is probably another reasonable app. Each frame might be
3840*2160*3*10bit ~ 29.7MB. You need enough DRAM on each node for the 3D model (games
show that you can get fairly complex in 16GB). Pixar averages 3 hours/frame, but some
frames take > 8 hours. At that average rate of 3 hours/frame, each node needs network
bandwidth of only 29.7MB/(3*3600) ~ 2884 bytes/sec.
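The arithmetic above can be checked with a quick back-of-the-envelope script; all the numbers (4K UHD resolution, 10-bit RGB, 3 hours average per frame) come straight from the post:

```python
# Back-of-the-envelope check of the render-farm bandwidth numbers above.
width, height = 3840, 2160          # 4K UHD resolution
channels, bits_per_channel = 3, 10  # 10-bit RGB

frame_bits = width * height * channels * bits_per_channel
frame_bytes = frame_bits / 8
frame_mib = frame_bytes / 2**20     # frame size in MiB

seconds_per_frame = 3 * 3600        # 3 hours average render time per frame
bandwidth = frame_bytes / seconds_per_frame  # bytes/sec per node

print(f"frame size: {frame_mib:.1f} MiB")              # ~29.7 MiB
print(f"required bandwidth: {bandwidth:.0f} bytes/sec") # ~2880 bytes/sec
```

This gives ~2880 bytes/sec; the post's 2884 comes from rounding the frame size to 29.7 MiB before dividing. Either way, the point stands: the per-node bandwidth requirement is trivially small.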
That doesn't count as a supercomputer app because it runs on a cluster of workstation-class
machines. And I think that's part of what's going on here: you're thinking about the apps that
run on supercomputer-class systems. And then you're saying that because *those* apps need a lot
of interconnect bandwidth, it's useless to build a system without a fast/expensive interconnect.
But that's back to front: there are other apps that run on cluster-of-workstation systems
(e.g. Beowulf clusters), and don't run on "supercomputers" precisely because a
"supercomputer" is expensive overkill for the app. But if you can build a
flock-of-chickens or flock-of-turkeys system that actually gives more throughput-per-$
and throughput-per-watt than a Beowulf/cluster-of-workstations, then it can be quite useful,
even if it isn't a "supercomputer".