Patrick Chase on November 23, 2013 4:33 pm wrote:
> Most real HPC systems have dedicated I/O nodes for that sort of thing, whether it be
> the Xeon/Opteron in "Xeon/Opteron + Tesla/Phi", or dedicated cores in BG/Q systems.
> That isn't really a reason to constrain the architecture of the *compute* nodes.

I agree that that is true for most of the people doing accelerators.

But the whole point of Xeon Phi is that the compute node is a "normal CPU", so that it's easier to write software for it (and to use older software with minimal changes). That's very much what differentiates it from the systems that try to use GPUs etc.

So you're pretty much expected to run a real OS on those nodes. You don't have to do so, of course, but it does seem to be one of the main usage models. And I think that's the argument Intel makes for it - not only are the compute units regular full CPUs, they are x86 CPUs, so people are expected to have an easy time migrating from some previous cluster-of-PCs setup. Sure, you'll want to recompile and do some extra work to really take advantage of the wider vectors, but it's still a much smaller and more incremental step than moving to OpenCL and special nodes for feeding the compute units.

And Intel really does seem to be pushing this angle, talking about how KNL is a standalone CPU, not some add-in accelerator card. See for example

or straight from Intel PR:

and if you go with this approach (which would seem to have real advantages), you definitely want to have "good enough" performance on single-threaded loads, because you no longer have something else feeding the data to you by hand.

And the old Atom really was pretty bad at some general-purpose stuff. That VR-zone link says KNL is 72 modified Silvermont cores, so it should be much better in that regard.

