By: David Kanter, November 20, 2013 9:52 am
Amiba Gelos on November 19, 2013 8:36 pm wrote:
> David Kanter on November 18, 2013 3:03 am wrote:
> > Knights Landing is Intel’s first clean sheet redesign of the Larrabee family, targeted at throughput
> > computing and manufactured on a 14nm process with products expected in late 2014 or early 2015.
> > The adoption of AVX3, on-package embedded DRAM, and bootable products have been disclosed, but
> > most details are unknown. This article analyzes the options available for the Knights Landing
> > CPU core and explains why Intel’s existing cores are a poor fit for the target workloads, concluding
> > that the most likely outcome is a new custom core for Knights Landing.
> >
> > Questions, comments and discussion are welcome!
> >
> > David
> Great article as usual :-)
> I still hope I can read the Jaguar, GCN articles at RWT.
> You won't forget them, will you? :D

I hope not, but GCN may be tough to do at such a late date :(

> Few questions emerged while reading the article.
> First, given that the scalar computing capacity of Larrabee is already 4 times
> that of GCN, not to mention Fermi/Kepler, which rely on the CPU to execute all "less-parallel"
> code, further improvement over the Larrabee core seems unnecessary.

That's true if I look at it from a competitive standpoint. But when I think about it from a user and system design perspective, I have a very different conclusion.

A lot of customers have applications with a combination of serial and parallel kernels. Just because KNC handles the serial kernels better than GCN or Kepler doesn't mean it is doing a good job. The serial performance on GPUs is absolutely terrible. So it's really about how much of the user's work can be handled exclusively on KNL, rather than what the competitors are doing.
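To put rough numbers on that, here is a minimal Amdahl's-law sketch in Python. The serial fraction and the relative speeds are made-up illustrative values, not measurements of KNC, KNL, or any GPU, and the function names are hypothetical.

```python
def run_time(serial_fraction, serial_speed, parallel_speed):
    """Normalized run time of a workload on one chip.

    serial_fraction: share of baseline run time spent in serial kernels.
    serial_speed / parallel_speed: speed relative to the baseline
    (1.0 = same as baseline, 0.25 = four times slower).
    """
    parallel_fraction = 1.0 - serial_fraction
    return serial_fraction / serial_speed + parallel_fraction / parallel_speed


def serial_share(serial_fraction, serial_speed, parallel_speed):
    """Fraction of the chip's total run time spent in serial kernels."""
    total = run_time(serial_fraction, serial_speed, parallel_speed)
    return (serial_fraction / serial_speed) / total


if __name__ == "__main__":
    # Hypothetical workload: 5% serial on the baseline CPU, and a
    # throughput chip that is 10x faster on the parallel kernels.
    for serial_speed in (0.25, 0.5, 1.0):
        t = run_time(0.05, serial_speed, parallel_speed=10.0)
        share = serial_share(0.05, serial_speed, parallel_speed=10.0)
        print(f"serial speed {serial_speed:.2f}x: "
              f"time {t:.3f}, serial share {share:.0%}")
```

Under these assumed numbers, a core that is 4x slower than the baseline on serial code spends roughly two thirds of its run time in the 5% of the work that is serial, so the serial kernels dominate regardless of how the competition fares.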

> Second, I wonder whether the claim that Knights Landing is bootable really implies we need a good scalar
> core. A major advantage of a bootable accelerator is that, by moving from a heterogeneous MP to SMP, the programming
> paradigm is greatly simplified. Therefore, faster scalar execution is not really needed.

GPUs are not bootable and therefore are typically used with SNB-EP or IVB-EP as a host processor. That means there is always a very high-performance CPU nearby to handle the serial portions of an application. KNL is bootable and in many scenarios will not have an IVB-EP nearby; therefore it must have a higher-performance core to handle such things.

