By: EBFE (x.delete@this.y.com), August 6, 2012 2:49 am
Room: Moderated Discussions
David Kanter (dkanter.delete@this.realworldtech.com) on August 5, 2012 3:19 am wrote:
> > > Why do you think that? Assuming that the top clock frequencies can
> > > actually be sustained, it's 8 DP FLOP (512-bit vector unit) * 2 (FMA?)
> > > * 60 * 1.09 GHz = 1.046 TFLOPS. That's in line with the 1 DP TFLOPS
> > > that Knights Corner was reported to clock in at in DGEMM/LINPACK.
> > > TDP and RAM look OK.
> >
>
> > Linpack TFlops/300W is likely to defeat next Firestream on perf and
> > perf/w, or even next Fermi by perf.
>
> Leaked numbers don't mean a whole lot, wait till products come (or at
> least an announcement).
>
Agreed.
> > However, I am assuming the following disadvantage, so I was expecting
> > more flops and flops/W.
>
> > 1. Coding for a target performance level is harder for KNC than GPU.
> > KNC has to close the gap of low raw flops by significantly higher
> > efficiency, which could be impossible or make coding harder.
>
> I am skeptical that coding for KNC is harder than a GPU.
> The former has caches and many other niceties of modern architectures.
>
It's likely that coding for KNC is easier and that the code runs more efficiently.
What I meant is that if the gap in theoretical flops were large, coding to close that gap might be difficult.
> > 2. KNC is more expensive than GPU. (SNB-EP is already near $2k)
>
> As Aaron pointed out, the GPUs that are used for compute workloads are
> just as expensive.
>
I was assuming the top KNC would cost $4000+ (more than 2x SNB-EP).
(IOW, the M2090 seems 'cheap' because it is only 1.5x the price of a 2680.)
Because KNC delivers "too many" flops compared to CPUs, it has to be priced higher.
> > 3. On-board RAM is smaller than competition, which may lower real
> > efficiency.
>
> I'm skeptical that's true. Everyone is using GDDR5
> and the capacity is dictated by the limits of your interface (i.e. how many
> channels, clamshell support, etc.).
>
I guessed that K20 would have up to 12GiB. Possibly this 50% difference wouldn't matter.
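
For anyone who wants to re-check the peak figure quoted at the top of the thread, here is a rough sketch of the arithmetic. The function names are mine, and so is the reading of "60" as a core count. The 300 W value is just the TDP mentioned in the thread, so the flops/W number is a theoretical peak, not a measured Linpack result.

```python
def peak_dp_tflops(lanes: int, flops_per_lane: int, cores: int, ghz: float) -> float:
    """Theoretical peak in DP TFLOPS: lanes * flops per lane per cycle * cores * GHz."""
    # lanes * flops_per_lane * cores * ghz is in GFLOPS; divide by 1000 for TFLOPS.
    return lanes * flops_per_lane * cores * ghz / 1000.0


def peak_gflops_per_watt(tflops: float, watts: float) -> float:
    """Peak GFLOPS per watt for a given peak TFLOPS figure and board power."""
    return tflops * 1000.0 / watts


# Numbers from the quoted post: 8 DP lanes (512-bit), 2 flops per lane (FMA),
# 60 (assumed to be cores), 1.09 GHz. 300 W is the TDP assumed in the thread.
knc_peak = peak_dp_tflops(lanes=8, flops_per_lane=2, cores=60, ghz=1.09)
print(f"peak: {knc_peak:.3f} DP TFLOPS")
print(f"peak efficiency: {peak_gflops_per_watt(knc_peak, 300.0):.2f} GFLOPS/W")
```

Sustained DGEMM/LINPACK numbers will land below this, which is the efficiency gap discussed under point 1.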