By: Mark Roulo (nothanks.delete@this.xxx.com), August 25, 2015 4:10 pm
Room: Moderated Discussions
Michael S (already5chosen.delete@this.yahoo.com) on August 25, 2015 2:58 pm wrote:
> Afterthoughts of Core-M vs Cherry Trail comparison.
>
> Until yesterday I didn't look at Intel Core-M and Cherry Trail too closely, but yesterday I did.
> And what did I see? Top of the line Core-M (M-5Y71) is better than top of the
> line Cherry Trail (x7-Z8700) in every single aspect except selling price:
>
> Single Threaded Performance - ALOT better
> Graphics - ALOT better
> Multithreaded FP - ALOT better
> These two were expected
> What I didn't expect is that Core-M is also significantly ahead in multithreaded Integer tests. So
> far ahead that it seems that multithreaded Integer performance per Watt is likely about equal.
>
> Jumping from mobile to HPC.
> It means that it was very possible, and may still be possible, for Intel to base their GPGPU competitor on
> a Sandy Bridge derivative. Probably not exactly Broadwell and not exactly Skylake, but a slightly different
> 14nm core. Tuned for 2.5-3 GHz operation instead of 4 GHz, probably the same number of execution ports as Sandy Bridge,
> less aggressive bypasses, less aggressive divide units etc... In short, slightly compromised absolute performance
> relative to Broadwell and Skylake, but almost the same performance per Watt at low frequency and at 20-30%
> smaller area. Now let's put 33 such cores running at 1.4/2.6 GHz (Base/Turbo) on a single huge die. Or, maybe,
> if it fits, 39 cores with Base=1.2 GHz. Or something in the middle, you get the idea.
> Just like in the case of Core-M vs Cherry Trail we will get ALOT better (than KNL) single threaded
> performance, ALOT better scalar multithreaded FP and about the same multithreaded integer at about
> the same power envelope. Now, you are asking: "Who cares? These things are important for smart customers,
> but the whole point of KNL is pleasing stupid customers by showing them that we can run LINPACK as
> fast as the biggest baddest Maxwell and then slightly faster yet! Your variant is not even close!".
> And here you understand why AVX512 is a huge mistake. Core-based
> GPGPU killer absolutely needs AVX-1024. Or wider.
Knights Landing gets ~6 TFLOPS single precision.
How do you get this from 33 SandyBridge cores running at 1.4 GHz?
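
To put rough numbers on the question, here is a back-of-the-envelope sketch of peak single-precision throughput for the proposed 33 cores at 1.4 GHz base. It assumes peak = cores x clock x FLOPs per cycle per core. Stock Sandy Bridge can issue one 8-wide AVX add and one 8-wide AVX multiply per cycle, so 16 SP FLOPs/cycle. The two wider rows are hypothetical cores (two FMA units at 512-bit and 1024-bit width), not shipping parts, and the function and table names are made up for illustration.

```python
def peak_sp_gflops(cores, ghz, flops_per_cycle):
    """Theoretical peak single-precision GFLOPS: cores * clock * FLOPs/cycle."""
    return cores * ghz * flops_per_cycle

# SP FLOPs per cycle per core under each assumption.
# Only the first row describes a real core; the others are hypothetical.
FLOPS_PER_CYCLE = {
    "Sandy Bridge AVX-256 (add + mul ports)": 8 + 8,         # 16
    "hypothetical 2 x FMA, 512-bit vectors": 2 * 16 * 2,     # 64
    "hypothetical 2 x FMA, 1024-bit vectors": 2 * 32 * 2,    # 128
}

CORES = 33       # core count from the quoted proposal
BASE_GHZ = 1.4   # base clock from the quoted proposal

for name, fpc in FLOPS_PER_CYCLE.items():
    tflops = peak_sp_gflops(CORES, BASE_GHZ, fpc) / 1000.0
    print(f"{name}: {tflops:.2f} TFLOPS")
```

Under these assumptions, plain Sandy Bridge cores land around 0.74 TFLOPS, well short of ~6 TFLOPS. Only the hypothetical 1024-bit dual-FMA variant gets close (about 5.9 TFLOPS), which is consistent with the quoted remark that such a design would need AVX-1024 or wider.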