By: juanrga (nospam.delete@this.juanrga.com), January 29, 2017 5:42 am
Room: Moderated Discussions
David Kanter (dkanter.delete@this.realworldtech.com) on January 20, 2017 2:55 pm wrote:
> gallier2 (gallier2.delete@this.gmx.de) on January 20, 2017 7:48 am wrote:
> > http://insidehpc.com/2017/01/cray-develop-arm-based-isambard-supercomputer-uk-met-office/
>
> The original Mont-blanc project had an idiotic abomination, an
> Exynos-based supercomputer. AFAICT, it has gone nowhere.
>
> While Mont Blanc still appears to be pushing ARM, they have
> found a much more effective candidate - ThunderX2.
The only "idiotic abomination" I can find is the widespread ignorance some people show about ARM HPC in general and the Mont Blanc project in particular. The Exynos SoC was selected for an earlier prototype to explore
the challenges and benefits of deeply integrated energy-efficient processors and GPU accelerators, compared to traditional homogeneous multicore systems, and heterogeneous CPU + external GPU architectures.
If you believed that they would release a final production system based on an Exynos SoC, that is your mistake, not theirs. Those guys know very well that mobile SoCs aren't suitable for production machines: lack of ECC protection, slow interconnect, low-grade thermal package, 32-bit address space... "These machines were never intended for production use".
Moreover, your claim that "they have found a much more effective candidate" is priceless. Tell us, David, how could they explore on-die GPU acceleration compared to traditional homogeneous multicore systems using a ThunderX2 SoC, which is itself a homogeneous multicore system and lacks any iGPU?
The same goes for your claims about the current prototype, which uses a ThunderX2 SoC. This is not a production system, and whether "ThunderX2 actually hit targets" or not is rather irrelevant to the goals of the Mont Blanc project. If you bother to read the link that you provide, you will discover that the goals of this third phase of the project are:
- Defining the architecture of an Exascale-class compute node based on the ARM architecture, and capable of being manufactured at industrial scale;
- Assessing the available options for maximum compute efficiency;
- Developing the matching software ecosystem to pave the way for market acceptance of ARM solutions.
No one with minimal knowledge of the topic expects an exascale-class system based on 14nm ThunderX2 SoCs.
> That makes at least three supercomputers using ARM. Honestly, I'm skeptical of a Cavium-based system,
> until we see ThunderX2 actually hit targets. So far Cavium has a bad track record for delivering on their
> promises. However, I am very intrigued by Fujitsu's promised Exascale system. They have a good fabric
> and understand the system architecture in a way that IBM, Intel, or Nvidia do, and Cavium does not.
>
> The three ARM systems are all 'national' machines and not bid in a competitive fashion (although
> that is true of many HPC systems). I think they will be proving grounds, and if vendors can
> show real success there, they may be able to bid for more competitive systems.
>
> One issue I see is that GPUs tend to require even greater performance from the host CPU to keep
> up with Amdahl's Law. That means that a larger number of cores is less attractive compared to
> Power9 and Skylake. Also, I expect that server CPUs will start using high-performance DRAM more
> commonly. Nvidia would love to see an alternative to Xeon, but so far, they don't have one.
>
Nvidia is, in fact, shipping GPGPUs in ARM-based clusters. I have seen some HPC benchmarks in the past, and the GPGPUs ran as fast on earlier ARM-based systems as on Haswell-based Xeons. The CPU was not a bottleneck then, and it seems the new ThunderX2 will not be a bottleneck either. The Chief Technology Director for Extreme Computing at Bull, the company that is building the Mont Blanc 3 prototype, says:
we expect this to be at the performance level of what you could get with an Intel Skylake Xeon or an AMD “Naples” Opteron. We think that for certain HPC applications ThunderX2 will be at that level – and sometimes, better.
though Atos has not committed to making the ThunderX2 variant of Sequana a commercial product yet, it already has two customers interested in buying them and these could be announced by the Super Computing (SC17) conference in November.
It seems you also missed the announcement of Isambard, an ARM-based HPC system that will be installed at GW4.
It's one of, if not the first serious, large(ish)-scale ARMv8 64-bit production machines. And it's the first time Cray has explicitly announced an ARMv8 product meant for more than just prototyping.