By: RichardC (tich.delete@this.pobox.com), February 3, 2017 7:04 am
Room: Moderated Discussions
Ireland (boh.delete@this.outlook.ie) on February 2, 2017 1:57 pm wrote:
> RichardC (tich.delete@this.pobox.com) on February 1, 2017 6:05 am wrote:
> >
> > Maybe those obstacles
> > will be too big to overcome; or maybe there will be a few cracks in the wall which allow
> > low-end architecture X to creep through in a few niches; or maybe the higher-end architecture
> > will grow down far enough to fend off the threat (Xeon-D is certainly a good step towards a
> > lower-TCO x86 for throughput-optimized clusters).
> >
>
> Richard,
>
> This is the last part of my analysis. It's the part that hasn't been included in the conversation thus far,
> about Cray. Cray are one of the collaborative party in this, whose view point, we haven't even spoken about.
> What is the benefit from their point of view being involved in a project to build the ARM-based supercomputer,
> involving the various universities in south-west Britain, and the UK weather forecasting service?
I was in the business of building and selling parallel scientific supercomputers/clusters back
in the mid-1990s. Each customer has a big pile of money to spend - e.g. $20M or more - and a very
complex set of requirements (and if it's government money, then a further very rigorous set of
procurement and bidding rules to ensure that all vendors get the exact same information and play by
the same rules). Then you chase after that pot of money with whatever technologies you've got (and for a large enough pot of money, you may make a bid that stretches some way beyond your current technology).
So with that background, in the Isambard project it's clear that the customer has a pot of money
essentially for research into performance of various algorithms across different architectures,
and they want to have the same software stack running on a variety of hardware - including
ARM+GPGPU - to figure out which hardware architecture will give the best bang-for-the-buck for their future needs, with minimal effort in porting the apps.
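The kind of comparison the Met Office is paying for can be sketched as a simple throughput-per-dollar ranking. This is purely illustrative: the architecture names, throughput figures, and TCO numbers below are invented placeholders, not real benchmark or pricing data.

```python
# Hypothetical bang-for-the-buck comparison across candidate architectures.
# All numbers are invented placeholders for illustration only.

candidates = {
    # name: (sustained app throughput in arbitrary units, 5-year TCO in $M)
    "x86 + GPGPU":  (100.0, 20.0),
    "ARM + GPGPU":  (90.0, 15.0),
    "x86 CPU-only": (60.0, 14.0),
}

def value(arch):
    """Work delivered per million dollars of total cost of ownership."""
    throughput, tco = candidates[arch]
    return throughput / tco

# Rank architectures from best to worst value.
ranked = sorted(candidates, key=value, reverse=True)
for arch in ranked:
    print(f"{arch}: {value(arch):.2f} units/$M")
```

The point of running the same software stack on every candidate is that the throughput numbers become directly comparable, so a ranking like this actually means something.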
Cray builds them an ARM+GPGPU system, and ports their software stack, for a smallish pot of money
(maybe not much more than the cost of that development). If it works well, then they have the inside track to bid on a big pot of money for a full-scale next-generation system in the UK - and
also sell similar systems to many other national weather-forecasting organizations. Offhand
I don't know how big that market is, but it could be in the hundreds of millions of dollars.
So: the Met Office wants to figure out the best/cheapest way to meet future computing needs; Cray
wants to grab as much as possible of the Met Office's cash, *and* develop a product which can
compete well in the global market.
None of this requires a complex handwaving explanation. None of it has anything to do with
micro-services and XML. And none of it involves putting supercomputing into harsh environments -
most likely they would be in a very boring machine room in a very boring office building on the
outskirts of a very boring, but relatively cheap pounds-per-square-foot, town like Reading or
Swindon.
And it doesn't have much to do with flooding either, except inasmuch as the high costs of
unpredicted extreme weather may be pushing the government to give the Met Office more funding to improve the speed and accuracy of forecasting.
Evidently some people in the Met Office think ARM+GPGPU may be a superior approach for their
particular problems; and evidently Cray agrees with them enough to be at least willing to devote
scarce development effort to giving it a try.