By: Ireland (boh.delete@this.outlook.ie), February 2, 2017 2:57 pm
Room: Moderated Discussions
RichardC (tich.delete@this.pobox.com) on February 1, 2017 6:05 am wrote:
>
> Maybe those obstacles
> will be too big to overcome; or maybe there will be a few cracks in the wall which allow
> low-end architecture X to creep through in a few niches; or maybe the higher-end architecture
> will grow down far enough to fend off the threat (Xeon-D is certainly a good step towards a
> lower-TCO x86 for throughput-optimized clusters).
>
Richard,
This is the last part of my analysis, and it's the part that hasn't come up in the conversation so far: Cray. Cray is one of the collaborating parties here, and we haven't even discussed their point of view. What do they gain from being involved in a project to build an ARM-based supercomputer alongside the various universities in south-west Britain and the UK weather forecasting service?
However, allow me to back up a little and put it into context, so you can think about it in software engineering terms. The real 'obstacle' for supercomputers nowadays is running costs, not capital costs. We'd probably see a lot more supercomputers getting a green light (which would suit Cray just fine) were it not for projects being cancelled over operating costs rather than capital costs.
I don't know whether you'd agree with me on that or not.
For about a decade now there has been an effort in Britain, within the civil engineering profession (i.e. not so much the software/computing side), to build the models needed to understand the 'impacts' of future weather events. The aim is customized, localized analysis that lets us understand the impacts of climatic events on population centers, and to bring that viewpoint into the simulations alongside the weather side of it. You can read the replies further down for my take on this.
I think I mentioned this to you before: none of the modeling work done on the civil engineering side has really been exposed to the benefits of supercomputing, or higher-end computing of any description, thus far. A lot of that research has been carried out on laptops and the like to date. It has had essentially no input from the high-performance computing discipline. That's just how it happened.
The events and 'impacts' I'm talking about are large; one has to think on a New Orleans kind of scale, i.e. the kind of thing one hopes will not happen, but might. You can get combinations of events: tides at the wrong height, heavy rainfall that swells the river systems, and gale-force winds all arriving at the same time. The civil engineers, as I explained, are good at looking at these local factors, and that work is happening on their end. The challenge is getting the predictive capabilities of different agencies and different disciplines to communicate, so they can be merged into one picture. That picture then feeds into project planning, disaster-prevention infrastructure and investment over the longer term.
What we've got at the moment is the micro-services concept: software engineering built around a lot of XML processing. We're already doing a great deal of this where the Intel solutions are involved, in both hardware and software. The Xeon-D systems you mention are excellent at providing the platform for engineers working in the micro-services and Agile/extreme programming tradition (which goes all the way back to Ward Cunningham, and further back to early object-oriented programming, where it began with Smalltalk and so on). The 'Triangulation 239: Ward Cunningham' interview from February 2016 is a good listen if you ever get the chance.
What I'm saying is that the software engineers learned how to build very intelligent distributed systems, but they reached for the tools they understood best, and that's where they can hit a dead end when the need to scale up arrives. You've got two branches of engineering, the Agile programmers and the civil engineers, neither of whom has yet been exposed to the kinds of technology that might be needed to take weather forecasting and environmental impact study to another level.
Again, you may not agree with me on that score either. I'd be interested in your take, in any case.
Take, for example, some observations made by James Gosling. He looked at something like XML processing, around which we have built our large distributed cloud services, and worked out a low-level comparison of the efficiency of doing things that way versus how he might approach the same task on the Java platform.
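To make the kind of overhead he was pointing at concrete, here is a toy sketch in Java (my own illustration, not Gosling's actual comparison; the record and field names are made up). The same three numbers travel either as an XML document, which has to be text-parsed into a DOM and converted back to numbers, or as a fixed-size binary buffer read directly:

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import java.io.ByteArrayInputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class EncodingCostSketch {
    public static void main(String[] args) throws Exception {
        // XML form: a few dozen bytes of text, parsed into a DOM tree.
        String xml = "<obs><lat>53.35</lat><lon>-6.26</lon><rainMm>12.4</rainMm></obs>";
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        double rainFromXml = Double.parseDouble(
                doc.getElementsByTagName("rainMm").item(0).getTextContent());

        // Binary form: 24 bytes, read back as three doubles with no parsing.
        ByteBuffer buf = ByteBuffer.allocate(24)
                .putDouble(53.35).putDouble(-6.26).putDouble(12.4);
        buf.flip();
        buf.getDouble(); // lat
        buf.getDouble(); // lon
        double rainFromBinary = buf.getDouble();

        System.out.println(rainFromXml + " == " + rainFromBinary);
    }
}

Multiply that difference by the message volumes a large cloud service sees and you get a feel for what he was driving at.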
What we tend to run into a lot these days are architectures that rely heavily on the XML/Agile development approach. And yes, we're well accommodated at the moment by what the Intel solutions, the Xeon-D clusters and so on, can offer in hardware and software in that regard. What we're really talking about is providing connectivity and translation between discrete models, each created and backed by large amounts of data, using systems that the Intel hardware can accelerate.
The question is, how does that scale once you get up to heavyweight translation workloads, where one might want to swap out the XML-processing system in the middle and replace it with something more high-end that works a lot faster?
That is, how does one design that connectivity so that the predictive model on the weather forecasting side and the impact analysis (i.e. the civil engineers' model) appear to be working as one? Do we build that around micro-services, Agile (extreme) programming and XML acceleration, or do we scale the system up using a different approach? You're better able to answer and discuss that side of it than I possibly can.
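Just to pin down what I mean by a swappable middle layer, here is a minimal sketch in Java, on my own assumptions (the interface and type names are hypothetical, not from any real forecasting system). If the coupling between the two models sits behind a small interface, the XML-based translation can later be replaced by something faster without either model noticing:

public interface ModelBridge {
    // Translate one forecast output record into the form the impact model expects.
    ImpactInput translate(ForecastOutput forecast);
}

// Placeholder types standing in for the two models' data formats.
class ForecastOutput {
    final String xmlPayload;
    ForecastOutput(String xmlPayload) { this.xmlPayload = xmlPayload; }
}

class ImpactInput {
    final double rainfallMm, windMs, tideM;
    ImpactInput(double rainfallMm, double windMs, double tideM) {
        this.rainfallMm = rainfallMm; this.windMs = windMs; this.tideM = tideM;
    }
}

// Today's middle layer: parse the forecast's XML payload on a Xeon-D class box.
class XmlModelBridge implements ModelBridge {
    public ImpactInput translate(ForecastOutput forecast) {
        // ... XML parsing of forecast.xmlPayload would go here ...
        return new ImpactInput(0.0, 0.0, 0.0);
    }
}

// Tomorrow's middle layer: same contract, faster machinery (a compact binary
// format, or whatever higher-end hardware handles the translation best).
class FastModelBridge implements ModelBridge {
    public ImpactInput translate(ForecastOutput forecast) {
        return new ImpactInput(0.0, 0.0, 0.0);
    }
}

The two models only ever see the ModelBridge contract, and that is the property that matters when the middle needs to scale up.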
You mentioned a need for a part of the infrastructure that could be based on increased parallelism. The kind of workload I'm talking about, the connectivity between the two different predictive models, is more like a telecommunications workload (Johan De Gelas refers to it sometimes in his server architecture review articles at AnandTech). It's a streaming, connectivity type of workload, where you want the two models to be able to bounce back and forth, is what I mean. It's a small part of the whole equation, I'll grant you. But it sits between the two x86 cluster endpoints, and it's the place where you might want to insert something other than your x86 solution. That's what I'm suggesting to you. From the point of view of the two x86 cluster endpoints, they don't really care what's going on in the middle, as long as it works fast and reliably.
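In the crudest possible terms, something like the following is the shape of that middle piece (a sketch only; the host name and port numbers are made up). One cluster endpoint streams into it, the other streams out of it, and neither side knows or cares what hardware the relay runs on:

import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class StreamRelay {
    public static void main(String[] args) throws Exception {
        try (ServerSocket listener = new ServerSocket(9000);               // forecast side connects here
             Socket inbound = listener.accept();
             Socket outbound = new Socket("impact-model.example", 9001)) { // impact-analysis side
            InputStream in = inbound.getInputStream();
            OutputStream out = outbound.getOutputStream();
            byte[] buffer = new byte[64 * 1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n); // pure pass-through; translation or acceleration would slot in here
            }
            out.flush();
        }
    }
}

Whether that box in the middle is another Xeon-D, an ARM part, or something with accelerated XML handling is exactly the open question.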