ARM-based supercomputers

By: Aaron Spink (aaronspink.delete@this.notearthlink.net), January 24, 2017 11:01 am
Room: Moderated Discussions
RichardC (tich.delete@this.pobox.com) on January 24, 2017 9:34 am wrote:
> Aaron Spink (aaronspink.delete@this.notearthlink.net) on January 24, 2017 7:20 am wrote:
>
> > Well, the TB+ DRAM, if you followed the thread, was supposed to be total system memory, and the
> > point was that phone/tablet SoCs don't have ECC on memory, which would be a problem.
>
> It's definitely nice to have ECC on DRAM, just as it is on servers. And phone SoCs don't
> have the interconnect you would want either - so it (almost certainly) isn't going to be
> an unmodified phone SoC. But it can be something quite closely related to parts of a phone
> SoC/server SoC.
>

If it isn't a phone/tablet SoC, then it has no shared costs with them and will cost as much as any Xeon, if not more. Even the high-end Xeons come off a die that ships in 1M+ volume.

> > As far as memory per node goes, the general ratios you see are
> > in the range of 1 GB of memory per 20 GFLOPS of performance,
> > modulo some baseline memory overhead requirements (i.e. if you are only doing 1 GFLOP, you probably still
> > need 1GB of memory...). So a 2 TFLOP node (DP) will likely require ~100GB of memory or so.
>
> That's a circular argument. For a machine which can deal with a wide variety of problems,
> it's reasonable. For a more specialized machine, it can be huge overkill. For example,
> a GTX 1080 has 9TFLOPS single-precision and 8GB, for a ratio of 1GB to 1125GFLOPS, which is
> about 56x away from your figure. Or consider dedicated bitcoin-mining rigs, which have a
> whole bunch of parallel-hashing ASICs and no DRAM (and in that case a fairly
> trivial interconnect such as a single USB link up to a master machine).
>
And those computers using Tesla P100s (not 1080s, which lack ECC and have poor DP) are connected to CPUs with 100s of GB of DRAM. They are constantly streaming data in and out of the local memory.
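
To put numbers on the ratios being argued over, here is a rough sketch of the arithmetic. The 1 GB per 20 GFLOPS rule, the 1 GB floor, and the GTX 1080 figures are the ones quoted above; the function names are made up for illustration, and the rule itself is a rule of thumb, not a spec.

```python
def memory_needed_gb(gflops, gflops_per_gb=20.0, baseline_gb=1.0):
    """Rule-of-thumb node memory: 1 GB per `gflops_per_gb` GFLOPS, with a floor."""
    return max(baseline_gb, gflops / gflops_per_gb)


def compute_to_memory_ratio(gflops, memory_gb):
    """GFLOPS available per GB of local memory."""
    return gflops / memory_gb


# A 2 TFLOP (DP) node under the 1 GB : 20 GFLOPS rule.
print(memory_needed_gb(2000.0))                 # 100.0

# A tiny node still needs the baseline.
print(memory_needed_gb(1.0))                    # 1.0

# GTX 1080 as quoted: 9 TFLOPS SP with 8 GB of local memory.
gtx = compute_to_memory_ratio(9000.0, 8.0)
print(gtx)                                      # 1125.0
print(gtx / 20.0)                               # 56.25, the "about 56x" above
```

Note that the 1080 ratio only counts the card's local memory, not the host DRAM it streams from, which is the point being made here.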

> So you're assuming it's a machine which can handle the same kinds of problems as a big cluster
> of Xeon-based nodes, and then you're criticizing the ARM-based chips for not having
> the same capabilities as Xeons. But this is a more specialized architecture optimized
> to give better price-performance on a narrow subset of problems. There'd be no
> point if it was the same - being different is what makes it interesting.
>
Being different is what makes it extremely niche with low volume. That's not the market you want to try to make money in, not when you are competing against full-featured Xeons, GPUs, and Xeon Phi. I highly doubt the new Mont Blanc machine is going to skimp on memory.

> > Also, the networking at 10k nodes gets very expensive; you aren't going to do that for $100 per
> > node. A 48-port 10G switch will set you back 5k+ easy and you are going to need a lot of them,
> > a whole lot of them depending on topology.
>
> Again, you're making assumptions about what the network has to look like, and your assumptions
> are based on being able to run a wide variety of applications, some of which
> have a high ratio of communication/compute.
>
> This kind of system would probably look very different: first, it would have a large number
> of nodes on each board, e.g. 16 or 32 nodes, possibly with some cheap local interconnect
> (e.g. PCIe switch chips). PCIe can also go between boards in a rack, within reason.
> But maybe you only target applications with sufficiently low communication/compute that
> 2 x 10Gbit out of a 4U box, or between racks, is enough.
>
That's a vanishingly small subset of applications that have communication that low. Outside of crypto mining, you are unlikely to ever see it.
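
For the switch cost, a back-of-the-envelope sketch using the figures quoted above (10k nodes, 48-port 10G switches at $5k each, which was given as a floor). It counts only the edge switches that nodes plug into; spine switches, NICs, cables and optics are left out, so treat the result as a lower bound. The function name and the `uplink_ports` parameter are illustrative assumptions.

```python
import math


def leaf_switch_cost_per_node(nodes, ports_per_switch=48,
                              switch_cost=5000.0, uplink_ports=0):
    """Return (edge switch count, edge switch cost per node).

    `uplink_ports` is the number of ports per switch reserved for
    uplinks; the remaining ports connect nodes.
    """
    down_ports = ports_per_switch - uplink_ports
    if down_ports <= 0:
        raise ValueError("no ports left for nodes")
    switches = math.ceil(nodes / down_ports)
    return switches, switches * switch_cost / nodes


# Every port faces a node, no uplinks at all (islands of 48 nodes).
print(leaf_switch_cost_per_node(10_000))                    # (209, 104.5)

# Half the ports reserved for uplinks, still not counting the spine tier.
print(leaf_switch_cost_per_node(10_000, uplink_ports=24))   # (417, 208.5)
```

Even with no uplinks at all the edge switches alone come to about $104 per node at the $5k price, and reserving half the ports for uplinks doubles that before any spine switch is bought.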


> Right, but the point of this is to build a system optimized for particular problems
> which don't need everything that a rackful-of-Xeon's give you.
>
> *If* you assume that you need an expensive interconnect, then the argument for using
> unorthodox high-throughput-per-dollar or high-throughput-per-watt cpu's goes away.
> So unorthodox cpu's are appropriate *only* for systems which also have an unorthodox
> (cheaper, lower-bandwidth) interconnect *and* cheaper (fewer GB per TFLOPS) DRAM.
> Which in turn means that it only works well for a subset of applications,
> but if CFD is what you need and it works for CFD, then it's all good.
>
So basically, you want to build a pure Linpack machine. Even CFD requires more communication than that. For something like CFD, your comm requirements go up as you go to smaller work per node because you need to exchange more boundaries more often. The reality is that most supers are already network limited for a large part of the application space. Linpack tends to be pretty much an absolute best case because its communication needs are lower than those of almost all applications. There is a reason something like Xeon Phi is available with 200Gb/s per node. The trend is that more and more problems are running into severe comm bottlenecks, and that is what's driving the next round of supers into the 400Gb/s per node range. Hell, 10Gb isn't even enough for cloud providers these days; they are all moving or have moved to 25/40/50/100.
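
The boundary-exchange point can be sketched with a simple surface-to-volume calculation. The setup is an assumption for illustration only: a cubic 3D structured grid, split into cubic subdomains, with a one-cell-thick halo exchanged on each of the six faces per step. Real CFD codes differ in stencil width and decomposition, so the numbers show the trend rather than any particular code.

```python
def halo_to_compute_ratio(cells_per_side, nodes_per_side):
    """Halo cells exchanged per interior cell computed, per node.

    Assumes a cubic grid of cells_per_side**3 cells divided evenly
    into nodes_per_side**3 cubic subdomains.
    """
    n = cells_per_side // nodes_per_side   # subdomain edge length in cells
    surface = 6 * n * n                    # one-cell halo on six faces
    volume = n ** 3                        # cells the node actually computes
    return surface / volume


# Same 1024^3 problem spread over more and more nodes.
print(halo_to_compute_ratio(1024, 4))    # 0.0234375  (64 nodes)
print(halo_to_compute_ratio(1024, 16))   # 0.09375    (4096 nodes)
```

Under these assumptions the ratio is 6/n, so going from 64 to 4096 nodes on the same problem quadruples the communication per unit of compute on each node.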