Supercomputer variant of Kahan quote

Article: Intel's Near-Threshold Voltage Computing and Applications
By: anon (anon.delete@this.anon.com), October 19, 2012 2:27 am
Room: Moderated Discussions
Robert Myers (rbmyersusa.delete@this.gmail.com) on October 18, 2012 1:29 pm wrote:
> anon (anon.delete@this.anon.com) on October 17, 2012 9:02 pm wrote:
> > Michael S (already5chosen.delete@this.yahoo.com) on October 17, 2012 6:56 pm wrote:
> > > According to my understanding, you are talking about IBM research paper from 9 years ago
> > > that investigated calculation of relatively tiny volumetric FFT (N=128, total dataset = 32 MB).
> > > BG/Q of today is very different machine from BG/L of 2003. Today's tightly coupled 32-node
> > > "compute drawer" is almost as big, when measured by FLOPs, caches or memories, as 512-node
> > > BG/L from then. But the question is - why bother with parallelizing such small data set
> > > over so many loosely coupled computing elements?
> > > Is it in same way similar to what you want to do? From one of our previous discussions on
> > > comp.arch I got the impression that you are interested in much bigger cubes that likely have
> > > very different scaling characteristic on BlueGene type of machines. And it's not obvious to
> > > me that their scaling characteristics are worse than small cube.
> >
> > They are not. Larger N means the problem is inherently more parallel. For example, see:
> >
> > http://code.google.com/p/p3dfft/
> >
> > "P3DFFT uses 2D, or pencil, decomposition. This overcomes an important limitation to
> > scalability inherent in FFT libraries implementing 1D (or slab) decomposition: the number of
> > processors/tasks used to run this problem in parallel can be as large as N^2, where N is the
> > linear problem size. This approach has shown good scalability up to 32,768 cores on Ranger
> > (Sun/AMD at TACC) when integrated into a Direct Numerical Simulation (DNS) turbulence
> > application (see scaling analysis presentation at Teragrid’08 meeting, Las Vegas)."
> >
> > From the linked paper:
> >
> > "This code has been run on Ranger at 4096^3 resolution using 16K cores, with 87% strong
> > scaling for a quadrupling of core count from 4K to 16K. Testing at large core counts has also
> > been performed on IBM BG/L and CRAY XT4’s at other major supercomputing sites, with 98%
> > strong scaling observed between 16K and 32K cores on the former."
> >
> > I'm not saying that every problem scales well, but it's simply false to claim that HPC is
> > nothing but linpack and no real work ever gets done on them, or that it would be much more
> > economical to invest all the money in custom CPUs. So the basis for the claim that "everybody
> > else is doing it wrong" is already on pretty shaky ground.
>
> I would appreciate it very much if you would stop putting words into my mouth. At no point
> have I ever said what you just "quoted" me as saying.

I did not quote you as saying anything; I paraphrased. But it is pretty much what you are saying. I can quite easily quote you if you need reminding.

> In another post, I have explained what my real position
> is with respect to these super-gigantic but not-so-super computers.
>
> For one
> thing, I feel like I keep having the same discussion over and over again. I get
> that feeling because I *am* having the same discussion over and over again. The
> only thing that changes is that the numbers change because the microelectronics
> change and because the number of nodes that are being jammed into a
> high-bandwidth connection (a board, a drawer, or a cabinet, or whatever) keeps
> increasing. Those boards, drawers, and cabinets are *exactly* what has been
> discussed on comp.arch as the only arrangement actually capable of doing
> problems that require lots of bandwidth. The instant you get off those boards
> and out into the warehouse, which is where you need to be to get the linpack
> flops that are being advertised, you have the bandwidth problem that I have been
> belly-aching about for now almost a decade.
>
> The discussion runs something
> like this: "What are you talking about? There are these super-duper numbers
> from Supercomputing Whatever, and Blue Gene did just great." When you dig into
> the numbers, you find that, yes, indeed, Blue Gene didn't do as badly scaling to
> a few thousand nodes as the other entrants, but they were *all* running at about
> 10% of the Linpack rate. Since then, I'm sure there have been more
> Supercomputer Whatevers, and an entirely new set of numbers. These benchmarks
> and claims show only that things aren't quite as bad as they were when I first
> flipped over Blue Gene. The advertisement is essentially bait and switch: flops
> and efficiency are for Linpack with the whole machine, FFT's are tested with
> whatever subset of the machine doesn't look too horrible. There *is* no
> exascale computing for anything but nearly embarrassingly parallel calculations.
> I'm tired of these archaeological digs. Yes, the numbers will change. The
> scalability of real problems will slowly get better, but the improvement has
> nothing, nothing, nothing to do with warehouses filled with computer cabinets
> that can perform billions and billions of flops on a problem that I am assured
> is still important, even if it is of no interest to me.

Cough up the numbers, sir. Instead of handwaving, let's just see some of your numbers. Cite some studies or published results.

For example, you claimed (without citing anything, of course) that 3D FFTs don't scale well, but at least for some problems they actually do, as the P3DFFT results above show. I can frame it as a reply to an exact quote of yours from just a couple of posts up, if you would like.
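And if it helps to make the decomposition argument concrete, here is a minimal single-process sketch in Python (numpy only; the grid size and variable names are illustrative, not taken from P3DFFT itself) of the transpose-based structure that a pencil-decomposed parallel FFT distributes. The point is simply that each of the three 1D-FFT passes consists of N^2 independent line transforms, which is why a 2D (pencil) decomposition can use up to N^2 tasks while a 1D (slab) decomposition is limited to N planes.

    # Illustrative sketch only: a 3D FFT computed as three batches of 1D FFTs,
    # which is the structure a pencil-decomposed parallel FFT distributes.
    # (Single process, numpy only; N and the names are made up for the example.)
    import numpy as np

    N = 128                        # linear problem size (the "tiny" 128^3 case above)
    x = np.random.rand(N, N, N)    # input cube

    # Each pass below is N*N independent 1D transforms ("pencils").
    # In a distributed pencil decomposition, global transposes (all-to-all
    # communication) sit between the passes; that is where the bandwidth goes.
    pass1 = np.fft.fft(x, axis=0)
    pass2 = np.fft.fft(pass1, axis=1)
    pass3 = np.fft.fft(pass2, axis=2)

    # Same result as the library's direct 3D transform.
    assert np.allclose(pass3, np.fft.fftn(x))

    print("independent tasks per pass (pencil):", N * N)   # up to N^2 tasks
    print("independent tasks per pass (slab):  ", N)       # only N planes

And just to spell out the published numbers quoted above: 87% strong scaling for a quadrupling from 4K to 16K cores means roughly a 3.5x speedup out of an ideal 4x, and 98% between 16K and 32K cores means roughly 1.96x out of 2x. That is not my handwaving; those are the cited results.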