Supercomputer variant of Kahan quote

Article: Intel's Near-Threshold Voltage Computing and Applications
By: anon (anon.delete@this.anon.com), October 17, 2012 9:02 pm
Room: Moderated Discussions
Michael S (already5chosen.delete@this.yahoo.com) on October 17, 2012 6:56 pm wrote:
> Robert Myers (rbmyersusa.delete@this.gmail.com) on October 17, 2012 4:34 am wrote:
> > anon (anon.delete@this.anon.com) on October 17, 2012 1:17 am wrote:
> > >
> > > Exactly. This is why low bandwidth, high latency memory and communications is
> > > not the problem, but the *solution*. Together with caches and changed software
> > > assumptions, of course.
> > >
> > Not to mention changed physics. If you can find an advanced physics text that
> > ultimately does not lean heavily on the ability to go back and forth with
> > facility between physical and momentum space (or whatever you choose to call it)
> > using the transform that diagonalizes the momentum operator (the derivative),
> > I'll be impressed. Since you seem to think that caches and changed software
> > assumptions can address all problems of importance, you may have to be told
> > explicitly that the transform in question is the Fourier transform. The last
> > time I was paying close attention, Blue Gene could use all of 512 of its tens of
> > thousands of processors effectively in doing a volumetric FFT. I'll say more
> > later in the day.
> >
> According to my understanding, you are talking about an IBM research paper from
> 9 years ago that investigated calculation of a relatively tiny volumetric FFT
> (N=128, total dataset = 32 MB).
> The BG/Q of today is a very different machine from the BG/L of 2003. Today's
> tightly coupled 32-node "compute drawer" is almost as big, when measured by
> FLOPs, caches or memories, as a 512-node BG/L from then. But the question is -
> why bother with parallelizing such a small data set over so many loosely
> coupled computing elements?
> Is it in any way similar to what you want to do? From one of our previous
> discussions on comp.arch I got the impression that you are interested in much
> bigger cubes that likely have very different scaling characteristics on
> BlueGene-type machines. And it's not obvious to me that their scaling
> characteristics are worse than a small cube's.

They are not. Larger N means the problem is inherently more parallel. For example, see:

http://code.google.com/p/p3dfft/

"P3DFFT uses 2D, or pencil, decomposition. This overcomes an important limitation to scalability inherent in FFT libraries implementing 1D (or slab) decomposition: the number of processors/tasks used to run this problem in parallel can be as large as N^2, were N is the linear problem size. This approach has shown good scalability up to 32,768 cores on Ranger (Sun/AMD at TACC) when integrated into a Direct Numerical Simulation (DNS) turbulence application (see scaling analysis presentation at Teragrid’08 meeting, Las Vegas)."

From the linked paper:

"This code has been run on Ranger at 40963 resolution using 16K cores, with 87% strong scaling for a quadrupling of core count from 4K to 16K. Testing at large core counts has also been performed on IBM BG/L and CRAY XT4’s at other major supercomputing sites, with 98% strong scaling observed between 16K and 32K cores on the former."

I'm not saying that every problem scales well, but it's simply false to claim that HPC machines run nothing but Linpack and that no real work ever gets done on them, or that it would be much more economical to invest all the money in custom CPUs instead. So the basis for the claim that "everybody else is doing it wrong" is already on pretty shaky ground.
