Potential false economics in research

Article: Intel's Near-Threshold Voltage Computing and Applications
By: Vincent Diepeveen (diep.delete@this.xs4all.nl), October 20, 2012 8:59 am
Room: Moderated Discussions
Paul A. Clayton (paaronclayton.delete@this.gmail.com) on October 19, 2012 12:54 pm wrote:
> forestlaughing (forestlaughing.delete@this.yahoo.com) on October 19, 2012 9:38 am wrote:
> [snip]
> > If you want high bandwidth computers, they can be built, you just have to bring a BIG
> > bag of money to the table. It's been done in the past, but these huge clusters solve
> > enough of the problems, and for relatively little money, that most users learn to
> > live with what they can afford.
>
> There seem to be at least three concerns related to the commonness of lower bandwidth (higher FLOPS/monetary unit) computers.
>
> First, that approximations to the modeled system will be developed which perform adequately on such systems but which may not match the system being modeled and for which the results are not validated by a known good model. Since validation can be perceived as merely an unnecessary cost and either delays release of the result or presents the possibility of an embarrassing retraction, there are incentives not to bother with validation. If the general model of the system or the input data is sufficiently inaccurate, inaccuracy in approximation of the model may not be particularly important, though such might cause an incorrect attribution of the failure of the simulation to the general model or the input data (when in fact the approximation of the model was at fault).
>
> (For some problems, even a known inaccurate model that is substantially faster or more scalable could be useful as a filter for exploring a decision space, but such generally assumes the choices that pass through the filter will be examined with an accurate model.)
>
> Second, research which requires a higher bandwidth computer to meet time to solution requirements may be avoided more than a strict cost-benefit analysis would urge. (I think part of this is the somewhat artificial time to solution requirements. If a result is necessary to complete a degree program or apply for extended funding, then slow research will tend to be excessively discouraged.)
>
> This is magnified by the issues of low volume, making high bandwidth computers more expensive than "necessary", reducing the number of researchers training new researchers in that specific area, reducing the maturity of tools for exploring that field, etc. A lack of higher bandwidth computing researchers can also lead to a "vast echo chamber" effect, reaffirming to the lower bandwidth computing researchers that they are correct and diminishing consideration of alternatives.
>
> (I do not know if the current state is a local optimum that would be substantially improved by a significant investment. I do suspect that HPC will not fund much research and development effort for higher bandwidth computers and most of the effort would need to be funded for other concerns and applied to HPC with only modest development effort.)
>
> Third, a benchmarketing effect can direct funding of computers toward those that excel in more easily communicated (and measured) metrics. Linpack FLOPS is a simple measure of supercomputing value and can be used to establish prestige (which encourages donations and draws talent) or used to sell to managers as being worthwhile.
>
> This third concern can have synergy with the diminished awareness of the value of higher bandwidth computing. If the vast majority of researchers believe that a lower bandwidth computer is either well suited to their research or at least good enough (even if the models and algorithms used are inaccurate or simply not known to be accurate), then they will support efforts to fund such lower bandwidth computers ("it works for me" is added to the benefits of prestige and likeliness of receiving funding from perceived cost effectiveness).
>
> I think these three concerns are (at least part of) what Robert Myers is trying to argue about.
>
> There is some virtue in making lemonade when the world gives you lemons, but a focus on lemonade can disregard the utility of tea ("if the only tool you have is a hammer, every problem can look like a nail"). One might even go on to provide "tea" with ever increasing amounts of lemon juice and decreasing amounts of tea.
>
> I do not know if the lower bandwidth supercomputers are less useful than believed by their funders nor if increased funding of higher bandwidth supercomputers would be worthwhile. I do know that the human capacity for self-deception and the incentives for deceiving others (especially knowing that those being deceived would not understand a valid argument in support of one's position) are substantial, so I would guess that at least some misdirection of effort is present in HPC, possibly more so than in other areas (e.g., because of the difficulty of understanding the issues and the high level funding required [meaning "upper management" is heavily involved in the decision but highly removed from the issues]).
>
> [I hope this long post has not wasted too much of others' time.]

hi Paul,

If you analyze carefully what runs on the supercomputers, you will soon realize that the calculations may well produce the outcome the researcher in question is looking for, yet the software is typically first slowed down by a factor of 50-1000 in order to 'scale better'. The vast majority of software is far from efficient at calculating the desired results; it merely SEEMS to scale well.

I'll give one example of something that some of the largest supercomputers on this planet have been crunching on for months - and that sure was worth it.

A brilliant researcher, brilliant in his own science, figured out after some months of calculating that the FFT used to compute his matrices introduced a round-off error in the results. In quantum mechanics (where I'm obviously not an expert), the difference between results sometimes sits far behind the decimal point: just a very small percentage of the total result explains a specific effect.

So he simply threw the matrices at the supercomputer using the slow matrix multiplication algorithm. No Fourier-type transforms whatsoever - just the raw, slow way of calculating them.

That gave him results which helped him improve the theory (or rather, REFUTE new nonsense invented by researchers on the basis of round-off errors in quantum mechanics).
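The effect he was fighting can be illustrated without any FFT at all: in floating point, the answer depends on the algorithm, not just on the mathematics. A minimal sketch (hypothetical values, not the researcher's actual code) comparing a naive running sum with Kahan compensated summation:

```python
# Large and tiny terms in one sum: the exact answer is 1000.0, but in
# double precision the naive order swallows the small +1.0 terms.
values = [1e16] + [1.0] * 1000 + [-1e16]

# Naive running sum: each +1.0 is lost next to 1e16.
forward = 0.0
for v in values:
    forward += v

# Kahan compensated summation: carries the lost low-order bits along.
total, comp = 0.0, 0.0
for v in values:
    y = v - comp            # re-inject the error of the previous step
    t = total + y
    comp = (t - total) - y  # the part of y that did not fit into total
    total = t

print("naive:", forward)    # far from the exact 1000.0
print("kahan:", total)      # recovers 1000.0
```

A different algorithm for the same mathematical sum, and the cheap one is simply wrong - the same kind of surprise he got from the FFT.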

This is a very typical example of a brilliant researcher, of which there are too few on this planet. Some of them run their work on supercomputers.

Yet they have to write that software themselves, so the actual craft of EFFICIENT CALCULATION AND IMPLEMENTATION is something they have to master as well - which is obviously a waste of their time.

That is the real problem with those sporthall Top500 machines.

From a calculation-efficiency viewpoint, compared to how I do my calculations, they're utter amateurs, and I see no reason why that would change - let's be happy if they manage to progress science!

I've seen a very few rare examples of good programmers, great at algorithms, who were given the task of improving such scientific software.

To quote one of them: "If you can't speed up scientific software by a factor of a thousand, you're doing something wrong!"

Now if we stay within this 'amateur logic': consider the commissions that order such supercomputers. If one of those professors' secretaries were to order the supercomputer, it would be the best-bought supercomputer ever, at the sharpest price - all salesmen know that bla-bla stories don't work on secretaries; the vendors would actually have to compete. The ballgame changes when professors order a supercomputer. First of all, it's one of the 30 or so commissions they sit on, so they hardly have time. Secondly, if one of them actually knows something technical about what a supercomputer is, that's already big progress over the current situation. What they do realize is that ranking higher in the sporthall Top500 gives them bragging points. So that's what they go for.

The sporthall Top500 test is called Linpack. In terms of latency between nodes, any InfiniBand network sits somewhere between 0.85 microseconds and a very few microseconds. For bandwidth, you can effectively push an easy 2.9 GB/s through a single node; let's ignore the 8 GB/s theory that was written down somewhere.

Default built-in gigabit Ethernet cards in practice have a latency of around 100-200 microseconds - nearly a factor of 100 slower. That's PRACTICAL latency. Sure, you CAN buy an expensive Ethernet card (which would in fact be called InfiniBand again, as it can also run over Ethernet), but we're talking about built-in Ethernet here. The bandwidth I measured on such machines is a few dozen megabytes per second; I couldn't even get within 30% of the 'theoretical' 1 Gbit. So in practice we're talking about a factor of 50 difference in bandwidth. I should also note that default built-in Ethernet cards don't do DMA, so they hit the CPU on every received packet and basically stop all cores from working for a while.

Yet in the sporthall Top500 you can find supercomputers with this simple, lousy built-in gigabit Ethernet on 8-core Nehalem Xeon hardware next to machines with identical nodes but QDR InfiniBand, and on Linpack it makes only a factor of 2 difference in total performance.

That's a pretty small penalty.
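The numbers above can be put into a back-of-the-envelope cost model. This is only a sketch under the usual assumption that one message costs latency plus size over bandwidth (ignoring congestion, protocol overhead, and the interrupt cost mentioned above); the figures are the rough practical ones from this post:

```python
# Simple transfer-time model: t = latency + size / bandwidth.
# Figures are the rough practical numbers quoted above (assumptions).
def transfer_time_us(size_bytes, latency_us, bandwidth_bytes_per_s):
    """Microseconds to move one message of size_bytes over the link."""
    return latency_us + size_bytes / bandwidth_bytes_per_s * 1e6

IB   = dict(latency_us=1.0,   bandwidth_bytes_per_s=2.9e9)  # QDR InfiniBand
GIGE = dict(latency_us=150.0, bandwidth_bytes_per_s=50e6)   # onboard gigabit

for size in (8, 4096, 1 << 20):  # one double, one page, one megabyte
    ib = transfer_time_us(size, **IB)
    ge = transfer_time_us(size, **GIGE)
    print(f"{size:>8} B: IB {ib:9.1f} us  GigE {ge:9.1f} us  "
          f"ratio {ge / ib:5.1f}x")
```

Small messages pay the full ~150x latency gap; megabyte-sized messages still pay the ~50x bandwidth gap. A benchmark that mostly streams a few large, overlappable blocks - like Linpack - hides most of the difference, which is how a factor-100 network shows up as only a factor of 2.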

That's the point Myers is making, and it's a valid one.

Yet the real underlying reason is that government is not prepared to pay the people who are good at software optimization - those are always commercial guys, usually with their own company. Have you ever in history seen a government give a contract job to a guy with his own company to optimize software? That is, without that guy first having worked 20 years at the department on a permanent contract.

Now with the upcoming manycores, writing efficient code for GPUs is becoming really complicated. This really requires capable programmers - and I don't see how government is going to deal with that, knowing they never hired any in the past either.

The coder I mentioned above who optimized scientific software was someone doing a PhD somewhere...

This is the sad truth about science: some very capable researchers with a brilliant idea simply have no budget to hire capable coders to solve their problem.

If we look at industry, which nowadays not seldom has larger clusters than government, they always run just one specific program - so in their case the cheapest hardware always wins, as a cluster in itself is never a good investment. The software they use is a mixture of really badly engineered software and very capable software. The worst software is in the financial industry.

Who's going to crunch with Java code on clusters, you know?

That directly loses a factor of 100 somewhere...
