SP vs DP & performance metrics

Article: Computational Efficiency for CPUs and GPUs in 2012
By: jp (jipe4153.delete@this.gmail.com), July 27, 2012 8:18 am
Room: Moderated Discussions
Eric (eric.kjellen.delete@this.gmail.com) on July 27, 2012 7:57 am wrote:
> jp (jipe4153.delete@this.gmail.com) on July 27, 2012 7:08 am wrote:
> > The idea that it's easier to squeeze more theoretical performance with
> > multi-threading and SSE instructions on a CPU is unfortunately not true.
>
> I agree, but the question is if the superior programmability and
> better performance in some branching and data irregular workloads (and maybe
> most importantly, more consistent performance across many workloads), together
> with Intel's growing process technology advantage and maybe also advanced
> packaging (such as TSV stacking of DRAM to enable very high memory bandwidth for
> SIMD applications) will turn out to be the killer advantages of the CPU.
>

The point is that most workloads, though not all, have enough fine-grained parallelism to exploit SIMD capabilities.

As for bandwidth, GPUs already have the fastest RAM out there (over 250 GB/s), and there is no reason they won't keep that lead (see the FLOPS/bandwidth ratio "issue" mentioned earlier).
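To put the FLOPS/bandwidth ratio in numbers, here is a rough roofline-style sketch of what 250 GB/s buys a purely streaming kernel. Only the 250 GB/s figure comes from this post; the peak FLOPS value is a placeholder and not a spec for any real card, and double-precision daxpy is just picked as an illustrative bandwidth-bound kernel, ignoring caches.

```python
# Roofline-style estimate: a kernel cannot run faster than
# min(peak compute, memory bandwidth * arithmetic intensity).

def arithmetic_intensity(flops_per_element, bytes_per_element):
    """FLOPs performed per byte moved to or from memory."""
    return flops_per_element / bytes_per_element

def attainable_flops(peak_flops, bandwidth_bytes_per_s, intensity):
    """Upper bound on sustained FLOP/s for a given arithmetic intensity."""
    return min(peak_flops, bandwidth_bytes_per_s * intensity)

BANDWIDTH = 250e9         # bytes/s, the "over 250 GB/s" figure from the post
HYPOTHETICAL_PEAK = 1e12  # FLOP/s, assumed placeholder, not a real spec

# DP daxpy, y[i] = a * x[i] + y[i]:
#   2 FLOPs per element (one multiply, one add)
#   24 bytes per element (read x, read y, write y, 8 bytes each)
daxpy_intensity = arithmetic_intensity(2, 24)
daxpy_ceiling = attainable_flops(HYPOTHETICAL_PEAK, BANDWIDTH, daxpy_intensity)

print(f"daxpy intensity: {daxpy_intensity:.4f} FLOP/byte")
print(f"bandwidth-bound ceiling: {daxpy_ceiling / 1e9:.1f} GFLOP/s")
```

Under these assumptions the ceiling is set by bandwidth rather than by the compute peak, which is why the ratio matters for streaming workloads on any architecture.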

> My answer is yes, not least because of the many historical precedents where an
> opportunity to eliminate a co-processor at the expense of raw performance and
> complexity, and to the benefit of consistency and programmability/flexibility,
> has been pursued with a great deal of enthusiasm by the industry. Let's not
> forget that the origins of the GPUs lay in their ability to accelerate 3D
> graphics when the CPU was no longer enough. They are not a natural, inescapable
> feature of general-purpose computing.
>

You just agreed that it is not easier to develop a high-performance solution for a CPU. Don't you think GPU vendors will continue to churn out more features and high-level abstractions to simplify development? The answer is yes; examples are Nvidia's OpenACC and PGI's Fortran compiler for CUDA.

> And for large-scale HPC deployments, we
> have already seen that throughput-optimized CPUs are (at least for the moment)
> preferred by the most demanding customers and that they can deliver
> significantly higher efficiency than CPU + GPGPU systems even with regard to
> peak theoretical performance, not to speak of the performance that will
> realistically actually be achieved.

Looking at the articles on HPCwire about new clusters over the last 1.5 years, it's obvious that almost everyone is buying Nvidia Tesla cards. In fact, ORNL's newest cluster (the fastest in the US) will be based on the new Kepler cards (K20).
