Memory power and bandwidth?

Article: Computational Efficiency for CPUs and GPUs in 2012
By: David Kanter (dkanter.delete@this.realworldtech.com), August 4, 2012 10:22 am
Room: Moderated Discussions
Iain McClatchie (iain-rwt.delete@this.mcclatchie.com) on August 3, 2012 4:35 pm wrote:
> One of the big differences between CPUs and GPUs to me is their physical memory
> architecture.
>
> CPU physical memory architecture:
> CPUs come in an FBGA which
> you mount onto a motherboard with a nonsoldered really expensive socket. The
> DRAM for this system comes in FBGAs which are soldered to DIMMs, which then
> connect to the motherboard via the DIMM socket. It's usually possible to load
> two DIMMs per memory channel, and the CPU provides 1 clock pair per 8 DQs, and
> the CPU knows how to deal with registers between the CPU outputs and the DRAM
> chips. The pin data rate is something like 1 Gb/s/pin.
>
> This is good for
> configuring the system memory after the motherboard has been soldered together.
> This is bad for memory power dissipation (DQs are actively terminated and
> terminating 2 DRAM drops per CPU DQ pin consumes really large amounts of
> power).

Do you have any idea how much power DQ termination uses?

> GPU physical memory architecture:
> GPU comes in an FBGA which is
> soldered to the same board as the DRAM FBGAs. DQs are point-to-point with just
> two solder balls near the ends of the line. GPU provides 1 clock pair per 16
> DQs. The pin data rate is something like 4 Gb/s/pin.
>
> This is good for high
> bandwidth and low power, but it means you configure the memory when you solder
> everything down.

I've been told by folks who design both DDR3 and GDDR5 memory controllers that the latter is noticeably more power efficient when measured by pJ/bit. I suspect it is for many of the reasons you have outlined.
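
To make the pJ/bit metric concrete, here is a minimal sketch of how energy per bit turns into interface power at a given sustained bandwidth. The function name and the example figures (25.6 GB/s, 10 pJ/bit) are placeholders I picked for illustration, not measurements of any DDR3 or GDDR5 part.

```python
def io_power_watts(bandwidth_gbytes_per_s, pj_per_bit):
    """Interface power implied by a sustained bandwidth and an energy cost per bit."""
    bits_per_s = bandwidth_gbytes_per_s * 1e9 * 8   # GB/s -> bits/s
    return bits_per_s * pj_per_bit * 1e-12          # pJ -> J, so the result is in watts


# Hypothetical inputs only: 25.6 GB/s at an assumed 10 pJ/bit is roughly 2 W.
print(io_power_watts(25.6, 10))
```

Power scales linearly with both terms, so halving the pJ/bit at the same bandwidth halves the interface power.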

Of course, the catch is that GDDR5 latency is pretty awful. Part of that is architectural, but I suspect part is related to the things that make GDDR5 so energy efficient.

> My proposal:
> For many years now, it has seemed to me that
> CPUs should be sold as GPUs are sold, soldered onto little boards with their
> DRAM, with one-to-one data pins between CPU and DRAM. Any given CPU core/speed
> might be offered with 2 different memory loads. For example, you might be able
> to buy a 3 GHz, 4 core CPU with 8 GB or 16 GB of DRAM as a unit. This would
> double the number of SKUs shipped by CPU board manufacturers. The CPU board
> would plug into the motherboard as GPUs do now, and it's conceivable that you
> might be able to select between plugging in CPUs and GPUs.
>
> A 16 GB load of 2
> Gb chips is 72 DRAM chips, which can be implemented one-to-one with x8 DRAMs and
> 576 data pins. Obviously some of the lines will have to be somewhat long (7-8
> cm?), but I don't think that requires active termination. 32 GB/CPU package and
> larger configs would require x4 chips and buffering, and perhaps chip stacking
> for the really large memory loads.
>
> My guess is that GPUs (and their memories)
> burn much less IO power per unit of data bandwidth than CPUs. This proposal would
> bring CPUs up to par, and eliminate most of the expensive CPU and DIMM
> connections in the system, increasing system reliability and decreasing
> cost.
>
> From a business point-of-view, the combined product encapsulates quite
> a bit more of the high-cost portion of the system. It would lead to a big
> shakeup as DIMM and motherboard manufacturers duke it out to see who ends up
> being good at shipping a high-cost commodity with price-volatile components on
> it.
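
The chip and pin counts in the 16 GB example quoted above can be checked with a short sketch. It assumes the 72-chip figure includes ECC at one extra chip per eight data chips (my reading of the numbers, since 16 GB of 2 Gb chips is 64 chips without it); `dram_load` and its parameters are hypothetical names.

```python
def dram_load(capacity_gbytes, chip_gbits=2, chip_width=8, ecc=True):
    """Return (chip_count, data_pins) for a one-to-one CPU-to-DRAM layout."""
    data_chips = capacity_gbytes * 8 // chip_gbits   # GB -> Gb, divided by Gb per chip
    # Assumption: ECC adds one extra chip for every eight data chips.
    ecc_chips = data_chips // 8 if ecc else 0
    chips = data_chips + ecc_chips
    return chips, chips * chip_width                 # one CPU data pin per DRAM DQ


print(dram_load(16))                # (72, 576): 16 GB of x8 chips
print(dram_load(32, chip_width=4))  # (144, 576): 32 GB of x4 chips
```

Under these assumptions, moving to x4 chips keeps a 32 GB load at the same 576 data pins, at the cost of twice as many DRAM packages.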

I think the other question is how you would handle servers. I think you can argue that the configuration options in modern systems are excessive and that sacrificing a bit of flexibility to improve cost and power is reasonable. But would that kind of memory arrangement scale to servers, where you want ~1TB of memory in the near future?

Or perhaps the real issue is that we need 3 different types of 'memory', one optimized for latency, one for bandwidth and one for capacity.

David