Memory power and bandwidth?

Article: Computational Efficiency for CPUs and GPUs in 2012
By: David Kanter (dkanter.delete@this.realworldtech.com), August 4, 2012 10:22 am
Room: Moderated Discussions
Iain McClatchie (iain-rwt.delete@this.mcclatchie.com) on August 3, 2012 4:35 pm wrote:
> One of the big differences between CPUs and GPUs to me is their physical memory
> architecture.
>
> CPU physical memory architecture:
> CPUs come in an FBGA which
> you mount onto a motherboard with a nonsoldered really expensive socket. The
> DRAM for this system comes in FBGAs which are soldered to DIMMs, which then
> connect to the motherboard via the DIMM socket. It's usually possible to load
> two DIMMs per memory channel, and the CPU provides 1 clock pair per 8 DQs, and
> the CPU knows how to deal with registers between the CPU outputs and the DRAM
> chips. The pin data rate is something like 1 Gb/s/pin.
>
> This is good for
> configuring the system memory after the motherboard has been soldered together.
> This is bad for memory power dissipation (DQs are actively terminated and
> terminating 2 DRAM drops per CPU DQ pin consumes really large amounts of
> power).

Do you have any idea how much power DQ termination uses?

> GPU physical memory architecture:
> GPU comes in an FBGA which is
> soldered to the same board as the DRAM FBGAs. DQs are point-to-point with just
> two solder balls near the ends of the line. GPU provides 1 clock pair per 16
> DQs. The pin data rate is something like 4 Gb/s/pin.
>
> This is good for high
> bandwidth and low power, but it means you configure the memory when you solder
> everything down.

I've been told by folks who design both DDR3 and GDDR5 memory controllers that the latter is noticeably more power efficient when measured by pJ/bit. I suspect it is for many of the reasons you have outlined.
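
To make the pJ/bit comparison concrete, here is a minimal sketch of how energy per bit turns into interface power. The pin rates are the rough figures from the quoted post. The 64-pin width and both pJ/bit values are placeholders I made up for illustration, not measurements of any real controller.

```python
def io_power_watts(pins: int, gbit_per_s_per_pin: float, pj_per_bit: float) -> float:
    """Interface power in watts for a bus running at its full pin rate."""
    bits_per_second = pins * gbit_per_s_per_pin * 1e9
    # 1 pJ = 1e-12 J, so bits/s * pJ/bit * 1e-12 gives J/s = W.
    return bits_per_second * pj_per_bit * 1e-12


# ASSUMPTIONS: 64 data pins and the pJ/bit numbers are hypothetical placeholders.
ddr3_like = io_power_watts(pins=64, gbit_per_s_per_pin=1.0, pj_per_bit=20.0)
gddr5_like = io_power_watts(pins=64, gbit_per_s_per_pin=4.0, pj_per_bit=10.0)

print(f"DDR3-like:  {ddr3_like:.2f} W for 64 Gb/s")
print(f"GDDR5-like: {gddr5_like:.2f} W for 256 Gb/s")
```

The point of the sketch is only that a lower pJ/bit interface can still burn more total power if it moves several times more bits, so comparisons should be made per bit or at equal bandwidth.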

Of course, the catch is that GDDR5 latency is pretty awful. Part of that is architectural, but I suspect part is related to the things that make GDDR5 so energy efficient.

> My proposal:
> For many years now, it has seemed to me that
> CPUs should be sold as GPUs are sold, soldered onto little boards with their
> DRAM, with one-to-one data pins between CPU and DRAM. Any given CPU core/speed
> might be offered with 2 different memory loads. For example, you might be able
> to buy a 3 GHz, 4 core CPU with 8 GB or 16 GB of DRAM as a unit. This would
> double the number of SKUs shipped by CPU board manufacturers. The CPU board
> would plug into the motherboard as GPUs do now, and it's conceivable that you
> might be able to select between plugging in CPUs and GPUs.
>
> A 16 GB load of 2
> Gb chips is 72 DRAM chips, which can be implemented one-to-one with x8 DRAMs and
> 576 data pins. Obviously some of the lines will have to be somewhat long (7-8
> cm?), but I don't think that requires active termination. 32 GB/CPU package and
> larger configs would require x4 chips and buffering, and perhaps chip stacking
> for the really large memory loads.
>
> My guess is that GPUs (and their memories)
> burn much less IO power per unit of data bandwidth than CPUs. This proposal would
> bring CPUs up to par, and eliminate most of the expensive CPU and DIMM
> connections in the system, increasing system reliability and decreasing
> cost.
>
> From a business point-of-view, the combined product encapsulates quite
> a bit more of the high-cost portion of the system. It would lead to a big
> shakeup as DIMM and motherboard manufacturers duke it out to see who ends up
> being good at shipping a high-cost commodity with price-volatile components on
> it.

I think the other question is how you would handle servers. You can argue that the configuration options in modern systems are excessive and that sacrificing a bit of flexibility to improve cost/power is reasonable. But would that kind of memory arrangement scale to servers, where you will want ~1TB of memory in the near future?
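
For a sense of scale, the chip and pin arithmetic from the quoted proposal can be reproduced and extended to 1TB. This is a rough sketch: I am assuming the 72-chip figure means 64 data chips plus 8 ECC chips (the post does not say so explicitly), and the function names are my own.

```python
def dram_chips(capacity_gbyte: int, chip_gbit: int = 2, ecc: bool = True) -> int:
    """Number of DRAM chips needed for a given capacity."""
    data_chips = capacity_gbyte * 8 // chip_gbit
    if ecc:
        # ASSUMPTION: 8 check bits per 64 data bits, i.e. one extra chip per 8.
        return data_chips + data_chips // 8
    return data_chips


def data_pins(chips: int, width: int = 8) -> int:
    """Point-to-point data pins if every chip gets its own x`width` connection."""
    return chips * width


for gbyte in (16, 1024):
    chips = dram_chips(gbyte)
    print(f"{gbyte} GB: {chips} chips, {data_pins(chips)} data pins")
```

With 2 Gb x8 parts this gives 72 chips and 576 pins for 16 GB, matching the quoted numbers, but 4608 chips and 36864 pins for 1TB, which is clearly not something you wire one-to-one to a single package.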

Or perhaps the real issue is that we need 3 different types of 'memory', one optimized for latency, one for bandwidth and one for capacity.

David