Latency and HPC Workloads

Article: Intel's Near-Threshold Voltage Computing and Applications
By: Michael S (already5chosen.delete@this.yahoo.com), October 8, 2012 5:12 pm
Room: Moderated Discussions
SHK (nomail.delete@this.mail.com) on October 8, 2012 2:42 pm wrote:
> [snip]
>

> First, you will have to convince Intel/AMD to include support for RLDRAM into
> their IMCs. According to my understanding, RLDRAM access protocol is
> substantially different from SDRAM DDR2/DDR3, or any SDRAM for that matter, so
> the required effort is not trivial.
>

>
> Sure, but I was talking about the higher end of the performance scale, not
> commodity systems. Something like the big POWER systems,

Big POWER systems use fully buffered memory. So latency wouldn't be as good as in commodity boxen regardless of the interface you use between the buffer and the memory device.
Besides, I don't know where they do the scheduling - in the controller, i.e. in the POWER chip, or in the buffer itself. If the former, then you face the same problem as with Intel/AMD.

> BlueGene

For previous generations of BlueGene, RLDRAM looks like a nice fit. But the current generation packs quite a few cores into a single die, so they, too, are starting to want state-of-the-art capacity per pin.

> or a vector system like NEC's SX-series. IIRC the SX-6 was available either in
> a "high capacity" DRAM version or with "low latency" FCRAM main memory. Dunno
> if the SX-9 or future NEC systems will have the same config option.
>
>

> Second, RLDRAM would be pretty bad for capacity, unless you opt for some form
> of fully buffered memory, which by itself adds more latency than RLDRAM saves.
> If I am not mistaken, with RLDRAM you can't currently get more than 512 MB per
> 64-bit "channel". That's like going almost a full decade back. With standard
> unbuffered DDR3 you can easily get 8 GB per channel, and with registered DDR3
> up to 96 GB/channel.
>

>
> On micron.com, RLDRAM3 is marked as 576 Mbit density vs. 8 Gbit for commodity
> DDR3, which is kinda disappointing.

And that's only a third of the problem.
Another is the absence of x8 and x4 parts.
And the third is the number of supported ranks on the same data bus. If I am not mistaken, RLDRAM only supports 2 ranks per data bus. DDR3 supports 4 ranks on the same bus with no rate compromises, and at slower rates quite a few more - I don't know exactly how many. Something like 12?
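For a rough sense of how those three factors (device density, device width, ranks per bus) combine into capacity per channel, here is a back-of-the-envelope sketch. The device organizations are my assumptions, loosely modeled on 2012-era parts, not exact configurations:

# Capacity per channel = devices per rank * device data density * ranks.
# Device organizations below are illustrative assumptions, not exact part configs.

def channel_capacity_gb(device_gbit, device_width, bus_width, ranks):
    """Data capacity of one channel, in GB."""
    devices_per_rank = bus_width // device_width
    return devices_per_rank * device_gbit * ranks / 8   # Gbit -> GB

# RLDRAM3: x18 parts with 512 Mbit of data each, 72-bit bus (64 data bits), 2 ranks
print(channel_capacity_gb(0.512, 18, 72, 2))   # ~0.5 GB, i.e. the 512 MB figure above

# DDR3 unbuffered: 4 Gbit x8 parts, one dual-rank UDIMM on a 64-bit channel
print(channel_capacity_gb(4, 8, 64, 2))        # 8 GB

# DDR3 registered: 4 Gbit x4 parts, quad-rank RDIMMs, 3 DIMMs per channel
print(channel_capacity_gb(4, 4, 64, 4) * 3)    # 96 GB

The point being that RLDRAM loses on all three terms of that product at once.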

> Is the capacity-vs-latency tradeoff inevitable, like in caches?

It sounds that way.

> Maybe switching to FLASH+RLDRAM (or similar) would be a better compromise?

Per chip, flash is ~two orders of magnitude away from SDRAM in write bandwidth. And when you start to aggregate many chips on the same bus to improve write bandwidth, you inevitably hurt read latency in the process. And, BTW, for NAND flash the read latency is pretty bad to start with. As to NOR flash, assuming it serves as second-level storage behind a few GBs of RLDRAM "cache", its read latency is good enough, but its write bandwidth is pretty horrible, even by comparison to NAND flash. Besides, right now NOR flash faces major difficulties with scaling to finer silicon geometries.
"Or similar" sounds good, but so far the most promising technologies (PCRAM and MRAM) have not delivered - the former on write bandwidth, the latter on density.

>
> I'm asking because I remember papers from more than 10 years ago on the
> "memory and power wall" but, while there have been a lot of power efficiency
> improvements, the memory wall keeps getting worse, and larger caches, prefetch
> and all that (IMHO!) seem only a palliative solution to the fundamental problem
> of redesigning a more balanced memory-ALU interconnection.