By: Eric (eric.kjellen.delete@this.gmail.com), July 29, 2012 12:08 pm
Room: Moderated Discussions
aaron spink (aaronspink.delete@this.notearthlink.net) on July 29, 2012 5:39 am wrote:
> Also let's not forget that by the time this
> happens we are likely going to see some form of stacked memory in reasonably
> wide use, which means that the CPUs will likely have 1-4 GB of ultra high
> bandwidth "cache". Which if used right would provide enough bandwidth buffer to
> have the I/Os plus having the option to direct route to/from networkMIC would
> make the 102.4 GB/s CPU memory subsystem reasonable.
>
Would it be possible to have separate memory controllers for the stacked memory (presumably several, to provide a very wide bus as on a GPU) and to map part of the coherent CPU memory space to that DRAM? That is, some range of physical addresses would be served by those controllers and offer extreme bandwidth, but limited capacity, for vector processing. Assuming the hardware side is viable, I'm not sure how software and programmers would handle it, but I'd guess that with proper OS support, segments of the virtual address space could be reserved for the physical addresses mapped to the stacked memory. As far as I can tell, that solution would be compatible both with LRB cores integrated in the CPU uncore (as has been speculated for Skylake) and with AVX execution units in the core.
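To make the address-range idea concrete, here is a toy model in Python of how a physical address could be decoded to a memory region and then to one of several interleaved controllers. Everything in it is an assumption for illustration only: the region sizes, controller counts, interleave granularities, and the names `Region`, `MEMORY_MAP` and `decode` are all made up, and no real chip is claimed to work this way.

```python
from dataclasses import dataclass

GiB = 1 << 30


@dataclass(frozen=True)
class Region:
    """A contiguous physical address range served by a group of controllers."""
    name: str
    base: int         # first physical address of the region
    size: int         # region size in bytes
    controllers: int  # number of memory controllers behind this region
    interleave: int   # consecutive bytes sent to one controller before moving on

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size

    def controller_for(self, addr: int) -> int:
        # Round-robin interleaving so streaming accesses spread over all controllers.
        offset = addr - self.base
        return (offset // self.interleave) % self.controllers


# Hypothetical layout: large, narrow DDR followed by small, wide stacked DRAM.
DDR = Region("ddr", base=0, size=64 * GiB, controllers=4, interleave=4096)
STACKED = Region("stacked", base=64 * GiB, size=2 * GiB, controllers=8, interleave=256)
MEMORY_MAP = (DDR, STACKED)


def decode(addr: int) -> tuple[str, int]:
    """Return (region name, controller index) for a physical address."""
    for region in MEMORY_MAP:
        if region.contains(addr):
            return region.name, region.controller_for(addr)
    raise ValueError(f"address {addr:#x} is not backed by any memory")


if __name__ == "__main__":
    for addr in (0, 5 * 4096, 64 * GiB, 64 * GiB + 256):
        print(f"{addr:#014x} -> {decode(addr)}")
```

In this sketch the OS part of the question would amount to deciding which virtual pages get backed by physical pages inside the `stacked` range; the model only covers the hardware-side decode.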
Topic | Posted By | Date |
---|---|---|
New Article: Compute Efficiency 2012 | David Kanter | 2012/07/25 12:37 AM |
New Article: Compute Efficiency 2012 | SHK | 2012/07/25 01:31 AM |
New Article: Compute Efficiency 2012 | David Kanter | 2012/07/25 01:42 AM |
New Article: Compute Efficiency 2012 | none | 2012/07/25 02:18 AM |
New Article: Compute Efficiency 2012 | David Kanter | 2012/07/25 10:25 AM |
GCN (NT) | EBFE | 2012/07/25 02:25 AM |
GCN - TFLOP DP | jp | 2012/08/09 12:58 PM |
GCN - TFLOP DP | David Kanter | 2012/08/09 02:32 PM |
GCN - TFLOP DP | Kevin G | 2012/08/11 04:22 PM |
GCN - TFLOP DP | Eric | 2012/08/09 04:12 PM |
GCN - TFLOP DP | jp | 2012/08/10 12:23 AM |
GCN - TFLOP DP | EBFE | 2012/08/12 07:27 PM |
GCN - TFLOP DP | jp | 2012/08/13 01:02 AM |
GCN - TFLOP DP | EBFE | 2012/08/13 06:45 PM |
GCN - TFLOP DP | jp | 2012/08/14 12:21 AM |
New Article: Compute Efficiency 2012 | Adrian | 2012/07/25 03:39 AM |
New Article: Compute Efficiency 2012 | EBFE | 2012/07/25 08:33 AM |
New Article: Compute Efficiency 2012 | David Kanter | 2012/07/25 10:11 AM |
New Article: Compute Efficiency 2012 | sf | 2012/07/25 05:46 AM |
New Article: Compute Efficiency 2012 | aaron spink | 2012/07/25 08:08 AM |
New Article: Compute Efficiency 2012 | someone | 2012/07/25 09:06 AM |
New Article: Compute Efficiency 2012 | David Kanter | 2012/07/25 10:14 AM |
New Article: Compute Efficiency 2012 | EBFE | 2012/07/26 01:27 AM |
BG/Q | David Kanter | 2012/07/26 08:31 AM |
VR-ZONE KNC B0 leak, poor number? | EBFE | 2012/08/03 12:57 AM |
VR-ZONE KNC B0 leak, poor number? | Eric | 2012/08/03 06:59 AM |
VR-ZONE KNC B0 leak, poor number? | EBFE | 2012/08/04 05:37 AM |
VR-ZONE KNC B0 leak, poor number? | aaron spink | 2012/08/04 05:51 PM |
Leaks != products | David Kanter | 2012/08/05 02:19 AM |
Leaks != products | EBFE | 2012/08/06 01:49 AM |
VR-ZONE KNC B0 leak, poor number? | Eric | 2012/08/05 09:37 AM |
VR-ZONE KNC B0 leak, poor number? | EBFE | 2012/08/06 02:09 AM |
VR-ZONE KNC B0 leak, poor number? | aaron spink | 2012/08/06 03:33 AM |
VR-ZONE KNC B0 leak, poor number? | jp | 2012/08/07 02:08 AM |
VR-ZONE KNC B0 leak, poor number? | Eric | 2012/08/07 03:58 AM |
VR-ZONE KNC B0 leak, poor number? | jp | 2012/08/07 04:17 AM |
VR-ZONE KNC B0 leak, poor number? | Eric | 2012/08/07 04:22 AM |
VR-ZONE KNC B0 leak, poor number? | anonymou5 | 2012/08/07 08:43 AM |
VR-ZONE KNC B0 leak, poor number? | jp | 2012/08/07 04:23 AM |
VR-ZONE KNC B0 leak, poor number? | aaron spink | 2012/08/07 06:24 AM |
VR-ZONE KNC B0 leak, poor number? | aaron spink | 2012/08/07 06:20 AM |
VR-ZONE KNC B0 leak, poor number? | jp | 2012/08/07 10:22 AM |
VR-ZONE KNC B0 leak, poor number? | EduardoS | 2012/08/07 02:15 PM |
KNC has FMA | David Kanter | 2012/08/07 08:17 AM |
New Article: Compute Efficiency 2012 | forestlaughing | 2012/07/25 07:51 AM |
New Article: Compute Efficiency 2012 | Eric | 2012/07/27 04:12 AM |
New Article: Compute Efficiency 2012 | hobold | 2012/07/27 10:53 AM |
New Article: Compute Efficiency 2012 | Eric | 2012/07/27 11:51 AM |
New Article: Compute Efficiency 2012 | hobold | 2012/07/27 01:48 PM |
New Article: Compute Efficiency 2012 | Eric | 2012/07/27 02:29 PM |
New Article: Compute Efficiency 2012 | anon | 2012/07/29 01:25 AM |
New Article: Compute Efficiency 2012 | hobold | 2012/07/29 10:53 AM |
Efficiency? No, lack of highly useful features | someone | 2012/07/25 08:58 AM |
Best case for GPUs | David Kanter | 2012/07/25 10:28 AM |
Best case for GPUs | franzliszt | 2012/07/25 12:39 PM |
Best case for GPUs | Chuck | 2012/07/25 07:13 PM |
Best case for GPUs | David Kanter | 2012/07/25 08:45 PM |
Best case for GPUs | Eric | 2012/07/27 04:51 AM |
Silverthorn data point | Michael S | 2012/07/25 01:45 PM |
Silverthorn data point | David Kanter | 2012/07/25 03:06 PM |
New Article: Compute Efficiency 2012 | Unununium | 2012/07/25 04:55 PM |
New Article: Compute Efficiency 2012 | EduardoS | 2012/07/25 07:12 PM |
Ops... I'm wrong... | EduardoS | 2012/07/25 07:14 PM |
New Article: Compute Efficiency 2012 | TacoBell | 2012/07/25 07:36 PM |
New Article: Compute Efficiency 2012 | David Kanter | 2012/07/25 08:49 PM |
New Article: Compute Efficiency 2012 | Michael S | 2012/07/26 01:33 AM |
Line and factor | Moritz | 2012/07/26 12:34 AM |
Line and factor | Peter Boyle | 2012/07/27 06:57 AM |
not entirely | Moritz | 2012/07/27 11:22 AM |
Line and factor | EduardoS | 2012/07/27 04:24 PM |
Line and factor | Moritz | 2012/07/28 11:52 AM |
tables | Michael S | 2012/07/26 01:39 AM |
Interlagos L2+L3 | Rana | 2012/07/26 02:13 AM |
Interlagos L2+L3 | Rana | 2012/07/26 02:13 AM |
Interlagos L2+L3 | David Kanter | 2012/07/26 08:21 AM |
SP vs DP & performance metrics | jp | 2012/07/27 06:08 AM |
SP vs DP & performance metrics | Eric | 2012/07/27 06:57 AM |
SP vs DP & performance metrics | jp | 2012/07/27 08:18 AM |
SP vs DP & performance metrics | aaron spink | 2012/07/27 08:36 AM |
SP vs DP & performance metrics | jp | 2012/07/27 08:47 AM |
"Global" --> system | Paul A. Clayton | 2012/07/27 09:31 AM |
"Global" --> system | jp | 2012/07/27 02:55 PM |
"Global" --> system | aaron spink | 2012/07/27 06:33 PM |
"Global" --> system | jp | 2012/07/28 01:00 AM |
"Global" --> system | aaron spink | 2012/07/28 05:54 AM |
"Global" --> system | jp | 2012/07/29 01:12 AM |
"Global" --> system | aaron spink | 2012/07/29 04:03 AM |
"Global" --> system | none | 2012/07/29 08:05 AM |
"Global" --> system | EduardoS | 2012/07/29 09:26 AM |
"Global" --> system | jp | 2012/07/30 01:24 AM |
"Global" --> system | aaron spink | 2012/07/30 02:05 AM |
"Global" --> system | aaron spink | 2012/07/30 02:03 AM |
daxpy is STREAM TRIAD | Paul A. Clayton | 2012/07/30 05:10 AM |
SP vs DP & performance metrics | aaron spink | 2012/07/27 06:25 PM |
SP vs DP & performance metrics | Emil Briggs | 2012/07/28 05:40 AM |
SP vs DP & performance metrics | aaron spink | 2012/07/28 06:05 AM |
SP vs DP & performance metrics | jp | 2012/07/28 10:04 AM |
SP vs DP & performance metrics | Brett | 2012/07/28 02:32 PM |
SP vs DP & performance metrics | Emil Briggs | 2012/07/28 05:11 PM |
SP vs DP & performance metrics | anon | 2012/07/29 01:53 AM |
SP vs DP & performance metrics | aaron spink | 2012/07/29 04:39 AM |
Coherency for discretes | Rohit | 2012/07/29 08:24 AM |
SP vs DP & performance metrics | anon | 2012/07/29 10:09 AM |
SP vs DP & performance metrics | Eric | 2012/07/29 12:08 PM |
SP vs DP & performance metrics | aaron spink | 2012/07/27 08:25 AM |
Regular updates? | Joe | 2012/07/27 08:35 AM |
New Article: Compute Efficiency 2012 | 309 | 2012/07/27 09:34 PM |
New Article: Compute Efficiency 2012 | Ingeneer | 2012/07/30 08:01 AM |
New Article: Compute Efficiency 2012 | David Kanter | 2012/07/30 12:11 PM |
New Article: Compute Efficiency 2012 | Ingeneer | 2012/07/30 07:04 PM |
New Article: Compute Efficiency 2012 | David Kanter | 2012/07/30 08:32 PM |
Memory power and bandwidth? | Iain McClatchie | 2012/08/03 03:35 PM |
Memory power and bandwidth? | David Kanter | 2012/08/04 10:22 AM |
Memory power and bandwidth? | Michael S | 2012/08/04 01:36 PM |
Memory power and bandwidth? | Iain McClatchie | 2012/08/06 01:09 PM |
Memory power and bandwidth? | Eric | 2012/08/07 05:28 PM |
Workloads | David Kanter | 2012/08/08 09:49 AM |
Workloads | Eric | 2012/08/09 04:21 PM |
Latency and bandwidth bottlenecks | Paul A. Clayton | 2012/08/08 03:02 PM |
Latency and bandwidth bottlenecks | Eric | 2012/08/09 04:32 PM |
Latency and bandwidth bottlenecks | none | 2012/08/10 05:06 AM |
Latency and bandwidth bottlenecks -> BDP | ajensen | 2012/08/11 02:21 PM |
Memory power and bandwidth? | Ingeneer | 2012/08/06 10:26 AM |
NV aims for 1.8+ TFLOPS DP ? | jp | 2012/08/11 12:21 PM |
NV aims for 1.8+ TFLOPS DP ? | David Kanter | 2012/08/11 08:25 PM |
NV aims for 1.8+ TFLOPS DP ? | jp | 2012/08/12 01:45 AM |
NV aims for 1.8+ TFLOPS DP ? | EBFE | 2012/08/12 09:02 PM |
NV aims for 1.8+ TFLOPS DP ? | jp | 2012/08/13 12:54 AM |
NV aims for 1.8+ TFLOPS DP ? | Gabriele Svelto | 2012/08/13 08:16 AM |
NV aims for 1.8+ TFLOPS DP ? | Vincent Diepeveen | 2012/08/14 02:04 AM |
NV aims for 1.8+ TFLOPS DP ? | David Kanter | 2012/08/13 08:50 AM |
NV aims for 1.8+ TFLOPS DP ? | jp | 2012/08/13 10:17 AM |
NV aims for 1.8+ TFLOPS DP ? | EduardoS | 2012/08/13 05:45 AM |