By: Iain McClatchie (iain-rwt.delete@this.mcclatchie.com), August 3, 2012 4:35 pm
Room: Moderated Discussions
To me, one of the big differences between CPUs and GPUs is their physical memory architecture.
CPU physical memory architecture:
CPUs come in an FBGA which you mount onto a motherboard with a really expensive, non-soldered socket. The DRAM for this system comes in FBGAs which are soldered to DIMMs, which then connect to the motherboard via the DIMM socket. It's usually possible to load two DIMMs per memory channel. The CPU provides 1 clock pair per 8 DQs, and it knows how to deal with registers between its outputs and the DRAM chips. The pin data rate is something like 1 Gb/s/pin.
This is good for configuring the system memory after the motherboard has been soldered together. This is bad for memory power dissipation (DQs are actively terminated and terminating 2 DRAM drops per CPU DQ pin consumes really large amounts of power).
GPU physical memory architecture:
The GPU comes in an FBGA which is soldered to the same board as the DRAM FBGAs. DQs are point-to-point with just two solder balls near the ends of the line. The GPU provides 1 clock pair per 16 DQs. The pin data rate is something like 4 Gb/s/pin.
This is good for high bandwidth and low power, but it means you configure the memory when you solder everything down.
My proposal:
For many years now, it has seemed to me that CPUs should be sold as GPUs are sold, soldered onto little boards with their DRAM, with one-to-one data pins between CPU and DRAM. Any given CPU core/speed might be offered with 2 different memory loads. For example, you might be able to buy a 3 GHz, 4 core CPU with 8 GB or 16 GB of DRAM as a unit. This would double the number of SKUs shipped by CPU board manufacturers. The CPU board would plug into the motherboard as GPUs do now, and it's conceivable that you might be able to select between plugging in CPUs and GPUs.
A 16 GB load of 2 Gb chips is 72 DRAM chips (64 for data, plus 8 more if you count ECC), which can be implemented one-to-one with x8 DRAMs and 576 data pins. Obviously some of the lines will have to be somewhat long (7-8 cm?), but I don't think that requires active termination. 32 GB/CPU package and larger configs would require x4 chips and buffering, and perhaps chip stacking for the really large memory loads.
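The chip and pin counts above can be checked with a few lines of arithmetic. This is only a sketch: the function name `dram_load` is made up for illustration, and it assumes the 72-chip figure comes from adding one ECC chip per eight data chips, which the post does not state outright.

```python
def dram_load(capacity_gbytes, chip_gbits, chip_width, ecc=True):
    """Return (chip_count, data_pins) for a one-to-one CPU/DRAM layout.

    capacity_gbytes: usable memory in GB
    chip_gbits:      density of each DRAM chip in Gb
    chip_width:      DQ pins per chip (8 for x8, 4 for x4)
    """
    # 8 bits per byte, so capacity in Gb divided by chip density.
    data_chips = capacity_gbytes * 8 // chip_gbits
    # Assumption: ECC adds one extra chip per eight data chips.
    ecc_chips = data_chips // 8 if ecc else 0
    chips = data_chips + ecc_chips
    # One-to-one wiring: every DQ on every chip gets its own CPU pin.
    return chips, chips * chip_width


chips, pins = dram_load(16, 2, 8)
print(chips, pins)
```

Under the same assumptions, `dram_load(32, 2, 4)` gives 144 chips on the same 576 pins, which is one way to see why the larger configs would want x4 parts.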
My guess is that GPUs (and their memories) burn much less IO power per unit of data bandwidth than CPUs. This proposal would bring CPUs up to par, and eliminate most of the expensive CPU and DIMM connections in the system, increasing system reliability and decreasing cost.
From a business point-of-view, the combined product encapsulates quite a bit more of the high-cost portion of the system. It would lead to a big shakeup as DIMM and motherboard manufacturers duke it out to see who ends up being good at shipping a high-cost commodity with price-volatile components on it.



