By: Doug S (foo.delete@this.bar.bar), January 30, 2013 12:38 pm
Room: Moderated Discussions
David Kanter (dkanter.delete@this.realworldtech.com) on January 30, 2013 12:12 am wrote:
> Incidentally, that's one area where most ARM guys are far behind. If you look at
> the work that Intel, AMD, etc. have to do to get multiple DIMMs working for servers,
> it's incredible. The phone guys totally punt on that (and for good reason).
>
> It'll be interesting to see how Samsung's efforts turn out, in theory
> they should have a big advantage when it comes to memory controllers.
Doesn't DDR4 mandate only one DIMM per channel? I thought it expanded the number of ranks (via stacking), but the tradeoff was a single DIMM per channel unless you use some type of on-board switch. I don't know nearly enough about DRAM to judge the tradeoff between the complexity of multiple DIMMs per channel and the complexity of more ranks per DIMM, not to mention a switch. I mention DDR4 because it probably makes more sense to use a DDR4 controller in 64-bit ARM SoC designs, unless they're targeted at reaching the market in the next 24 months (in which case they'd probably need to tape out this summer).
Another alternative for SoC vendors would be soldering the memory to the board. That would increase memory performance, with the tradeoff being the inability to upgrade or repair and having to maintain several SKUs (you wouldn't need dozens; you'd just offer 3-5 power-of-two steps from very small to very large). Possibly the CPU could be soldered as well, though that would multiply the SKUs by the number of speed/core variations. If there were only a few it might work, since you wouldn't need to offer every possible combination of CPU and memory - see the rough arithmetic sketched below. Before you dismiss the idea of soldering the memory as impractical, your article title does suggest specialization - for some applications, such as cloud computing or web serving, these limitations may be worth the tradeoff. It might make repair impossible, but it would make some types of failure less likely.
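A back-of-the-envelope sketch of that SKU arithmetic. All the numbers here (4 memory sizes, 3 CPU bins, which pairings get offered) are made up purely for illustration; nothing above specifies them:

/* Rough sketch of the SKU-count point above -- illustrative numbers only. */
#include <stdio.h>

#define N_MEM 4
#define N_CPU 3

int main(void)
{
    const int   mem_gb[N_MEM]  = { 8, 16, 32, 64 };               /* power-of-two sizes */
    const char *cpu_bin[N_CPU] = { "4-core", "8-core", "8-core fast" };

    /* Worst case: every CPU bin paired with every memory size. */
    printf("full matrix: %d board SKUs\n", N_MEM * N_CPU);

    /* More realistic: only pair each CPU bin with the memory sizes that
     * make sense for it, so you don't offer every possible combination. */
    int offered = 0;
    for (int c = 0; c < N_CPU; c++) {
        for (int m = 0; m < N_MEM; m++) {
            if (m >= c) {                  /* bigger CPU bins only with >= memory */
                printf("SKU: %s / %d GB\n", cpu_bin[c], mem_gb[m]);
                offered++;
            }
        }
    }
    printf("trimmed lineup: %d board SKUs\n", offered);
    return 0;
}

With those made-up numbers the full matrix is 12 SKUs but the trimmed lineup is 9, and a vendor would presumably prune harder than that.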
The server wouldn't need all that much in the way of smarts: keep a map in flash of ECC error locations (both soft and hard) that the bootstrap and OS could access to automatically map out bad areas, bad chips (too many ECC errors, or complete failure), and bad CPUs, so that the server could continue to run in a slightly degraded fashion. Something along the lines of the sketch below.
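A purely illustrative sketch of that idea, assuming nothing about any real firmware or OS interface. The record layout, the soft-error threshold, and the names (ecc_fault_record, page_should_be_retired) are all hypothetical:

/* Hypothetical sketch: a flash-backed table of ECC error locations that the
 * bootstrap/OS reads at boot to retire bad pages and keep running degraded. */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096u

enum fault_kind { FAULT_SOFT = 0, FAULT_HARD = 1 };

struct ecc_fault_record {
    uint64_t phys_addr;   /* physical address where the ECC error was logged */
    uint32_t count;       /* how many times it has been observed             */
    uint8_t  kind;        /* FAULT_SOFT (corrected) or FAULT_HARD            */
};

/* Retire a page on any hard fault, or once soft errors pass a threshold. */
static int page_should_be_retired(const struct ecc_fault_record *r)
{
    return r->kind == FAULT_HARD || r->count >= 8;
}

int main(void)
{
    /* In a real system this table would be read back from flash at boot. */
    const struct ecc_fault_record log[] = {
        { 0x000000004a3f1000ull,  1, FAULT_SOFT },
        { 0x00000000b2d40000ull, 12, FAULT_SOFT },
        { 0x00000001003c8000ull,  1, FAULT_HARD },
    };

    for (size_t i = 0; i < sizeof log / sizeof log[0]; i++) {
        uint64_t page = log[i].phys_addr & ~(uint64_t)(PAGE_SIZE - 1);
        if (page_should_be_retired(&log[i]))
            printf("retire page 0x%llx (exclude it from the memory map)\n",
                   (unsigned long long)page);
    }
    return 0;
}

The same table could record per-chip or per-CPU failure counts so the boot code could drop a whole device, not just individual pages.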