By: Maynard Handley (name99.delete@this.name99.org), August 27, 2014 1:34 pm
Room: Moderated Discussions
I like a good fact-free fight about how x86 [sux|roolz] as much as anyone, but how about we turn aside from that for a few minutes to consider one particular set of datapoints?
To recap:
The argument made (enthusiastically by juanrga, more moderately by others) is that the ARM-64 server CPUs which we should see from various vendors will do well because they will offer a compelling performance/power advantage over their x64 competitors.
The argument made by me is a variant of this which places substantially more stress on business issues --- the costs of the CPUs/SOCs to design and then manufacture, the costs for which they can be and are sold, the worry that (so far negatively profitable) Atoms will cannibalize Intel's higher end if they are improved too much.
With this background, there is a fairly large review of NAS units at AnandTech today:
http://www.anandtech.com/show/8404/seagate-intel-rangeley-nas-pro-4bay-review
This is interesting to me in the context of this argument because my argument has long been that ARM will get its start in servers at the low-end, in things that many don't want to call servers, like NAS units.
The contenders are basically
QNAP
Seagate NAS Pro
Synology
all modern Atom based
Asustor
old gen (Bonnell) Atom based
Lenovo
Seagate NAS
WD
all ARMv7 based, various Marvell SoCs
Looking at the benchmarks, the results are all over the map. The modern Atom designs are pretty consistently (though not uniformly) the winners; WD is pretty uniformly the loser. Lenovo seems about the best of the Marvell designs.
The Seagate NAS Pro costs around $500 (but for some reason is $550 at Amazon?)
The Lenovo is (supposedly) $270. The Seagate NAS is $300 (but again $360 at Amazon).
The messages *I* would take away from all this are that:
- Today's Atom offers higher performance (at least for this task) than today's ARM contenders in this space. I don't think anyone is arguing otherwise.
- Today's Atom costs you around $200 more/1.5x to 2x more for that extra performance (which, depending on your needs, may well be worth paying)
- If we're willing to look beyond today (something which some people apparently consider unacceptable, except when talking about Skylake or Cannonlake or Broadwell-EP!) one would expect next year's ARM SOCs in this space to at least match today's Atom performance at probably the same price as today --- maybe the devices will bump up in price by $20 or so. This is of course not guaranteed, either performance or price, but seems a perfectly reasonable extrapolation.
Today's Atom also draws much the same power. (AnandTech writes "Comparing these numbers with that of the other four-bay NAS units, we find that the power consumption is actually lower than that of even ARM-based units such as the LenovoEMC ix4-300d and Western Digital EX4." but I thought a more apples-to-apples comparison was with the Seagate NAS, and the numbers are pretty much identical for both platforms.) Presumably at least some of the IO that's off chip on at least some of the ARM SoCs will move on-chip next year as part of the usual march of progress, which will help keep ARM power competitive.
- Substantially more controversial is the question of what next year's Atom for this space will look like. In the one corner we have claims of a device manufactured on 14nm, perhaps boosted to 4 cores but without a bump in price, and heck, why not throw in that the memory controller, uncore and the entire IO system are transplanted from Xeon and can kick anyone else's ass?
In the other corner we have a history of only minor improvements in Atom between seriously new micro-architectures (and one of those is not expected next year), and the business logic that would make it undesirable for Intel to move low-cost Atom to 14nm too fast, or to kit it out with too performant an uncore.
So, if you want to argue about how this plays out this time next year, that's where I'd concentrate my fire --- on how improved Atom will/will not be in twelve months.
- Confusing the whole issue (but a fact of life) is the substantial disparity in performance between what you'd expect to be much the same SoCs. The ARM cores all come from Marvell, but the details differ: the WD device is single core at 2GHz, Lenovo is dual core at 1.3GHz, Seagate NAS is single core at 1.2GHz. I didn't bother to learn which particular ARMv7 CPU each uses. The uncores and IO are probably also fairly different...
There's likely also some tweaking that each vendor has done to the underlying Linux/(SAMBA?) code base to boost some operations over others (at least that's my interpretation of why we get so much rearrangement in the ordering of the devices between different benchmarks).
But I don't see anything here that suggests that native x86 is an insurmountable advantage through, eg, better gcc, Linux, SMB support. I'm sure the underlying code bases are far more tweaked for x86, and that that tweaking is worth maybe 10% or so, but ARMv7 appears to do about as well as I'd expect even with that advantage.
Next year, with newer designs, ARMv8, and an additional year of work by Linaro and LLVM, I expect that advantage to have shrunk. (For example, depending on how fast Linaro get things working, the next wave of devices may be able to run on 16kiB pages, and just that by itself may be worth 5% or so. I've mentioned before that Apple told developers last year to prepare for a switch to 16kiB pages on ARMv8. If they switch that on for iOS8, it will be interesting to see how much that, by itself, speeds up otherwise unmodified apps.)
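For anyone who wants to check the "around $200 more/1.5x to 2x" figure, here is a minimal sketch of the arithmetic using only the prices quoted above. The function and variable names are my own, made up for illustration; QNAP, Synology, Asustor and WD are left out because no prices are given for them here.

```python
# Prices in USD exactly as quoted in the post; nothing here is measured or looked up.
LIST_PRICES = {"Seagate NAS Pro": 500, "Lenovo": 270, "Seagate NAS": 300}
# Amazon prices are only quoted for the two Seagate units.
AMAZON_PRICES = {"Seagate NAS Pro": 550, "Seagate NAS": 360}


def premium(atom_price, arm_price):
    """Return (dollar difference, price ratio) of the Atom unit over the ARM unit."""
    return atom_price - arm_price, atom_price / arm_price


def compare(prices, atom="Seagate NAS Pro"):
    """Map each ARM unit in `prices` to its (difference, ratio) versus the Atom unit."""
    return {
        name: premium(prices[atom], price)
        for name, price in prices.items()
        if name != atom
    }


if __name__ == "__main__":
    for label, prices in (("list", LIST_PRICES), ("Amazon", AMAZON_PRICES)):
        for name, (diff, ratio) in compare(prices).items():
            print(f"{label}: Seagate NAS Pro vs {name}: +${diff}, {ratio:.2f}x")
```

On the quoted numbers the Atom premium comes out between $190 and $230, or roughly 1.5x to 1.9x, which is in line with the rough figure in the takeaways.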