By: Ted J. (ted.delete@this.no.spam), July 8, 2013 4:06 pm
Room: Moderated Discussions
Varun wrote:
> do you see Intel using this technology in their Xeon line to compete with the massive LLCs on Oracle and IBM parts?
The current price of a 2.7 GHz Sandy Bridge-E with 20 MBytes of L3 cache is $1700 (dual socket version) and $3600 (quad socket version). Adding 2 Crystalwell chips to each socket of the dual socket Haswell-E and 4 Crystalwell chips to each socket of the quad socket Haswell-E would increase the processor price by around 10%. I think even a 3% performance increase could justify this price increase because of the cost of DRAM DIMMs, SSDs, hard drives, electricity, software and datacenter space.
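To put the "3% could justify 10%" reasoning into numbers, here is a minimal Python sketch. It assumes the break-even point is where performance per dollar of the whole system stays constant. The function name and the `rest_of_system_usd` figure are hypothetical placeholders for DRAM, storage, software and so on; they are not numbers from this post.

```python
def break_even_speedup(cpu_price_usd, sockets, cpu_price_increase, rest_of_system_usd):
    """Speedup at which performance per dollar of the whole system is unchanged.

    cpu_price_increase is a fraction, e.g. 0.10 for a 10% processor price adder.
    rest_of_system_usd is everything that is not a processor (hypothetical input).
    """
    old_total = sockets * cpu_price_usd + rest_of_system_usd
    new_total = sockets * cpu_price_usd * (1 + cpu_price_increase) + rest_of_system_usd
    return new_total / old_total - 1


# Dual socket at the $1700 price quoted above, 10% adder,
# and a made-up $10,000 for the rest of the system.
example = break_even_speedup(1700, 2, 0.10, 10_000)
print(f"break-even speedup: {example:.2%}")
```

The more the rest of the system costs relative to the processors, the smaller the speedup needed to pay for the L4 cache.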
Some applications wouldn't see any benefit and other applications would see a significant benefit so there should probably be versions with and without L4 cache. My guess is workstations, supercomputers, memcached servers and database servers would use the version with L4 cache. Front-end presentation layer servers would use the version without L4 cache. The quad socket Haswell-E is for database servers. These systems are not sensitive to processor price and a big L4 cache makes sense there.
Using numbers from an article on AnandTech, which you can get by searching for "Intel Iris Pro 5200 Graphics Review: Core i7-4950HQ Tested" or from the link below,
http://www.anandtech.com/show/6993/intel-iris-pro-5200-graphics-review-core-i74950hq-tested
each Crystalwell chip consumes 3.5 to 4.5 W when operating at full bandwidth and 0.5 to 1 W at idle, when it only refreshes the data stored inside. Each Crystalwell chip holds 128 MBytes and has a die size of 7 mm x 12 mm (84 mm^2) on 22 nm. Two to four Crystalwell chips are capable of providing 100 to 200 GBytes/sec of L4 bandwidth in each direction (200 to 400 GBytes/sec total) with half the latency of the DRAM DIMMs. These bandwidth numbers are the total for all the Crystalwell chips, and the processor may not have enough bumps to achieve them. I would guess the bus width on Crystalwell is programmable so different numbers of Crystalwell chips can be connected to the same processor without changing the number of bumps on the processor.
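The per-socket totals follow from simple scaling of the per-chip figures. The sketch below is only that arithmetic in Python. The 50 GBytes/sec per chip per direction is my assumption, inferred from 100 GBytes/sec for two chips, and linear scaling ignores the bump limit mentioned above. The class and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crystalwell:
    capacity_mb: int = 128                # per chip, from the figures above
    active_w: tuple = (3.5, 4.5)          # (low, high) at full bandwidth
    idle_w: tuple = (0.5, 1.0)            # (low, high) refresh only
    die_mm2: float = 7 * 12               # 84 mm^2
    bw_gbs_per_dir: float = 50.0          # assumption: 100 GBytes/sec / 2 chips


def per_socket(chips, chip=Crystalwell()):
    """Scale the per-chip figures linearly to `chips` chips on one socket."""
    return {
        "capacity_mb": chips * chip.capacity_mb,
        "active_w": tuple(chips * w for w in chip.active_w),
        "idle_w": tuple(chips * w for w in chip.idle_w),
        "die_mm2": chips * chip.die_mm2,
        "bw_gbs_per_dir": chips * chip.bw_gbs_per_dir,
    }


for n in (1, 2, 4):
    print(n, per_socket(n))
```

Under these assumptions four chips give 512 MBytes of L4 per socket for 14 to 18 W at full bandwidth.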
Intel doesn't have any announced versions of Haswell that are identical in all respects except for the L4 cache. A rough idea of the price adder for a single Crystalwell chip can be had by comparing the price of the i7-4750HQ ($440) with the i7-4702HQ ($383). Both these products have 4 cores, 8 threads, 6 MBytes of L3 cache and a single-core turbo frequency of 3.2 GHz. The 4750 has GT3e graphics and the 4702 has GT2 graphics, so the $57 difference reflects the larger GPU as well as the Crystalwell chip.
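As a sanity check on the "around 10%" figure, the same comparison can be run in a few lines of Python. This is a sketch that assumes the mobile price difference carries over unchanged to server parts, which is a guess. The prices are the ones quoted in this post and the names are made up for illustration.

```python
I7_4750HQ_USD = 440   # GT3e graphics, with Crystalwell
I7_4702HQ_USD = 383   # GT2 graphics, no Crystalwell
ADDER_USD = I7_4750HQ_USD - I7_4702HQ_USD   # rough price adder per chip


def relative_increase(base_price_usd, chips_per_socket, adder_usd=ADDER_USD):
    """Fractional processor price increase from adding Crystalwell chips."""
    return chips_per_socket * adder_usd / base_price_usd


dual = relative_increase(1700, 2)   # dual socket Sandy Bridge-E price, 2 chips
quad = relative_increase(3600, 4)   # quad socket Sandy Bridge-E price, 4 chips
print(f"dual socket: {dual:.1%}")
print(f"quad socket: {quad:.1%}")
```

Both cases come out between 6% and 7%, which is in the same range as the 10% estimate once some margin is added.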
My guess is the dual socket Haswell-E priced above $1000 will have 2 Crystalwell chips per socket. The dual socket Haswell-E priced between $500 and $1000 will have one Crystalwell chip per socket. The quad socket Haswell-E priced above $2000 will have 4 Crystalwell chips per socket. The quad socket Haswell-E priced between $1000 and $2000 will have 2 Crystalwell chips per socket. I have no inside information. These guesses are just my opinion of what would make sense.