By: Michael S (already5chosen.delete@this.yahoo.com), September 23, 2007 2:06 am
Room: Moderated Discussions
David Kanter (dkanter@realworldtech.com) on 9/22/07 wrote:
---------------------------
>>>>Small question - how come they didn't leverage the PCI Express and needed a new
>>>>bus (excuse me, p2p interconnect) altogether?
>>>
>>>PCI Express isn't coherent, it's also fairly high latency since it uses 8B/10B clock encoding.
>>
>>How much latency does 8B/10B encoding contribute? If I am reading this correctly,
>>Lattice has a programmable logic implementation optimized for throughput with only
>>2 clocks of latency on the encoder and 3 clocks on the decoder when working exclusively with serial bit streams:
>>
>>http://www.latticesemi.com/dynamic/view_document.cfm?document_id=5653
>
>So I think the issue is that you'd need at least 10 data transfers to occur before
>you can get usable data extracted from the symbols.
>
>CSI flits are typically sent in 4 data transfers, so it would more than double
>the latency (on top of whatever encode/decode latency exists) to simply forward a flit.
>
>David
>
---------------------------

Nah. The algorithmic delay of 8b/10b decoding is 10T regardless of packet size. At CSI data rates, 10T ≈ 1.5 ns, which is lost in the noise. For comparison, the algorithmic delay of inter-lane deskewing falls in the same range, and nobody sees that as a problem.
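To put numbers on it, here is the back-of-envelope version (the 6.4 GT/s per-lane rate is my assumption; the 4-transfer flit figure comes from David's post above):

# Back-of-envelope latency sketch. The 6.4 GT/s per-lane rate is an
# assumption on my part; the 4-transfer flit is from David's post.

rate = 6.4e9              # assumed transfers per second per lane
ui = 1.0 / rate           # one unit interval (one transfer), in seconds

# 8b/10b algorithmic delay: the decoder needs a complete 10-bit symbol
# before it can emit the 8-bit payload, i.e. 10 unit intervals.
decode_delay_ns = 10 * ui * 1e9

# Time to clock out a flit sent in 4 data transfers:
flit_time_ns = 4 * ui * 1e9

print("8b/10b algorithmic delay: %.2f ns" % decode_delay_ns)  # about 1.6 ns
print("4-transfer flit time:     %.2f ns" % flit_time_ns)     # about 0.6 ns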
Of course, there is also implementation delay on top of the algorithmic delay, but implementation delay tends to improve with each design generation.
As I understand it, the real reason for not going with a PCIe-on-steroids PHY is power rather than delay.
Current 2.5 GT/s PCIe implementations consume 12-15 mW*s/Gbit. Pushing the same technology into the 6 GT/s range would cost over 20 mW*s/Gbit. A narrow source-synchronous parallel link with deskewing consumes significantly less power per bit, especially if it doesn't use de-emphasis.
So Intel decided to play it safe.
Was it a wise decision? Up until a few months ago I'd have said yes. But recently Rambus announced a breakthrough in serdes power efficiency - on the order of 1 mW*s/Gbit at data rates approaching 5 GT/s. So unless the Rambus developers are missing something important, Intel's decision to go parallel doesn't look so wise in the end.
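For what it's worth, the power arithmetic looks like this (a rough sketch; mW*s/Gbit is just mW per Gbit/s, and the 20-lane link width and per-lane rates are my assumptions for illustration):

# Rough power sketch in the units used above: mW*s/Gbit == mW per Gbit/s.
# The 20-lane width and the per-lane rates are assumptions for illustration.

def link_power_mw(mw_per_gbps, gbps_per_lane, lanes):
    # Aggregate PHY power in mW for a link of `lanes` lanes.
    return mw_per_gbps * gbps_per_lane * lanes

lanes = 20   # assumed CSI-style link width

print("PCIe-class phy, 2.5 GT/s, 13.5 mW*s/Gbit: %.0f mW"
      % link_power_mw(13.5, 2.5, lanes))    # about 675 mW
print("Same technology pushed to 6 GT/s, 20 mW*s/Gbit: %.0f mW"
      % link_power_mw(20.0, 6.0, lanes))    # about 2.4 W
print("Rambus-class serdes, 5 GT/s, 1 mW*s/Gbit: %.0f mW"
      % link_power_mw(1.0, 5.0, lanes))     # about 100 mW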