By: hcl64 (mario.smarq.delete@this.gmail.com), April 28, 2012 2:44 pm
Room: Moderated Discussions
David Kanter (dkanter@realworldtech.com) on 4/28/12 wrote:
---------------------------
>hcl64 (mario.smarq@gmail.com) on 4/27/12 wrote:
>---------------------------
>>David Kanter (dkanter@realworldtech.com) on 4/20/12 wrote:
>>---------------------------
>>>
>>>You are correct that the caches are painfully slow, but that's not the reason why.
>>>Frankly, I don't understand why the L1 is 4 cycles instead of 3. I REALLY don't
>>>understand why the L2 cache is so slow (20 cycles, really??), because size alone
>>>doesn't account for it. 12-14 cycles sounds much more reasonable.
>>>
>>>The L3 cache is also quite slow, in part because of the slow L2 and in part because
>>>it runs asynchronously to the cores. If you look at those two factors together
>>>and assume a 14 cycle L2, you can probably cut the L3 latency down by ~10 cycles.
>>>
>>>
>>>DK
>
>>Good points. But isn't BD mostly asynchronous or semi-synchronous?
>
>No. You should read my article about Bulldozer.
>
>http://www.realworldtech.com/page.cfm?ArticleID=RWT082610181333
>
>Decoupled and asynchronous mean very different things. Asynchronous refers to frequency.
>Decoupled means there are buffers between stages. Those are two very different
>concepts and have very different implications for design.
>
>DK
Well, it's a point, but it fits my understanding of "asynchronous"... If, for example, the "decode domain" is crunching thread A, which is stalled at core0 in the "execution domain", while core1 is executing thread B in the same exec domain... then the "decode domain" and the "exec domain" are asynchronous. The same can be said for most other domains: "Decode" working on thread A and "Dispatch" on thread B, for example.
http://en.wikipedia.org/wiki/Asynchrony
"In specific terms of digital logic and physical layer of communication, an *asynchronous process does not require a clock signal*, in contrast with synchronous and plesiochronous systems."
"At the higher data link layer of communication, asynchrony is synonym of statistical multiplexing, such as in packet mode. The information transmission may or may not start immediately as requested by sender, the additional delay being caused by medium congestion. Contrast with example of circuit switched communication, which (once circuit is established) allows immediate start of transfer with a guaranteed bit rate. Confusingly, a communication is often synchronous at the physical layer, while being asynchronous at the data link layer."
**"In programming, asynchronous events are those occurring independently of the main program flow. Asynchronous actions are actions executed in a non-blocking scheme, allowing the main program flow to continue processing."**
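
For reference, the distinction DK draws in the quoted text (decoupled = buffers between stages, asynchronous = a different clock) can be sketched with a toy model. This is only an illustration under assumptions of my own, not a model of Bulldozer: the function name `simulate`, the one-op-per-cycle stages and the FIFO depth are all hypothetical. Both stages advance on the same loop iteration, i.e. one shared clock, so nothing here is asynchronous in the frequency sense, yet the buffer lets decode keep working while execute is stalled.

```python
from collections import deque

def simulate(cycles, queue_depth, exec_stalled):
    """Toy model (hypothetical): decode and execute on ONE shared clock,
    decoupled by a bounded FIFO.

    exec_stalled is a set of cycle numbers on which execute cannot drain.
    Returns (ops_decoded, ops_executed, max_fifo_occupancy).
    """
    fifo = deque()
    decoded = executed = max_occupancy = 0
    for cycle in range(cycles):  # one iteration = one cycle for BOTH stages
        # Execute stage: drains one op per cycle unless it is stalled.
        if cycle not in exec_stalled and fifo:
            fifo.popleft()
            executed += 1
        # Decode stage: produces one op per cycle while the FIFO has room.
        if len(fifo) < queue_depth:
            fifo.append(decoded)
            decoded += 1
        max_occupancy = max(max_occupancy, len(fifo))
    return decoded, executed, max_occupancy

if __name__ == "__main__":
    stall = {2, 3, 4}  # execute is stalled for three cycles
    print(simulate(10, 4, stall))  # deep buffer: decode rides out the stall
    print(simulate(10, 1, stall))  # shallow buffer: decode stalls too
```

With a 4-entry FIFO, decode produces an op on all 10 cycles despite the 3-cycle execute stall; with a 1-entry FIFO it loses those three cycles. In both cases the two stages tick on the same clock, which is the sense in which a design can be decoupled without being asynchronous.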



