By: Michael S (already5chosen.delete@this.yahoo.com), November 1, 2008 8:21 am
Room: Moderated Discussions
EduardoS (no@spam.com) on 10/31/08 wrote:
---------------------------
>Michael S (already5chosen@yahoo.com) on 10/30/08 wrote:
>---------------------------
>>The arrows between the L1D and LSU_1 create an impression that LSUs are fully symmetric
>>and can sustain any combination of loads and stores. That's incorrect. K8 L1D cache
>>could sustain at most one store per clock.
>
>Hum... It does sustain two stores per clock...
>There are some requirements for that like alignment and no bank conflict, store
>are more restrict than loads, but if all is meet there is no problem in doing two
>store per clock, it's a simple test, here worked, but we can use other's data:
>
>31 AMD64 :MOV [m64], r64 L: [memory dep.] T: 0.25ns= 0.50c
>
>http://instlatx64.freeweb.hu/InstLatX64_AuthenticAMD0020FB1_K8_Manchester_2000MHz.txt
I think there is some mistake in their measurement methodology. It is damn hard to believe that 64-bit moves have higher throughput than 32-bit moves.
Right now I have no access to a K8 running a 64-bit OS.
Give me a couple of days, I'll test that when I'm back at work.
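For reference, here is the arithmetic behind the quoted listing line, as a rough sketch. The 2000 MHz clock is an assumption taken from the file name in the URL, and the function names are made up for illustration; they are not part of InstLatX64.

```python
def ns_to_cycles(time_ns, clock_mhz):
    """Convert a per-instruction time in nanoseconds to clock cycles."""
    # cycles = time [ns] * frequency [GHz]; MHz / 1000 gives GHz
    return time_ns * clock_mhz / 1000.0


def instructions_per_clock(cycles_per_instruction):
    """Reciprocal throughput (cycles per instruction) -> instructions per clock."""
    return 1.0 / cycles_per_instruction


# Values from the quoted line: T: 0.25ns = 0.50c, on a part assumed to run at 2000 MHz
cycles = ns_to_cycles(0.25, 2000)
print(cycles, instructions_per_clock(cycles))
```

So the listing, taken at face value, claims two 64-bit stores per clock. Whether the measurement behind it is right is a separate question.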