By: EduardoS (no.delete@this.spam.com), October 31, 2008 5:36 pm
Room: Moderated Discussions
Michael S (already5chosen@yahoo.com) on 10/30/08 wrote:
---------------------------
>The arrows between the L1D and LSU_1 create an impression that LSUs are fully symmetric
>and can sustain any combination of loads and stores. That's incorrect. K8 L1D cache
>could sustain at most one store per clock.
Hum... It does sustain two stores per clock...
There are some requirements for that, such as alignment and no bank conflicts, and stores are more restricted than loads, but if all of them are met there is no problem doing two stores per clock. It's a simple test and it worked here, but we can use someone else's data:
31 AMD64 :MOV [m64], r64 L: [memory dep.] T: 0.25ns= 0.50c
http://instlatx64.freeweb.hu/InstLatX64_AuthenticAMD0020FB1_K8_Manchester_2000MHz.txt
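For anyone who wants to check the arithmetic behind that line, here is a minimal sketch in Python. It assumes the 2000 MHz clock that appears in the file name of the dump above; the function names are made up for this illustration and are not part of any tool.

```python
def cycles_per_instruction(throughput_ns: float, clock_mhz: float) -> float:
    """Convert a reciprocal throughput in nanoseconds to clock cycles.

    clock_mhz / 1000 is the number of cycles per nanosecond.
    """
    return throughput_ns * clock_mhz / 1000.0


def instructions_per_clock(throughput_ns: float, clock_mhz: float) -> float:
    """Sustained instructions per clock implied by the reciprocal throughput."""
    return 1.0 / cycles_per_instruction(throughput_ns, clock_mhz)


# Values from the quoted line: T: 0.25ns for MOV [m64], r64.
# The 2000 MHz clock is an assumption taken from the dump's file name.
THROUGHPUT_NS = 0.25
CLOCK_MHZ = 2000.0

cycles = cycles_per_instruction(THROUGHPUT_NS, CLOCK_MHZ)
stores = instructions_per_clock(THROUGHPUT_NS, CLOCK_MHZ)
print(f"{cycles:.2f} cycles per store, {stores:.0f} stores per clock")
```

The result, 0.50 cycles per store, matches the "0.50c" in the quoted line, and its reciprocal gives two stores per clock.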