By: Travis (travis.downs.delete@this.gmail.com), December 27, 2017 3:33 pm
Room: Moderated Discussions
Nksingg (None.delete@this.none.non) on December 26, 2017 5:47 am wrote:
> If you're implementing x86 ordering, wouldn't you need to put store buffer entries into
> a FIFO queue to allow incremental draining of the stores? Doing a store to a single location
> would serialize around that store's queued buffer and all later stores would now have
> a dependency on that store location. I'd expect ARM to not have this issue.
>
If I understood you correctly, the idea is that repeatedly writing to the same location in L1 might be slower than writing to different locations.
I wrote another test, write3, to check this: for the "L1 store" it writes to consecutive byte addresses (stride 1) instead of the same address every time. It performed the same as the version that writes to a fixed location.
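To make the comparison concrete, here is a rough sketch of the two variants. It is not the actual write3 code: the function names, `L1_REGION`, the 64-byte step, and the interleaved store to a second buffer (`far_buf`) are all assumptions made for illustration, and the timing harness is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size, not taken from the original test:
 * a region small enough to stay resident in L1. */
enum { L1_REGION = 4096 };

/* Variant A: every "L1 store" goes to the same byte.
 * Returns the offset of the last L1 store (always 0). */
size_t write_fixed(volatile uint8_t *l1, volatile uint8_t *far_buf,
                   size_t far_size, size_t iters)
{
    assert(far_size > 0 && iters > 0);
    size_t far_pos = 0;
    for (size_t i = 0; i < iters; i++) {
        far_buf[far_pos] = (uint8_t)i;        /* assumed interleaved store */
        l1[0] = (uint8_t)i;                   /* fixed-location L1 store */
        far_pos = (far_pos + 64) % far_size;  /* assumed 64-byte step */
    }
    return 0;
}

/* Variant B: the "L1 store" walks consecutive bytes (stride 1),
 * wrapping inside the small region.
 * Returns the offset of the last L1 store. */
size_t write_stride1(volatile uint8_t *l1, volatile uint8_t *far_buf,
                     size_t far_size, size_t iters)
{
    assert(far_size > 0 && iters > 0);
    size_t far_pos = 0;
    size_t l1_pos = 0;
    for (size_t i = 0; i < iters; i++) {
        l1_pos = i % L1_REGION;
        far_buf[far_pos] = (uint8_t)i;        /* assumed interleaved store */
        l1[l1_pos] = (uint8_t)i;              /* stride-1 L1 store */
        far_pos = (far_pos + 64) % far_size;
    }
    return l1_pos;
}
```

The `volatile` qualifiers are only there so the compiler does not merge or drop the stores. Timing both functions over the same iteration count would be the way to look for the same-location penalty suggested above.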