By: Patrick Chase (patrickjchase.delete@this.gmail.com), August 24, 2014 11:11 am
Room: Moderated Discussions
Patrick Chase (patrickjchase.delete@this.gmail.com) on August 24, 2014 11:06 am wrote:
> anon (anon.delete@this.anon.com) on August 22, 2014 5:50 pm wrote:
> > I really don't think following a store with a load to the same location does as
> > much as you think. I doubt it does *anything* that you can rely on, actually.
>
> I think we may be conflating x86 ordering rules and PCI[e] ordering rules in this discussion.
> For PCIe a load will indeed flush preceding stores as Michael assumes. IIRC in x86 the load can
> hit the store buffer leading to exactly the behavior you described in the rest of your post.
One other remark: keep in mind that a speculative core has to "hold" all stores in local buffers until the corresponding uop retires. If loading from the same address did indeed impose visibility/ordering constraints on the store, then that would require the OoO backend to be flushed up to at least the store. In other words, it would have basically the same cost as a fence.
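To make the distinction concrete, here is a minimal C11 sketch of the two patterns being compared. The names (flag, store_then_readback, store_then_fence) are made up for illustration and do not come from anyone's code in this thread. It assumes ordinary cacheable memory on x86, not a PCIe MMIO mapping, and it shows only the shape of the two sequences, not a measurement of their cost.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared location, for illustration only. */
static _Atomic uint32_t flag;

/* Store followed by a load from the same address.
 * On x86 the load may be satisfied from the store buffer
 * (store-to-load forwarding), so returning the new value says
 * nothing about whether other cores can see the store yet. */
uint32_t store_then_readback(uint32_t v)
{
    atomic_store_explicit(&flag, v, memory_order_relaxed);
    return atomic_load_explicit(&flag, memory_order_relaxed);
}

/* Store followed by an explicit full fence.
 * A seq_cst fence is typically compiled to MFENCE or a locked
 * instruction on x86, which orders the store before later loads. */
uint32_t store_then_fence(uint32_t v)
{
    atomic_store_explicit(&flag, v, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&flag, memory_order_relaxed);
}
```

Both functions return the value just written when called from a single thread, which is exactly why the read-back proves nothing: the observable difference is only in when other agents can see the store.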