By: Wilco (Wilco.Dijkstra.delete@this.ntlworld.com), July 7, 2015 4:40 pm
Room: Moderated Discussions
someotherdude (someotherdude.delete@this.none.none.none.none) on July 7, 2015 4:01 pm wrote:
> Paul A. Clayton (paaronclayton.delete@this.gmail.com) on July 7, 2015 7:43 am wrote:
> > Gabriele Svelto (gabriele.svelto.delete@this.gmail.com) on July 7, 2015 2:50 am wrote:
> > > SHK (no.delete@this.mail.com) on July 6, 2015 12:41 pm wrote:
> > > > Maybe that kind of hardware optimization is there for old non-recompiled code?
> > >
> > > It also works for stores which you cannot handle with isel.
> >
> > To "predicate" a store, one selects the address (either the ordinary address or a
> > safe but unused address ) and performs the store. A store is more expensive than just
> > an ALU operation, but it could still be cheaper than a branch misprediction.
>
>
> I see that you put predicate in quotes, but I think you underweight the cost of doing an actual store.
> You're discounting the possibilities and costs of a TLB miss, a cache miss, a page miss, etc. every one
> of which could be much greater in cost than the branch mispredict, perhaps even by magnitudes.
The fact is that such stores show large gains - you're wildly exaggerating the cost of a store to a local on the stack, which is low. Just imagine a tight loop: with a select on the store address you can have at most one TLB/cache miss, but with a branch over the store you can have millions of branch mispredicts.
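To make the comparison concrete, here is a minimal C sketch of the two variants. The function names, the filtering loop and the `dummy` local are made up for illustration, and whether a given compiler actually turns the `?:` into a conditional select (isel, csel, cmov) rather than a branch is an assumption that has to be checked in the generated code.

```c
#include <stddef.h>
#include <stdint.h>

/* Branch version: a data-dependent branch skips the store.
   With unpredictable data, each iteration can mispredict. */
size_t filter_branch(const int32_t *in, size_t n, int32_t limit, int32_t *out)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i] < limit)
            out[count++] = in[i];
    }
    return count;
}

/* Select version: the store always executes, only its address is selected.
   Unwanted values go to a local on the stack (hypothetical name: dummy),
   which stays in the L1 cache after the first access. */
size_t filter_select(const int32_t *in, size_t n, int32_t limit, int32_t *out)
{
    int32_t dummy = 0;                 /* safe but unused store target */
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        int keep = in[i] < limit;      /* 0 or 1 */
        int32_t *dst = keep ? &out[count] : &dummy;
        *dst = in[i];                  /* unconditional store */
        count += (size_t)keep;         /* advance only when the value was kept */
    }
    return count;
}
```

Both functions produce the same output; the second one pays for one extra store per rejected element, always to the same stack location.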
Wilco