By: Mark Roulo (nothanks.delete@this.xxx.com), June 6, 2022 2:10 pm
Room: Moderated Discussions
rwessel (rwessel.delete@this.yahoo.com) on June 6, 2022 11:57 am wrote:
> Doug S (foo.delete@this.bar.bar) on June 6, 2022 10:44 am wrote:
> > rwessel (rwessel.delete@this.yahoo.com) on June 6, 2022 5:26 am wrote:
> > > Peter Lewis (peter.delete@this.notyahoo.com) on June 6, 2022 3:55 am wrote:
> > > > > There's PGO of course, but that really only works for interpreted or jitted languages.
> > > >
> > > > The Intel C/C++ Compiler has Profile-Guided Optimization (PGO).
> > > > Have you had some bad experience with PGO for compiled languages?
> > >
> > >
> > > The fundamental problem with PGO is that it's (probably fundamentally) too hard to use the
> > > vast majority of the time, at least with languages compiled in the traditional way.
> > >
> > > Unless you have simple to generate (and maintain!*) training datasets, have a very small area
> > > where you can separately apply PGO (and can make the training sets small enough to make them
> > > maintainable), or you can afford to invest in a quite large infrastructure to use and maintain
> > > PGO (because, say, you have an absurd number of machines on which you're going to run this code
> > > - consider Google), PGO is just too hard to use, and so is useless 99% of the time.
> > >
> > > As Peter pointed out, JIT'd languages can take advantage of PGO as well.
> > >
> > > *Code with limited lifespan, or code you somehow know isn't going to
> > > change in the future, can reduce the maintenance requirements here.
> >
> >
> > Has anyone ever tried having the CPU's branch predictor collecting info
> > the OS can use to 'update' binaries with branch prediction info?
> >
> > Having data that is generated by the end user tweak their binaries' branch prediction seems like a
> > better solution than having the developer try to come up with training data for the PGO phase. I know
> > PGO does more than just predict branches, but in this case I'm just limiting it to the topic at hand.
> >
> > I'm not sure how it would be implemented - you probably wouldn't actually update the binary
> > itself (it may be on read only storage or shared by others) so there would need to be some
> > sort of auxiliary data file (maybe stored somewhere like /var/lib or in the user's home directory
> > / profile?) that would be used to tweak things when the executable is loaded.
> >
> > Just an idle thought here, I haven't really considered it more than the few minutes that
> > it took to write this post so I could be missing some really big gotchas with this idea!
>
>
> Not quite the same thing, but efforts have been made to save JIT'd code for use the next time.
Example here: https://docs.oracle.com/cd/E13188_01/jrockit/docs142/userguide/codecach.html
Java 1.4 is quite old and I don't know if they still do this. I remember reading at the time that *managing* the cache cost more than just re-JITing. But maybe I remember wrong, or maybe things got better.