Article: ARM Goes 64-bit
By: Wilco (Wilco.Dijkstra.delete@this.ntlworld.com), November 21, 2012 7:04 pm
Room: Moderated Discussions
Mark Roulo (nothanks.delete@this.xxx.com) on November 21, 2012 10:30 am wrote:
> Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on November 21, 2012 6:41 am wrote:
> > David Kanter (dkanter.delete@this.realworldtech.com) on November 20, 2012 11:07 pm wrote:
> > > Rohit (a.delete@this.b.c) on November 20, 2012 4:48 pm wrote:
> > > > David Kanter (dkanter.delete@this.realworldtech.com) on November 19, 2012 3:40 pm wrote:
> > > > > That's probably a common view. However, there are a number of optimizations that can net
> > > > > you huge performance gains and are only viable for JIT systems. I agree you want to do
> > > > > as much as possible in your compiler, but somethings require dynamic information.
> > > > >
> > > > > DK
> > > >
> > > > Which optimizations are these?
> > >
> > > Speculative multi-threading using a JIT.
> >
> > That's not at all JIT related. Has anyone ever shown a speedup using software-only speculation?
> > Automatic multihreading ARMs have been studied for many years. However even with special hardware
> > support to detect conflicts, fast rollbacks and spawning threads to different cores it never
> > seemed worth the relatively small speedups, especially not when you consider the extra power
> > for speculating. The big speedups were all in loops which can be dealt with normally.
> >
> > To repeat, I don't believe there are any optimizations which are much better done in a JIT than in
> > a static compiler. Even dynamic info like profiling works better when done the traditional way.
>
> A better example is the ability to inline through a virtual
> function based on the actual objects instantiated.
>
> As a real-world example, I once wrote a logging infrastructure in Java. The way it worked, if the logging
> was disabled at run-time, a no-op logging object was instantiated. Because the JIT could see that only
> one class (of an inheritance hierarchy) was instantiated, the calls to that object did not need to go through
> the V-Table. This allowed them to be inlined, which then allowed the JIT to further realize that nothing
> was happening, so the entire log call could be optimized away when logging was disabled.
>
> I don't think you can do this with a static compiler ... you can decide at compile-time to optimize
> away, but not at run-time (or if you can, I don't see how ...). Note also that this allowed for
> fine grained control ... logging could be enabled (at runtime, based on a config file) per-component.
> Again, the components for which logging was disabled had *NO* runtime overhead at all.
If you have the ability to dynamically switch between logging and no logging at runtime then you always have some overhead. Profile feedback indicates which virtual call is most common, and compilers add a check for that class and inline the call if it matches, and if not do the virtual call as usual. A JIT will have to do the same as at any point in time you could change the logging object.
If the language used supports it, you could specialize each function to a particular instance of the logging object. You track every write to the logging object and a JIT would flush its cache on a write. A static compiler would jump to the alternative set of functions. This has high overheads, causing huge code bloat and (for a JIT) repeated unnecessary retranslations, so except for special cases (eg. path specialization for GC) this approach isn't popular.
In reality for logging it's simpler just to have a debug and a release build...
Wilco
> Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on November 21, 2012 6:41 am wrote:
> > David Kanter (dkanter.delete@this.realworldtech.com) on November 20, 2012 11:07 pm wrote:
> > > Rohit (a.delete@this.b.c) on November 20, 2012 4:48 pm wrote:
> > > > David Kanter (dkanter.delete@this.realworldtech.com) on November 19, 2012 3:40 pm wrote:
> > > > > That's probably a common view. However, there are a number of optimizations that can net
> > > > > you huge performance gains and are only viable for JIT systems. I agree you want to do
> > > > > as much as possible in your compiler, but somethings require dynamic information.
> > > > >
> > > > > DK
> > > >
> > > > Which optimizations are these?
> > >
> > > Speculative multi-threading using a JIT.
> >
> > That's not at all JIT related. Has anyone ever shown a speedup using software-only speculation?
> > Automatic multihreading ARMs have been studied for many years. However even with special hardware
> > support to detect conflicts, fast rollbacks and spawning threads to different cores it never
> > seemed worth the relatively small speedups, especially not when you consider the extra power
> > for speculating. The big speedups were all in loops which can be dealt with normally.
> >
> > To repeat, I don't believe there are any optimizations which are much better done in a JIT than in
> > a static compiler. Even dynamic info like profiling works better when done the traditional way.
>
> A better example is the ability to inline through a virtual
> function based on the actual objects instantiated.
>
> As a real-world example, I once wrote a logging infrastructure in Java. The way it worked, if the logging
> was disabled at run-time, a no-op logging object was instantiated. Because the JIT could see that only
> one class (of an inheritance hierarchy) was instantiated, the calls to that object did not need to go through
> the V-Table. This allowed them to be inlined, which then allowed the JIT to further realize that nothing
> was happening, so the entire log call could be optimized away when logging was disabled.
>
> I don't think you can do this with a static compiler ... you can decide at compile-time to optimize
> away, but not at run-time (or if you can, I don't see how ...). Note also that this allowed for
> fine grained control ... logging could be enabled (at runtime, based on a config file) per-component.
> Again, the components for which logging was disabled had *NO* runtime overhead at all.
If you have the ability to dynamically switch between logging and no logging at runtime then you always have some overhead. Profile feedback indicates which virtual call is most common, and compilers add a check for that class and inline the call if it matches, and if not do the virtual call as usual. A JIT will have to do the same as at any point in time you could change the logging object.
If the language used supports it, you could specialize each function to a particular instance of the logging object. You track every write to the logging object and a JIT would flush its cache on a write. A static compiler would jump to the alternative set of functions. This has high overheads, causing huge code bloat and (for a JIT) repeated unnecessary retranslations, so except for special cases (eg. path specialization for GC) this approach isn't popular.
In reality for logging it's simpler just to have a debug and a release build...
Wilco
Topic | Posted By | Date |
---|---|---|
New Article: ARM Goes 64-bit | David Kanter | 2012/08/14 12:04 AM |
New Article: ARM Goes 64-bit | none | 2012/08/14 12:44 AM |
New Article: ARM Goes 64-bit | David Kanter | 2012/08/14 01:04 AM |
MIPS MT-ASE | Paul A. Clayton | 2012/08/14 09:01 AM |
MONITOR/MWAIT | EduardoS | 2012/08/14 10:08 AM |
MWAIT not specifically MT | Paul A. Clayton | 2012/08/14 10:36 AM |
MWAIT not specifically MT | EduardoS | 2012/08/15 03:16 PM |
MONITOR/MWAIT | anonymou5 | 2012/08/14 11:07 AM |
MONITOR/MWAIT | EduardoS | 2012/08/15 03:20 PM |
MIPS MT-ASE | rwessel | 2012/08/14 10:14 AM |
New Article: ARM Goes 64-bit | SHK | 2012/08/14 02:01 AM |
New Article: ARM Goes 64-bit | anon | 2012/08/14 02:37 AM |
New Article: ARM Goes 64-bit | Richard Cownie | 2012/08/14 03:57 AM |
New Article: ARM Goes 64-bit | anon | 2012/08/14 04:29 AM |
New Article: ARM Goes 64-bit | none | 2012/08/14 04:44 AM |
New Article: ARM Goes 64-bit | anon | 2012/08/14 05:28 AM |
New Article: ARM Goes 64-bit | anon | 2012/08/14 05:32 AM |
New Article: ARM Goes 64-bit | EduardoS | 2012/08/14 06:06 AM |
New Article: ARM Goes 64-bit | none | 2012/08/14 05:40 AM |
AArch64 select better than cmov | Paul A. Clayton | 2012/08/14 06:08 AM |
New Article: ARM Goes 64-bit | anon | 2012/08/14 06:12 AM |
New Article: ARM Goes 64-bit | none | 2012/08/14 06:25 AM |
Predicated ld/store are useful | Paul A. Clayton | 2012/08/14 06:48 AM |
Predicated ld/store are useful | none | 2012/08/14 06:56 AM |
Predicated ld/store are useful | anon | 2012/08/14 07:07 AM |
Predicated stores might not be that bad | Paul A. Clayton | 2012/08/14 07:27 AM |
Predicated stores might not be that bad | David Kanter | 2012/08/15 01:14 AM |
Predicated stores might not be that bad | Michael S | 2012/08/15 11:41 AM |
Predicated stores might not be that bad | R Byron | 2012/08/17 04:09 AM |
New Article: ARM Goes 64-bit | anon | 2012/08/14 06:54 AM |
New Article: ARM Goes 64-bit | none | 2012/08/14 07:04 AM |
New Article: ARM Goes 64-bit | anon | 2012/08/14 07:43 AM |
New Article: ARM Goes 64-bit | EduardoS | 2012/08/14 06:07 AM |
New Article: ARM Goes 64-bit | anon | 2012/08/14 06:20 AM |
New Article: ARM Goes 64-bit | none | 2012/08/14 06:29 AM |
New Article: ARM Goes 64-bit | anon | 2012/08/14 07:00 AM |
New Article: ARM Goes 64-bit | Michael S | 2012/08/14 03:43 PM |
New Article: ARM Goes 64-bit | Richard Cownie | 2012/08/14 06:53 AM |
OT: Conrad's "Youth" | Richard Cownie | 2012/08/14 07:20 AM |
New Article: ARM Goes 64-bit | EduardoS | 2012/08/14 06:04 AM |
New Article: ARM Goes 64-bit | mpx | 2012/08/14 08:59 AM |
New Article: ARM Goes 64-bit | Antti-Ville Tuunainen | 2012/08/14 09:16 AM |
New Article: ARM Goes 64-bit | anonymou5 | 2012/08/14 11:03 AM |
New Article: ARM Goes 64-bit | name99 | 2012/11/17 03:31 PM |
Microarchitecting a counter register | Paul A. Clayton | 2012/11/17 07:37 PM |
New Article: ARM Goes 64-bit | bakaneko | 2012/08/14 04:21 AM |
New Article: ARM Goes 64-bit | name99 | 2012/11/17 03:40 PM |
New Article: ARM Goes 64-bit | EduardoS | 2012/11/17 04:52 PM |
New Article: ARM Goes 64-bit | Doug S | 2012/11/17 05:48 PM |
New Article: ARM Goes 64-bit | bakaneko | 2012/11/18 05:40 PM |
New Article: ARM Goes 64-bit | Wilco | 2012/11/19 07:59 AM |
New Article: ARM Goes 64-bit | EduardoS | 2012/11/19 08:23 AM |
New Article: ARM Goes 64-bit | Wilco | 2012/11/19 09:31 AM |
Downloading µarch-specific binaries? | Paul A. Clayton | 2012/11/19 11:21 AM |
New Article: ARM Goes 64-bit | EduardoS | 2012/11/19 11:41 AM |
New Article: ARM Goes 64-bit | Wilco | 2012/11/21 07:44 AM |
JIT vs. static compilation (Was: New Article: ARM Goes 64-bit) | VMguy | 2012/11/22 03:21 AM |
JIT vs. static compilation (Was: New Article: ARM Goes 64-bit) | David Kanter | 2012/11/22 12:12 PM |
JIT vs. static compilation (Was: New Article: ARM Goes 64-bit) | Gabriele Svelto | 2012/11/23 03:50 AM |
New Article: ARM Goes 64-bit | EduardoS | 2012/11/23 10:09 AM |
New Article: ARM Goes 64-bit | EBFE | 2012/11/26 01:24 AM |
New Article: ARM Goes 64-bit | Gabriele Svelto | 2012/11/26 03:33 AM |
New Article: ARM Goes 64-bit | EBFE | 2012/11/27 11:17 PM |
New Article: ARM Goes 64-bit | Gabriele Svelto | 2012/11/28 02:32 AM |
New Article: ARM Goes 64-bit | EduardoS | 2012/11/26 12:16 PM |
New Article: ARM Goes 64-bit | EBFE | 2012/11/28 12:33 AM |
New Article: ARM Goes 64-bit | EduardoS | 2012/11/28 05:53 AM |
New Article: ARM Goes 64-bit | Michael S | 2012/11/28 06:15 AM |
New Article: ARM Goes 64-bit | EduardoS | 2012/11/28 07:33 AM |
New Article: ARM Goes 64-bit | Michael S | 2012/11/28 09:16 AM |
New Article: ARM Goes 64-bit | EduardoS | 2012/11/28 09:53 AM |
New Article: ARM Goes 64-bit | Eugene Nalimov | 2012/11/28 05:58 PM |
Amazing! | EduardoS | 2012/11/28 07:25 PM |
Amazing! (non-italic response) | EduardoS | 2012/11/28 07:25 PM |
Amazing! | EBFE | 2012/11/28 08:20 PM |
Undefined behaviour doubles down | EduardoS | 2012/11/28 09:10 PM |
New Article: ARM Goes 64-bit | EBFE | 2012/11/28 07:54 PM |
New Article: ARM Goes 64-bit | EduardoS | 2012/11/28 09:21 PM |
Have you heard of Transmeta? | David Kanter | 2012/11/19 03:47 PM |
New Article: ARM Goes 64-bit | bakaneko | 2012/11/19 09:08 AM |
New Article: ARM Goes 64-bit | David Kanter | 2012/11/19 03:40 PM |
Semantic Dictionary Encoding | Ray | 2012/11/19 10:37 PM |
New Article: ARM Goes 64-bit | Rohit | 2012/11/20 04:48 PM |
New Article: ARM Goes 64-bit | David Kanter | 2012/11/20 11:07 PM |
New Article: ARM Goes 64-bit | Wilco | 2012/11/21 06:41 AM |
New Article: ARM Goes 64-bit | David Kanter | 2012/11/21 10:12 AM |
A JIT example | Mark Roulo | 2012/11/21 10:30 AM |
A JIT example | Wilco | 2012/11/21 07:04 PM |
A JIT example | rwessel | 2012/11/21 09:05 PM |
A JIT example | Gabriele Svelto | 2012/11/23 03:53 AM |
A JIT example | EduardoS | 2012/11/23 10:13 AM |
A JIT example | Wilco | 2012/11/23 01:41 PM |
A JIT example | EduardoS | 2012/11/23 02:06 PM |
A JIT example | Gabriele Svelto | 2012/11/23 04:09 PM |
A JIT example | Symmetry | 2012/11/26 05:58 AM |
New Article: ARM Goes 64-bit | Ray | 2012/11/19 10:27 PM |
New Article: ARM Goes 64-bit | David Kanter | 2012/08/14 09:11 AM |
v7-M is Thumb-only | Paul A. Clayton | 2012/08/14 06:58 AM |
Minor suggested correction | Paul A. Clayton | 2012/08/14 08:33 AM |
Minor suggested correction | anon | 2012/08/14 08:57 AM |
New Article: ARM Goes 64-bit | Exophase | 2012/08/14 08:33 AM |
New Article: ARM Goes 64-bit | David Kanter | 2012/08/14 09:16 AM |
New Article: ARM Goes 64-bit | jigal | 2012/08/15 01:49 PM |
Correction re ARM and BBC Micro | Paul | 2012/08/14 08:59 PM |
Correction re ARM and BBC Micro | Per Hesselgren | 2012/08/15 03:27 AM |
Memory BW so low | Per Hesselgren | 2012/08/15 03:14 AM |
Memory BW so low | none | 2012/08/15 11:16 AM |
New Article: ARM Goes 64-bit | dado | 2012/08/15 10:25 AM |
Number of GPRs | Kenneth Jonsson | 2012/08/16 02:35 PM |
Number of GPRs | Exophase | 2012/08/16 02:52 PM |
Number of GPRs | Kenneth Jonsson | 2012/08/17 02:41 AM |
Ooops, missing link... | Kenneth Jonsson | 2012/08/17 02:44 AM |
64-bit pointers eat some performance | Paul A. Clayton | 2012/08/17 06:19 AM |
64-bit pointers eat some performance | bakaneko | 2012/08/17 08:37 AM |
Brute force seems to work | Paul A. Clayton | 2012/08/17 10:08 AM |
Brute force seems to work | bakaneko | 2012/08/17 11:15 AM |
64-bit pointers eat some performance | Richard Cownie | 2012/08/17 08:46 AM |
Pointer compression is atypical | Paul A. Clayton | 2012/08/17 10:43 AM |
Pointer compression is atypical | Richard Cownie | 2012/08/17 12:57 PM |
Pointer compression is atypical | Howard Chu | 2012/08/22 10:17 PM |
Pointer compression is atypical | Richard Cownie | 2012/08/23 04:48 AM |
Pointer compression is atypical | Howard Chu | 2012/08/23 06:51 AM |
Pointer compression is atypical | Wilco | 2012/08/17 02:41 PM |
Pointer compression is atypical | Richard Cownie | 2012/08/17 04:13 PM |
Pointer compression is atypical | Ricardo B | 2012/08/19 10:44 AM |
Pointer compression is atypical | Howard Chu | 2012/08/22 10:08 PM |
Unified libraries? | Paul A. Clayton | 2012/08/23 07:49 AM |
Pointer compression is atypical | Richard Cownie | 2012/08/23 08:44 AM |
Pointer compression is atypical | Howard Chu | 2012/08/23 05:17 PM |
Pointer compression is atypical | anon | 2012/08/23 08:15 PM |
Pointer compression is atypical | Howard Chu | 2012/08/23 09:33 PM |
64-bit pointers eat some performance | Foo_ | 2012/08/18 12:09 PM |
64-bit pointers eat some performance | Richard Cownie | 2012/08/18 05:25 PM |
64-bit pointers eat some performance | Richard Cownie | 2012/08/18 05:32 PM |
Page-related benefit of small pointers | Paul A. Clayton | 2012/08/23 08:36 AM |
Number of GPRs | Wilco | 2012/08/17 06:31 AM |
Number of GPRs | Kenneth Jonsson | 2012/08/17 11:54 AM |
Number of GPRs | Exophase | 2012/08/17 12:44 PM |
Number of GPRs | Kenneth Jonsson | 2012/08/17 01:22 PM |
Number of GPRs | Wilco | 2012/08/17 02:53 PM |
What about dynamic utilization? | Exophase | 2012/08/17 09:30 AM |
Compiler vs. assembly aliasing knowledge? | Paul A. Clayton | 2012/08/17 10:20 AM |
Compiler vs. assembly aliasing knowledge? | Exophase | 2012/08/17 11:09 AM |
Compiler vs. assembly aliasing knowledge? | anon | 2012/08/18 02:23 AM |
Compiler vs. assembly aliasing knowledge? | Ricardo B | 2012/08/19 11:02 AM |
Compiler vs. assembly aliasing knowledge? | anon | 2012/08/19 06:07 PM |
Compiler vs. assembly aliasing knowledge? | Ricardo B | 2012/08/19 07:26 PM |
Compiler vs. assembly aliasing knowledge? | anon | 2012/08/19 10:03 PM |
Compiler vs. assembly aliasing knowledge? | anon | 2012/08/20 01:59 AM |
Number of GPRs | David Kanter | 2012/08/17 12:46 PM |
RAT issues as part of reason 1 | Paul A. Clayton | 2012/08/17 02:18 PM |
Number of GPRs | name99 | 2012/11/17 06:37 PM |
Large ARFs increase renaming cost | Paul A. Clayton | 2012/11/17 09:23 PM |
Number of GPRs | David Kanter | 2012/08/16 03:31 PM |
Number of GPRs | Richard Cownie | 2012/08/16 05:17 PM |
32 GPRs ~2-3% | Paul A. Clayton | 2012/08/16 06:27 PM |
Oops, Message-ID: aaed6e38-c7bd-467e-ba41-f40cf1020e5e@googlegroups.com (NT) | Paul A. Clayton | 2012/08/16 06:29 PM |
32 GPRs ~2-3% | Exophase | 2012/08/16 10:06 PM |
R31 as SP/zero is kind of neat (NT) | Paul A. Clayton | 2012/08/17 06:23 AM |
32 GPRs ~2-3% | rwessel | 2012/08/17 08:24 AM |
32 GPRs ~2-3% | Exophase | 2012/08/17 09:16 AM |
32 GPRs ~2-3% | Max | 2012/08/17 04:19 PM |
32 GPRs ~2-3% | name99 | 2012/11/17 07:43 PM |
Number of GPRs | mpx | 2012/08/17 01:11 AM |
Latency and power | Paul A. Clayton | 2012/08/17 06:54 AM |
Number of GPRs | bakaneko | 2012/08/17 03:09 AM |
New Article: ARM Goes 64-bit | Steve | 2012/08/17 02:12 PM |
New Article: ARM Goes 64-bit | David Kanter | 2012/08/19 12:42 PM |
New Article: ARM Goes 64-bit | Doug S | 2012/08/19 02:02 PM |
New Article: ARM Goes 64-bit | Anon | 2012/08/19 07:16 PM |
New Article: ARM Goes 64-bit | Steve | 2012/08/30 07:51 AM |
Scalar vs Vector registers | Robert David Graham | 2012/08/19 05:19 PM |
Scalar vs Vector registers | David Kanter | 2012/08/19 05:29 PM |
New Article: ARM Goes 64-bit | Baserock ARM servers | 2012/08/21 04:13 PM |
Baserock ARM servers | Sysanon | 2012/08/21 04:14 PM |
A-15 virtualization and LPAE? | Paul A. Clayton | 2012/08/21 06:13 PM |
A-15 virtualization and LPAE? | Anon | 2012/08/21 07:13 PM |
Half-depth advantages? | Paul A. Clayton | 2012/08/21 08:42 PM |
Half-depth advantages? | Anon | 2012/08/22 03:33 PM |
Thanks for the information (NT) | Paul A. Clayton | 2012/08/22 04:04 PM |
A-15 virtualization and LPAE? | C. Ladisch | 2012/08/23 11:12 AM |
A-15 virtualization and LPAE? | Paul | 2012/08/23 03:17 PM |
Excessive pessimism | Paul A. Clayton | 2012/08/23 04:08 PM |
Excessive pessimism | David Kanter | 2012/08/23 05:05 PM |
New Article: ARM Goes 64-bit | Michael S | 2012/08/22 07:12 AM |
BTW, Baserock==product, Codethink==company (NT) | Paul A. Clayton | 2012/08/22 08:56 AM |
New Article: ARM Goes 64-bit | Reinoud Zandijk | 2012/08/21 11:27 PM |
New Article: ARM Goes 64-bit | Robert Pearson | 2021/07/26 09:11 AM |
New Article: ARM Goes 64-bit | anon | 2021/07/26 11:03 AM |
New Article: ARM Goes 64-bit | none | 2021/07/26 11:45 PM |
New Article: ARM Goes 64-bit | dmcq | 2021/07/27 07:36 AM |
New Article: ARM Goes 64-bit | Chester | 2021/07/27 01:21 PM |
New Article: ARM Goes 64-bit | none | 2021/07/27 10:37 PM |
New Article: ARM Goes 64-bit | anon | 2021/07/26 11:04 AM |