Future core architectures. (Was Haskell Compilation Improvement)

By: Seni (seniike.delete@this.hotmail.com), January 3, 2014 1:09 pm
Room: Moderated Discussions
CFP (continual flow pipeline)
CPR (checkpoint / recovery)
WIB (waiting instruction buffer)

These are basically aggressive variants of the traditional OOO mechanism.
They have good performance but bad power efficiency... which is why we haven't seen any actually built. If they can solve the efficiency issues, this could be the way forward.

The principal idea of CFP is to have a really big instruction window to find more parallelism, but all the far-ahead instructions are past multiple unpredictable branches, and thus have a very high chance of being wrong-path. As such, executing them tends to be a waste of power.
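
To put a rough number on the wrong-path problem, here is a minimal sketch. It assumes every intervening branch is predicted independently and with the same accuracy, which real code does not obey; the function name and the 95% accuracy figure are purely illustrative assumptions, not measurements.

```python
def on_path_probability(branch_accuracy: float, branches_ahead: int) -> float:
    """Chance that an instruction is still on the correct path.

    Assumption: each of the `branches_ahead` branches between the oldest
    instruction and this one is predicted independently with probability
    `branch_accuracy` of being right.
    """
    if not 0.0 <= branch_accuracy <= 1.0:
        raise ValueError("branch_accuracy must be between 0 and 1")
    if branches_ahead < 0:
        raise ValueError("branches_ahead must be non-negative")
    return branch_accuracy ** branches_ahead


if __name__ == "__main__":
    # Illustrative only: 0.95 is an assumed per-branch accuracy.
    for n in (1, 10, 50, 100):
        print(f"{n:4d} branches ahead: {on_path_probability(0.95, n):.4f}")
```

Under these assumptions the on-path probability decays geometrically with window depth, so the deeper the window reaches, the larger the share of executed work that gets thrown away.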

Runahead execution is an alternative championed by SUN. It uses the core itself as a kind of aggressive prefetcher. It has the same problem - fast but hot.

Like any aggressive prefetcher, it frequently prefetches nonsense, wasting a bunch of energy on memory bandwidth for very little gain. Also, many useful calculations are done, thrown away, and then re-done, which means a lot of register file ports burning up more power.

I haven't checked out DOE yet, but I expect it to fall in the same fast-but-hot category.

Maynard Handley (name99.delete@this.name99.org) on January 3, 2014 12:06 am wrote:
> Maynard Handley (name99.delete@this.redheron9.com) on December 31, 2013 5:26 pm wrote:
> > Summary: I think it's true (and mostly agreed) that SW prefetching is dead. What's not widely known is
> > the extent of interesting HW replacements, or the extent to which any of these are yet implemented.
> In the context of what I wrote earlier, there seems to be an emerging consensus around future
> architectures that takes the form of slipping long-latency instructions (and their chains of
> dependent instructions, possibly hundreds to thousands of instructions long) aside to get at
> the independent instructions that can run. The details are now only in exactly how this is done
> so as to replace the ROB with as low power, low area, and low complexity as possible.
> Along these lines we have FlowForward, CPR (Checkpoint Processing and Recovery) and its successor/amplification
> CFP (Continual Flow Processing) and now I see DOE, Disjoint Out-of-Order Execution
> http://j92a21b.ee.ncku.edu.tw/broad/report100/2012-12-24/Disjoint%20OOO%20Execution%20Proc%2012.pdf
> Looked at from a thousand miles up, this last one, in particular, sounds somewhat like Apple's infamous MacroScalar
> stuff I mentioned in my last post... The details and the concentration differs, sure, but the abstract view
> seems, in all these cases, to be to create, on the fly, long long LONG chains of instructions such that all
> instructions in a chain are dependent, but the various chains are independent of each other (except to the extent
> that they fork from the occasional starting point, and join again at join point). Once you have these chains,
> it's a somewhat orthogonal question whether you run them on the same "CPU" (CFP, FlowForward), on kinda sorta
> but not quite the same CPU (MacroScalar), or different CPUs (kinda sorta the DOE stuff).
> Does anyone have an opinion on how real this stuff is? The CPR/CFP/DOE chain of ideas is
> based on people at Intel, but that obviously doesn't mean Intel are ready to bet on it.
> Part of me thinks it would take working silicon from a university (kinda like
> SPARC/MIPS in the 80s) to really validate the idea and make it worth swapping
> in for the tried and trusted OoO ROB engines that everyone uses today.
> And part of me hopes that Apple, as the one player in this space that has not been burned by over-ambition
> on the CPU front, might just be audacious enough to look at these numbers ("hmm, we can get a CPU that's about
> 50% faster than our existing A7, in smaller area and lower power, and that will scale well to higher frequencies.
> Hell, let's take on Intel and ARM head-on and go into the business of selling CPUs to everyone")
> Though I'd be just as happy if nVidia or Qualcomm or AMD were desperate enough to make
> a splash that they took these ideas and ran with them. I've been looking at a bunch of
> ideas for how to handle memory latency, and this collection seem the most promising.
> What worries me is that the gap between our optimized OoO engines today and the retooling you'd need for
> these alternative ideas is so large that it's not easy to get from here to there. I THINK you could do it
> in stages by starting with a simple (hah!) OoO core like an ARM9 or ARM15 and initially just replacing the
> ROB with checkpoints. With that working and giving you, say, 20%, you could then, next generation, replace
> the instruction window/scheduler with the CFP data buffers, giving you another 20% or so, and another sellable
> product, then finally add in the DOE weirdness to add a few more percent and speed up multi-threaded apps.
> But even with this slicing up, each stage is a fairly ambitious engineering project...
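
The "slip the dependent chain aside" idea in the quoted post can be sketched as a toy model. This is not how CFP or any real design is implemented: the instruction format, the function name `split_on_miss`, and the use of architectural register names (real hardware would track renamed physical registers) are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical toy instruction: (destination register, source registers).
Instr = Tuple[str, Tuple[str, ...]]


def split_on_miss(program: List[Instr], miss_dest: str) -> Tuple[List[int], List[int]]:
    """Split the instructions following a cache-missing load.

    `miss_dest` is the register the missing load will eventually write.
    Returns (deferred, runnable) lists of instruction indices:
    deferred instructions depend, directly or transitively, on the
    missing value; runnable ones do not and can execute right away.
    """
    poisoned: Set[str] = {miss_dest}
    deferred: List[int] = []
    runnable: List[int] = []
    for index, (dest, sources) in enumerate(program):
        if any(src in poisoned for src in sources):
            # Part of the dependent chain: set it aside and poison its result.
            deferred.append(index)
            poisoned.add(dest)
        else:
            # Independent work: it runs, and its result is a clean value.
            runnable.append(index)
            poisoned.discard(dest)
    return deferred, runnable


if __name__ == "__main__":
    # A load into r1 has missed; these are the instructions after it.
    program: List[Instr] = [
        ("r2", ("r1",)),       # depends on the miss
        ("r3", ("r4",)),       # independent
        ("r5", ("r2", "r3")),  # depends on the miss through r2
        ("r6", ("r3",)),       # independent
    ]
    print(split_on_miss(program, "r1"))
```

The deferred list corresponds to the chain that gets parked until the miss returns, while the runnable list is the independent work the core can get at in the meantime. How cheaply the parked chain can be stored and replayed is where the proposals differ.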
