Itanium and static vs dynamic

By: Paul A. Clayton (paaronclayton.delete@this.gmail.com), January 9, 2019 8:51 am
Room: Moderated Discussions
Heikki Kultala (heikki.kultala.delete@this.tuni.fi) on January 9, 2019 2:14 am wrote:
> Michael S (already5chosen.delete@this.yahoo.com) on January 8, 2019 6:44 am wrote:
> > Heikki Kultala (heikki.kultala.delete@this.tuni.fi) on January 8, 2019 3:19 am wrote:
[snip]
>>> Itanic was _never_ fully statically scheduled. ALL models of Itanic have been dynamically scheduled.
>>
>> Itanium never had exposed pipeline, yes. In presence of scheduling conflicts a
>> HW assures correct answer by means of pipeline stalls/interlocks or replay.
>> Other than that it's still very much statically scheduled.
>
> The IA-64 instruction set only defines that the hardware is free to execute all operations in a
> bundle in parallel. AFAIK there is absolutely nothing in the instruction set forcing it.

In fact, the instruction set specifically requires that single-stepping be possible.

(Side comment: the "pushing" VLIWs, where operations only made progress when a subsequent operation in its lane demanded the stage, seem really weird and software driven.)

> On some other, more static architectures, the instruction set forces the schedule because
> for example some values get overwritten if they are not read at the exactly correct time.
>
> And multiple different IA-64 microarchitectures can execute the same IA-64 code, with
> very different schedules. With static scheduling this would not be possible.
>
>> With that I agree.
>> Apart from non-exposed pipeline IPF has no register banks that one would expect in VLIW of
>> similar width.
>
> The Itanium processor microarchitectures might even have register banks,
> but there are no register banks exposed in the instruction set.

In a very limited sense, the load/store pair instruction exposes FP register banks since the two registers must be even/odd (or odd/even) architectural registers. I.e., the compiler must be sure not to have register rotation causing an odd/odd or even/even pairing.

I do not think any implementation exploited this.

> Also typically in VLIW the operations are bound to execution units statically, and this is visible
> in the instruction word. In EPIC they are not, the instruction word only contains which operation
> is to be executed, and hardware is free to select the execution units for operations dynamically.

Itanium did try to provide simpler operation/instruction routing with the template bits, so it might be considered an intermediate design point between a traditional VLIW and an arbitrary RISC operation stream with stop bits. (The defined templates allow at most one FP op per bundle, which made designs targeting FP-compute-intensive workloads more difficult because of the code bloat: a three-FP-op-per-cycle design would require decoding three bundles per cycle. Template constraints probably also introduced a significant number of nops even for "balanced" code.)
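The bundle arithmetic in the parenthetical can be sketched as follows. This is a rough lower-bound model, not an IA-64 bundler: it assumes only three slots per bundle and the at-most-one-FP-op-per-bundle limit mentioned above, and it ignores every other template restriction. The function names are hypothetical.

```python
import math

SLOTS_PER_BUNDLE = 3    # an IA-64 bundle carries three operation slots
MAX_FP_PER_BUNDLE = 1   # limit described in the post; other template rules ignored


def min_bundles(fp_ops: int, other_ops: int) -> int:
    """Lower bound on bundles needed for a set of independent operations."""
    total = fp_ops + other_ops
    by_slots = math.ceil(total / SLOTS_PER_BUNDLE)    # limited by slot count
    by_fp = math.ceil(fp_ops / MAX_FP_PER_BUNDLE)     # limited by FP slots
    return max(by_slots, by_fp)


def nop_slots(fp_ops: int, other_ops: int) -> int:
    """Slots that must be padded with nops under the same simplified model."""
    used = min_bundles(fp_ops, other_ops) * SLOTS_PER_BUNDLE
    return used - (fp_ops + other_ops)


if __name__ == "__main__":
    # Three FP ops with nothing to pair them with: three bundles, six nops.
    print(min_bundles(3, 0), nop_slots(3, 0))
    # Three FP ops plus six non-FP ops: still three bundles, but no padding.
    print(min_bundles(3, 6), nop_slots(3, 6))
```

Under this model pure-FP code pays two padding slots per FP op, which is the code bloat referred to above; real code would do no better, since the actual templates add further restrictions.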

> In the Mill operations are bound to execution units statically, like in VLIW, unlike EPIC.

Are you sure about that? I know the Mill does not allow oversubscription (Itanium allows structural hazards that hardware must handle), but I had the impression that the instruction encoding did not specify the operation routing as much as traditional VLIWs did (where each operation slot in an instruction word specified an execution lane). I seem to recall that operations were grouped not by type but by length to facilitate decode of a large number of operations while providing decent code density.

This distinction is not particularly significant when the pipelined execution units have similar functionality, but it could be used to support concurrent execution of machine checking software where the machine checking software rotates through the execution units. (In a strict sense, expand-toward-the-middle two-thread SMT would violate this since one thread would have its lanes inverted. As currently conceived, the Mill could not use this form of SMT because interlocks are intended to be expensive and variable latency results would introduce unintended complications; this form of SMT depends on the ability to stall one thread under easily detected oversubscription.)

[snip]
> > By my definition Mill certainly is static.
> > But by your own super-strict definition, Mill is not 100% static, since pick-up loads are interlocked.

All loads are interlocked since they may not meet their declared latency schedule due to cache misses.
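As a toy illustration of that interlock (not a description of any real Mill implementation), the sketch below assumes a whole-pipeline stall: a load retires at its issue cycle plus its declared latency, and if the data has not arrived by then every lane waits. The `Load` fields and `wall_clock_timeline` are hypothetical names, and declared latencies are assumed to be at least one cycle.

```python
from dataclasses import dataclass


@dataclass
class Load:
    issue_cycle: int       # schedule cycle in which the load issues
    declared_latency: int  # latency the compiler scheduled for (assumed >= 1)
    actual_latency: int    # latency observed at run time, e.g. after a miss


def wall_clock_timeline(loads, schedule_length):
    """Map each schedule cycle to the wall-clock cycle in which it executes."""
    wall = 0
    ready_at = {}   # load index -> wall-clock cycle when its data arrives
    timeline = []
    for cycle in range(schedule_length):
        # Interlock: stall everything until loads retiring now have their data.
        for i, ld in enumerate(loads):
            if ld.issue_cycle + ld.declared_latency == cycle:
                wall = max(wall, ready_at[i])
        timeline.append(wall)
        # Loads issuing in this schedule cycle start their memory access now.
        for i, ld in enumerate(loads):
            if ld.issue_cycle == cycle:
                ready_at[i] = wall + ld.actual_latency
        wall += 1
    return timeline


if __name__ == "__main__":
    hit = [Load(issue_cycle=0, declared_latency=3, actual_latency=3)]
    miss = [Load(issue_cycle=0, declared_latency=3, actual_latency=10)]
    print(wall_clock_timeline(hit, 5))   # no stall
    print(wall_clock_timeline(miss, 5))  # seven lost cycles before cycle 3
```

In this model a miss only inserts dead wall-clock cycles before the retire point; the order and spacing of operations in schedule cycles are unchanged.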

Since loads are interlocked, it would also be possible to have variable latency L1 cache accesses. For latency-critical loads, the compiler could schedule for the best possible case (correct way prediction/way memoization hit, sub-block/block-based NUCA fast case, no bank conflicts, etc.) while other loads might be scheduled with the bad-case L1 hit latency (e.g., to allow more critical bank-conflicting loads to proceed). I doubt that any early implementations of the Mill will support variable L1 latency or that the specializer (final compiler stage) will exploit such variability.

> When the interlocking locks the (practically) whole processor pipeline, I consider that still fully static.
> When clock cycles are just lost, the relative cycles between all the instructions are still the same.

While the Mill is architecturally fully static by your definition, I do not think that limited out-of-order completion would be excessively difficult to implement (the operand naming is straightforward and use of a renaming feature (to support moves and function calls) is already intended). (Function calls also are intended to support lazy save, if I recall correctly, exploiting a larger belt storage than architecturally defined.) Early-out multiplication for address generation seems likely to be "easy" since the variability in multiply latency could be folded into the architecturally supported variability in load latency. Early branch resolution seems possible (though perhaps not especially useful) and modestly delayed branch resolution might not be excessively difficult (buffering results for an extra cycle or two).

In theory, a special None/Not-a-Thing could be used to generate a replay exception for overly optimistic schedules. Fast replay might be supported to facilitate handling of transient errors. (This is contrary to the philosophy behind the Mill, but it is a lesser (and less useful) concession than full interlocks/out-of-order completion.) If a delayed result is not used "non-speculatively" before it is actually available, a replay would not be triggered. (Early triggering of a replay may be desirable. With short branches "predicated" using a conditional selection instruction, a significant number of operation results may be unused. In addition, as mentioned before, results used for loads, stores, and possibly branches could be delayed with possibly tolerable additional complexity.)

With the reasonably flexible software distribution format, some changes in design philosophy would be practical (i.e., not requiring information unavailable in the distribution format). In addition, the architectural availability of an instruction metadata side channel (preserved across boot-ups) might facilitate the specializer performing optimizations not practical for a purely static design.

I am not a fan of the Mill (I think that a more dynamic orientation is better and that more explicitly limited/directed (versus the typical implicit any-to-any) communication is desirable), but I could see it being a useful architecture when "near OoOE" performance for general purpose code is desired while the main targeted workload has significant ILP (or pseudo-ILP where delayed filtering predication is better than prediction or selective execution). If the target also benefits substantially from specialized functionality either within the general purpose processor or as accelerators, the less custom design orientation might be less of a penalty.

(I also think that exclusively being a chip vendor is a worse business strategy than selling intellectual content as well as chips. For the kinds of chips likely to be targeted, there would probably not be much danger of competition for content licensees (or perceived risk of future competition) and the extra income could be useful. (I seem to recall that early MIPS had difficulty being a merchant chip and system vendor; supporting a competitor, even if it makes financial sense, is difficult to accept.) Focusing on chips first makes sense to prove the concept and establish a presence, but I get the impression that there may be a persistent bias toward this potentially more profitable market, letting some potential profit escape. Note: I am even less of a business analyst than a hardware designer!)