ISA support for constant count loops: ineffective compared to micro-threads

By: (0xe2.0x9a.0x9b.delete@this.gmail.com), January 22, 2020 9:18 am
Room: Moderated Discussions
Etienne (etienne_lorrain.delete@this.yahoo.fr) on January 21, 2020 1:42 am wrote:
> ⚛ (0xe2.0x9a.0x9b.delete@this.gmail.com) on January 20, 2020 7:02 am wrote:
> > For example: Add to x86 CPUs instructions to start in a single thread 16 iterations of a particular loop in
> > parallel with hardware support for data dependency resolution
> > across the 16 iterations in order to make those
> > 16 iterations look like they executed sequentially
>
> So you want to dynamically switch your latency optimised processor to a streaming processor at run
> time, starting to fetch 16 times the first parameter of the loop before executing the second instruction
> of the loop

The CPU can fetch the first parameter of the loop 16 times up front only if it can prove that none of the first 15 iterations modifies it.
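
To make the condition concrete, here is a minimal C sketch (not a hardware design). All function names are hypothetical and chosen only for illustration. The loop parameter is read through a pointer, and reading it once for all iterations is valid only when no iteration can store to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero if a store to dst[0..n-1] could overwrite *scale. */
int store_may_alias(const int *dst, const int *scale, size_t n) {
    uintptr_t lo = (uintptr_t)dst;
    uintptr_t hi = (uintptr_t)(dst + n);
    uintptr_t p = (uintptr_t)scale;
    return p >= lo && p < hi;
}

/* Sequential semantics: the parameter is reloaded in every iteration. */
void scale_sequential(int *dst, const int *scale, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = dst[i] * *scale;
}

/* Parameter fetched once up front for all iterations.
   Precondition: no iteration stores to *scale. */
void scale_hoisted(int *dst, const int *scale, size_t n) {
    assert(!store_may_alias(dst, scale, n));
    int s = *scale;
    for (size_t i = 0; i < n; i++)
        dst[i] = dst[i] * s;
}

/* Use the hoisted form only when the proof succeeds. */
void scale_auto(int *dst, const int *scale, size_t n) {
    if (store_may_alias(dst, scale, n))
        scale_sequential(dst, scale, n);
    else
        scale_hoisted(dst, scale, n);
}
```

With dst = {2, 3, 4, 5} and scale pointing at dst[0], the first iteration overwrites the parameter, so the sequential loop and a loop that fetched the parameter once would produce different results. scale_auto falls back to the sequential form in that case.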

> - but switch back to a latency optimised processor if there is a page fault?

When the 9th of 16 ordered, concurrently executing memory instructions raises a page fault, it is trivial to stop the 7 younger instructions (the 10th through the 16th) from fetching memory, provided the fault is detected in a pipeline stage that precedes the memory fetch stage.
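
A rough software model of that squash logic, assuming a 16-entry ordered window in which bit i of a mask stands for the i-th oldest memory instruction (0-based). The names and the bitmask encoding are assumptions made for this sketch, not a description of any real pipeline.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW 16u

/* fault_mask: bit i set means the i-th oldest instruction faulted during
   address translation. Returns the mask of instructions still allowed to
   reach the memory fetch stage: only those strictly older than the oldest
   faulting instruction. */
uint16_t fetch_allowed(uint16_t fault_mask) {
    unsigned m = fault_mask;
    if (m == 0)
        return 0xFFFFu;                 /* no fault: all 16 may fetch */
    unsigned lowest = m & (~m + 1u);    /* isolate the oldest faulting bit */
    return (uint16_t)(lowest - 1u);     /* keep only the bits below it */
}

/* Number of younger instructions held back behind a fault at fault_index. */
unsigned younger_squashed(unsigned fault_index) {
    assert(fault_index < WINDOW);
    return WINDOW - 1u - fault_index;
}
```

For a fault on the 9th instruction (index 8), fetch_allowed(1u << 8) yields 0x00FF: the 8 older instructions proceed and everything from the faulting instruction onward is held back.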

> Or do you want to connect 16 processors to the same layer 1 memory cache to stop memory lines bouncing
> in between processors, which would result in a major slow-down compared to the linear treatment.

Registers do not "bounce around" cores.

It is possible to start with 4 L1D cache load ports per core. There is no need to start with 32 L1D data cache load ports in the implementation.

> Note that with an out-of-order processor you may already process a few iterations in parallel right now...

Only if you can statically (at compile-time) prove that the few iterations are data independent.
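
For illustration, a small C sketch of what "data independent" means for loop iterations. The function names and the reverse flag are hypothetical: the flag only changes the order in which iterations run, as a stand-in for executing them out of sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Independent iterations: iteration i reads only b[i] and writes only a[i].
   Precondition: a and b are distinct, non-overlapping arrays. */
void add_one(int *a, const int *b, size_t n, int reverse) {
    assert(a != b);
    for (size_t k = 0; k < n; k++) {
        size_t i = reverse ? n - 1 - k : k;
        a[i] = b[i] + 1;
    }
}

/* Loop-carried dependency: iteration i reads a[i-1], which iteration i-1
   wrote. Running the iterations in another order changes the result. */
void running_sum(int *a, const int *b, size_t n, int reverse) {
    for (size_t k = 1; k < n; k++) {
        size_t i = reverse ? n - k : k;
        a[i] = a[i - 1] + b[i];
    }
}
```

add_one gives the same array whichever way the iterations are ordered, while running_sum does not. The second kind of loop is where dependency resolution across iterations would be needed.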

> IMHO if current processors are too slow, one should refine
> the "too slow" and explain the cases where it is.
> - Maybe the software being used is really badly designed for current architectures, and should be optimised

This is always true of all software not executing on custom-designed task-specific silicon.

> in some ways (reduce number of adaptation layers, reduce just-in-time compilations, reduce number of
> times the same functions are called with the same arguments and leads to the same result..).

You can assume that such optimizations have already been applied before the CPU sees the user instructions.

> - Maybe the problem is highly parallel and two very close processors (one
> latency optimised, one streaming optimised) should be connected in a simple
> and efficient manner, with easy to use tools - maybe even the same ISA.

The Linux kernel isn't prepared to take advantage of heterogeneous x86 CPUs.

> - Maybe the problem is extremely specific and needs coprocessor, preferably usable legally at a
> price the final user accepts to pay (usually $0.00, including hardware) like 4K MPEG display.

$0.00 is not possible.

> - Maybe the problem is memory search (databases?) and some special devices to scan DRAM without
> trashing memory caches, or do complex DMA with on the flight treatment would be helpful.

Persuade AMD and Intel to put it in x86 CPUs and RAMs.

> - Maybe that intelligent DMA can use a programmable processor with
> local 1 cycle memory and extended read-write bus controls?
> - Maybe the problem is lack of useable memory at some random times, and to solve the problem for
> a price the user accepts to pay (strictly $0.00 including hardware), one should overallocate.

$0.00 is not possible.

-atom