You can do two 4-cycle loads per cycle

By: Travis Downs (travis.downs.delete@this.gmail.com), September 20, 2018 3:33 pm
Room: Moderated Discussions
Wilco (Wilco.dijkstra.delete@this.ntlworld.com) on September 20, 2018 2:32 am wrote:

> The 4-cycle path might only work within the load/store unit because of timing.
> Forwarding within a unit is faster than from a different unit.
>
> However it's not clear this is the case, you need sequences like alu->load5->load->alu
> to check whether the load->alu latency can ever be 4 cycles.

I wrote two tests, with a 5-cycle load feeding a 4-cycle load then ALU op, and swapping the two loads (both are looped, so the the alu op feeds back into the load).

The results:


load5 -> load4 -> alu 10.00
load4 -> load5 -> alu 11.00


The alu 1 op was one cycle. So it definitely seems to be that the input of the fast load comes from another load, not the output.
< Previous Post in ThreadNext Post in Thread >
TopicPosted ByDate
4-cycle L1 latency on Intel not as general as thoughTravis Downs2018/09/17 04:32 PM
  4-cycle L1 latency on Intel not as general as thoughanon2018/09/18 02:43 AM
    4-cycle L1 latency on Intel not as general as thoughtTravis Downs2018/09/18 09:39 AM
      4-cycle L1 latency on Intel not as general as thoughtanon2018/09/18 10:53 AM
        4-cycle L1 latency on Intel not as general as thoughtTravis Downs2018/09/18 11:07 AM
          4-cycle L1 latency on Intel not as general as thoughtanon2018/09/18 11:51 AM
            4-cycle L1 latency on Intel not as general as thoughtTravis Downs2018/09/18 01:52 PM
              4-cycle L1 latency on Intel not as general as thoughtanon2018/09/19 02:40 AM
                4-cycle L1 latency on Intel not as general as thoughtTravis Downs2018/09/19 05:20 PM
                  4-cycle L1 latency on Intel not as general as thoughtSeni2018/09/19 10:28 PM
                    4-cycle L1 latency on Intel not as general as thoughtGabriele Svelto2018/09/20 05:16 AM
                      4-cycle L1 latency on Intel not as general as thoughtTravis Downs2018/09/20 02:25 PM
                        4-cycle L1 latency on Intel not as general as thoughtGabriele Svelto2018/09/21 02:46 AM
                  4-cycle L1 latency on Intel not as general as thoughtanon2018/09/20 08:40 AM
                    4-cycle L1 latency on Intel not as general as thoughtTravis Downs2018/09/20 03:01 PM
    You can do two 4-cycle loads per cycleTravis Downs2018/09/18 10:58 AM
      You can do two 4-cycle loads per cycleanon2018/09/18 11:53 AM
        You can do two 4-cycle loads per cycleTravis Downs2018/09/18 12:29 PM
          You can do two 4-cycle loads per cycleanon2018/09/18 01:27 PM
            You can do two 4-cycle loads per cycleWilco2018/09/18 02:37 PM
              You can do two 4-cycle loads per cycleanon2018/09/19 02:45 AM
                You can do two 4-cycle loads per cycleTravis Downs2018/09/19 05:30 PM
                  You can do two 4-cycle loads per cycleanon2018/09/20 01:34 AM
                    You can do two 4-cycle loads per cycleWilco2018/09/20 02:32 AM
                      You can do two 4-cycle loads per cycleanon2018/09/20 04:35 AM
                      You can do two 4-cycle loads per cycleTravis Downs2018/09/20 03:33 PM
                    You can do two 4-cycle loads per cycleTravis Downs2018/09/20 03:10 PM
            You can do two 4-cycle loads per cycleTravis Downs2018/09/18 03:08 PM
              You can do two 4-cycle loads per cycleGabriele Svelto2018/09/19 01:39 AM
                You can do two 4-cycle loads per cycleTravis Downs2018/09/19 05:43 PM
              You can do two 4-cycle loads per cycleanon2018/09/19 02:42 AM
                You can do two 4-cycle loads per cycleTravis Downs2018/09/19 06:09 PM
                  You can do two 4-cycle loads per cycleanon2018/09/20 01:49 AM
                    You can do two 4-cycle loads per cycleTravis Downs2018/09/20 04:38 PM
                    You can do two 4-cycle loads per cycleTravis Downs2018/09/20 07:27 PM
                      You can do two 4-cycle loads per cycleanon2018/09/21 08:08 AM
            Separate RS for ALU vs load/storeTravis Downs2018/12/13 12:55 PM
              Separate RS for ALU vs load/storeanon2018/12/13 02:14 PM
              Separate RS for ALU vs load/storeanon.12018/12/13 09:15 PM
                Separate RS for ALU vs load/storeWilco2018/12/14 04:41 AM
                  Separate RS for ALU vs load/storeanon.12018/12/14 08:08 AM
                    Separate RS for ALU vs load/storeWilco2018/12/14 01:51 PM
              Integer divide also var latencyDavid Kanter2018/12/14 11:45 AM
                Integer divide also var latencyTravis Downs2018/12/14 09:09 PM
              Separate RS for ALU vs load/storeanon22018/12/14 09:57 PM
                Separate RS for ALU vs load/storeTravis Downs2018/12/15 11:00 AM
Reply to this Topic
Name:
Email:
Topic:
Body: No Text
How do you spell avocado?