Data-dependent instruction latency

By: Peter E. Fry (pfry.delete@this.tailbone.net), August 4, 2018 7:14 am
Room: Moderated Discussions
Travis (travis.downs.delete@this.gmail.com) on August 3, 2018 1:34 pm wrote:
[...]
> There are many instructions that already have different latencies (and uop counts) depending
> on some value, but the existing examples that I know of are all based on immediate values in
> the instruction so can be sorted out by the decoders. Examples include adc with a 0 immediate
> (1 uop vs 2 on SnB to Haswell), certain shift instructions with 0 or 1 immediate, etc.
>
> Is this useful information in practice for optimization?
>
> Probably not, or only very rarely. [...]

Going back a few years, I had a test case on the AMD K8/K10 where MUL had three distinct latencies depending on the operand values: one factor equal to 0 or 1; one factor a power of 2; everything else. I discovered it via some poorly formed test data (all 0s), which made me think I had some magically fast code. My K10 board is on a shelf in the closet, so I can't check my sanity at the moment (yes, it is suspect). I don't have any later AMD chips to test.
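For anyone who wants to try reproducing this, here is a rough sketch of the kind of test I mean. It is not my original test case: the names mul_chain and ns_per_mul are made up for illustration, it assumes GCC or Clang (for the empty inline asm barriers) and a POSIX clock_gettime, and it uses the compiler's ordinary 64-bit multiply rather than a specific MUL encoding. Each multiply feeds the next one through an AND with an opaque zero plus an ADD, so the chain is serially dependent while the operand values stay fixed. That adds a constant couple of cycles per step, so only the differences between factors are meaningful.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Serially dependent multiply chain. Every iteration computes x * factor
   with the same operand values; the product is folded back into x through
   (product & zero), which is a real data dependency but leaves x unchanged.
   The empty asm keeps the optimizer from treating factor and zero as known
   constants, so a genuine multiply is emitted each time around. */
static uint64_t mul_chain(uint64_t x, uint64_t factor, uint64_t n)
{
    uint64_t zero = 0, product = 0;
    for (uint64_t i = 0; i < n; i++) {
        __asm__ volatile("" : "+r"(factor), "+r"(zero));
        product = x * factor;
        x += product & zero;
    }
    return product;
}

/* Wall-clock nanoseconds per chain step (multiply + and + add). */
static double ns_per_mul(uint64_t x, uint64_t factor, uint64_t n)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    uint64_t r = mul_chain(x, factor, n);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    __asm__ volatile("" : : "r"(r));   /* keep the result live */
    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)n;
}

#ifdef MUL_LATENCY_DEMO
int main(void)
{
    /* The three operand classes from the post, plus an arbitrary value. */
    const uint64_t factors[] = { 0, 1, UINT64_C(1) << 20,
                                 UINT64_C(0x9E3779B97F4A7C15) };
    const uint64_t n = UINT64_C(100000000);
    for (size_t i = 0; i < sizeof factors / sizeof factors[0]; i++)
        printf("factor %#llx: %.3f ns/step\n",
               (unsigned long long)factors[i],
               ns_per_mul(UINT64_C(0x123456789), factors[i], n));
    return 0;
}
#endif
```

Build with something like -O2 -DMUL_LATENCY_DEMO and pin the process to one core if you can. If all four lines come out the same (as I would expect on anything recent), the multiplier isn't looking at the data; wall-clock time is also at the mercy of frequency scaling, so cycle counters would be the better measurement.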

Are these sorts of things documented in one place somewhere?

Not really related, but I've run straight into the limitations of static analysis (that is, staring at code). I have two mysteries at the moment: BSF running faster than it should on Haswell, and two sets of sequences, compiled with GCC and Clang, that have identical instruction counts but run... differently than I would expect. Apparently performance counters are the only way these days.