Agner Manual Updated for Zen 3

By: Chester (lamchester.delete@this.gmail.com), February 3, 2021 1:33 am
Room: Moderated Discussions
Manual here. Notable improvements from Zen 2:
- Renamer can now handle 6 instructions/c. Zen 2 could only do 5 instructions/c or 6 uops/c.
- FMA latency reduced to 4 clocks. Zen 2 had 5c FMA latency.
- cmp/test/add/sub/and/or/xor/inc/dec can fuse with a conditional jump. Zen 2 could only fuse cmp/test
- Fetch bandwidth is 16 bytes per thread in SMT mode, so 32 bytes per cycle in total. Agner only measured 24 bytes/c fetch bandwidth on Zen 2.
- Loop counts up to 64 can be perfectly predicted (up from 12? in Zen 2)
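
To put the FMA latency change in perspective, here is a rough back-of-the-envelope sketch in Python. It only models a chain of FMAs where each one depends on the previous result (for example a reduction with a single accumulator) and ignores everything else in the pipeline. The function names are made up for illustration, and the figure of 2 FMAs issued per cycle is my assumption, not something stated in the post.

```python
ZEN2_FMA_LATENCY = 5  # cycles, from the post
ZEN3_FMA_LATENCY = 4  # cycles, from the post
ASSUMED_FMAS_PER_CYCLE = 2  # assumption for illustration, not from the post


def dependent_chain_cycles(n_fmas: int, latency: int) -> int:
    """Lower bound on cycles for n_fmas FMAs that each wait on the previous one."""
    return n_fmas * latency


def accumulators_to_hide_latency(latency: int, fmas_per_cycle: int) -> int:
    """Independent chains needed to keep the FMA units busy (latency x throughput)."""
    return latency * fmas_per_cycle


if __name__ == "__main__":
    for name, lat in (("Zen 2", ZEN2_FMA_LATENCY), ("Zen 3", ZEN3_FMA_LATENCY)):
        chain = dependent_chain_cycles(1000, lat)
        accs = accumulators_to_hide_latency(lat, ASSUMED_FMAS_PER_CYCLE)
        print(f"{name}: 1000 dependent FMAs >= {chain} cycles, "
              f"{accs} independent accumulators to saturate")
```

Under those assumptions, a latency-bound chain gets about 20% faster, and unrolled code needs fewer independent accumulators to reach peak throughput.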

Notable regressions and things that got cut:
- Throughput is lower for dense taken branches. Zen 3 gets reduced branch throughput if there are more than 2 taken branches per 16 bytes of code.
- Agner measured 14c latency on the L2 even though AMD's optimization manual says L2 latency stays at 12c
- Failed store forwarding penalty is higher at 10c, vs 6-7c on Zen 2
- Zen 2's memory renaming/mirroring ability is gone
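
If you want to check whether your own code might hit the dense taken branch case, something like the following Python sketch could be a starting point. It takes a list of addresses of taken branches (which you would have to get from a disassembly or a profile yourself) and reports the 16-byte blocks that contain more than 2 of them. Treating the "16 bytes of code" as aligned 16-byte blocks is my assumption, and the function name and the sample addresses are hypothetical.

```python
from collections import Counter

BLOCK_BYTES = 16          # window size from the post
MAX_TAKEN_PER_BLOCK = 2   # limit from the post


def dense_branch_blocks(taken_branch_addrs, block_bytes=BLOCK_BYTES,
                        limit=MAX_TAKEN_PER_BLOCK):
    """Return start addresses of aligned blocks holding more than `limit` taken branches.

    Assumes aligned blocks; the post does not say how the window is placed.
    """
    counts = Counter(addr // block_bytes for addr in taken_branch_addrs)
    return sorted(block * block_bytes for block, n in counts.items() if n > limit)


if __name__ == "__main__":
    # Hypothetical addresses: three taken branches in the block at 0x1000,
    # one in the block at 0x1010.
    addrs = [0x1000, 0x1004, 0x1008, 0x1010]
    for start in dense_branch_blocks(addrs):
        print(f"more than {MAX_TAKEN_PER_BLOCK} taken branches in block at {start:#x}")
```

This only flags candidates. Whether a flagged block actually costs anything depends on how often those branches are taken at run time.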