Performance "speed limits"

By: Paul A. Clayton, June 11, 2019 5:04 am
Room: Moderated Discussions
Travis Downs on June 11, 2019 1:23 am wrote:
> I wrote something about performance speed limits, which is basically a list of things
> that might limit your code (mostly loops) to a specific level of performance. It's
> quantitative, in that it tells you exactly how many iterations/cycle you'll get if
> you hit a particular limit. I've used it in practice and find it effective.
> Have a read if it interests you. Feedback is welcome - I don't have a comments
> system set up* but you could reply here or open an issue on github.

Thank you, that was fun reading. Although it is not a bottleneck, mentioning branch prediction accuracy might have been appropriate (probably in the "Out of Order Limits" section). Also, technically, "prefetcher friendly access patterns" would not apply to "Memory and Cache Bandwidth" but to "Out of Order Limits" since prefetching does not reduce demand bandwidth (though DRAM page access clustering can increase achieved bandwidth) but avoids stalls related to window size.
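To make the prefetching point concrete, here is a rough sketch using Little's law. Every number in it (line size, latency, peak bandwidth, lines in flight) is an assumption chosen for illustration, not a measurement, and the function names are made up for this example. It only shows that prefetching leaves the peak bandwidth unchanged and instead raises the number of requests in flight, which is the limit set by the out-of-order window.

```python
# Illustrative sketch only: all numbers here are assumptions, not measurements.
LINE_BYTES = 64  # assumed cache line size in bytes


def concurrency_limited_bandwidth(lines_in_flight, latency_ns):
    """Little's law: bytes per ns (numerically GB/s) sustained by a given
    number of outstanding cache line requests at a given latency."""
    return lines_in_flight * LINE_BYTES / latency_ns


def achieved_bandwidth(lines_in_flight, latency_ns, peak_gbps):
    """Achieved bandwidth is capped by both the hardware peak and the
    concurrency limit; prefetching can only change the second one."""
    return min(peak_gbps, concurrency_limited_bandwidth(lines_in_flight, latency_ns))


# Hypothetical case: the window alone keeps 10 lines in flight,
# while prefetching raises that to 30. Peak and latency are unchanged.
demand_only = achieved_bandwidth(10, 80.0, 20.0)
with_prefetch = achieved_bandwidth(30, 80.0, 20.0)

print(demand_only, with_prefetch)
```

With these assumed values the demand-only case is limited by concurrency, and the prefetched case reaches the (unchanged) peak.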

I admire (and am a bit jealous of) people who have such knowledge and take the time and effort to communicate it, especially when they do so clearly. (Your posts here have also been insightful.)

I did notice some typos (emphasis added):

"and doens’t even do a great job" should be "and doesn't even do a great job",

"the loaded value in a register acros iterations" should be "the loaded value in a register across iterations",

"Has reached end of like but can still be downloaded" should be "Has reached end of life but can still be downloaded",

"each instruction evently among the ports it can use and doens’t look" should be "each instruction evenly among the ports it can use and doesn’t look",

"an rare case where memory source" should be "a rare case where memory source",

"Memory and Cache Bandwith" should be "Memory and Cache Bandwidth",

"your accesses go a mix of cache levels: you will probably" should be "your accesses go to a mix of cache levels, you will probably",

"if the calculate the speed limit based on the assumption the cache levels can be accessed" should be "if you calculated the speed limit based on the assumption the cache levels can be accessed" (you might also want to add "that" between "assumption" and "the cache"),

"allows use to do the multiplications" should be "allows us to do the multiplications",

"like important imporant values" should be "like important values",

"Certain patterns will result have worse" should be something like "As a result, certain patterns will have worse",

"faster than 4/cycles per element" should be "faster than 4 cycles per element",

"you wan to take advantaged of vectorized" should be "you want to take advantage of vectorized",

"with a vectorized stores" should be "with a vectorized store",

"but only unpsecified" should be "but only unspecified",

"use it to directly an upper bound" should be "use it directly to establish an upper bound",

"the will CPU mostly not be able" should be "the CPU will mostly not be able",

"want to move expesnive instructions" should be "want to move expensive instructions", and

"luck emaulting it" should be "luck emulating it".

In addition, here are three phrasing suggestions:

"attempts to solve for an ideal solution" might be slightly better as "attempts to find an ideal solution",

"These might look like important values. I even made a table, probably the only table in this whole post." might be a little better as "These might look like important values; I even made a table." (Technically, the image from Agner Fog's book was a table.), and

"change wrt well-compiled code" might be better as "change with respect to well-compiled code".