Two examples

By: Mark Roulo, August 5, 2022 2:20 pm
Room: Moderated Discussions
anonymous2 on August 5, 2022 2:05 pm wrote:
> Sorry, not quite.
> Why isn't smaller (-Os) in effect fast (like -O3) from an ISA design/implementation?
> Are the differences we see today the result of ISA/silicon/compiler evolution over time?
> Or is there a fundamental reason why they can't be _mostly_ the same thing?
> Smaller code fits more cache, it's rarely slower than unoptimized code (embedded anecdote).

Inlining small functions can speed up code but often at a cost of more total bytes of object code.

To illustrate: you can write a sort() function that takes a function pointer to a comparison routine and use the same sort() code for many types. This is what C's qsort() does. Or you can write a templatized sort routine that is instantiated for each *type* and skips the indirect call through the function pointer. The first is more compact, the second is faster.
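A minimal sketch of the two approaches, assuming the C routine in question is qsort() and the templatized one is C++'s std::sort. The names compare_ints, sort_with_qsort and sort_with_template are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Comparison routine reached through a function pointer. One copy of the
// sort code serves every element type, but each comparison is an indirect call.
static int compare_ints(const void* a, const void* b) {
    const int lhs = *static_cast<const int*>(a);
    const int rhs = *static_cast<const int*>(b);
    return (lhs > rhs) - (lhs < rhs);  // negative, zero, or positive
}

// Compact: reuses the single qsort() implementation in the C library.
std::vector<int> sort_with_qsort(std::vector<int> v) {
    if (!v.empty()) {
        std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
    }
    assert(std::is_sorted(v.begin(), v.end()));  // sanity check only
    return v;
}

// Usually faster: std::sort is a template, so the compiler emits a separate
// copy per element type and is free to inline the comparison into it.
std::vector<int> sort_with_template(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return v;
}
```

Both functions return the same sorted result. The difference is in the generated code: every additional element type sorted with std::sort typically adds another instantiation to the binary, while qsort() callers only add a small comparison routine.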

Loop unrolling often trades larger code for faster execution (because fewer branches are taken and fewer loop-condition *tests* are executed).
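Here is a hand-written sketch of what unrolling by four looks like. A compiler optimizing for speed may do something like this on its own; the function names sum_rolled and sum_unrolled4 are hypothetical and only there to show the shape of the transformation.

```cpp
#include <cassert>
#include <cstddef>

// Rolled: one loop test and one branch per element. Smallest code.
long sum_rolled(const int* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// Unrolled by four: one loop test and one branch per four elements,
// paid for with a larger loop body plus a cleanup loop for the remainder.
long sum_unrolled4(const int* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    long total = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }
    for (; i < n; ++i) {  // leftover 0 to 3 elements
        total += data[i];
    }
    return total;
}
```

The unrolled version does the same work per element but executes roughly a quarter of the loop tests, and it needs more instructions to say so.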

Aligning function starts and branch targets (to boundaries of a specific size) helps with performance, but the padding makes the code larger.
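To put rough numbers on the padding cost, here is a small sketch that lays functions out back to back and aligns each start. The function sizes in the usage note are invented for the example, and padding_for and laid_out_size are hypothetical helpers, not part of any toolchain. (GCC exposes this kind of alignment through options such as -falign-functions.)

```cpp
#include <cassert>
#include <cstddef>

// Padding bytes needed so that something placed at `offset` starts on the
// next multiple of `alignment`.
inline std::size_t padding_for(std::size_t offset, std::size_t alignment) {
    return (alignment - offset % alignment) % alignment;
}

// Total bytes occupied when functions of the given sizes are placed one
// after another, with every function start aligned to `alignment`.
std::size_t laid_out_size(const std::size_t* sizes, std::size_t count,
                          std::size_t alignment) {
    assert(alignment != 0);
    std::size_t offset = 0;
    for (std::size_t i = 0; i < count; ++i) {
        offset += padding_for(offset, alignment);  // pad up to the boundary
        offset += sizes[i];                        // then place the function
    }
    return offset;
}
```

With three made-up functions of 37, 10 and 50 bytes, laid_out_size gives 97 bytes with no alignment (alignment of 1) and 114 bytes with 16-byte alignment. The 17 extra bytes are pure padding.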

I'm sure there are other examples.
