Why not initialize all variables to zero?

By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), March 28, 2020 9:40 am
Room: Moderated Discussions
Foo_ (foo.delete@this.nomail.com) on March 28, 2020 2:34 am wrote:
>
> Right. But at least it probably makes bugs easier to detect and diagnose, since behaviour becomes deterministic
> (your uninitialized variable is always zeroed, rather than getting a runtime-dependent value).

Correct.

We've not taken that step yet for the kernel, but it's actually one of the more palatable "hardening" options you have when you care about security.

The "but initializing will hide bugs" is one of those noises that people make without any real thought. It's become a mindless talking point, rather than a real argument. It's BS, in other words.

The reason C doesn't initialize local variables by default has nothing to do with debugging, and anybody who tells you differently is either lying or woefully uneducated.

The reason C - and languages influenced by it - doesn't do it is because it can be expensive, particularly if you have a bad compiler (and remember: all compilers used to be bad by modern standards). It can also be expensive for certain common coding practices, like having medium-sized arrays (or large structures) on the stack.

But in most cases, modern compilers will see and warn about uninitialized local variables. The dangerous cases are the ones where the compiler can't see it because the use is too complicated (well, "complicated" may be the wrong word: the common case is that you pass off an uninitialized buffer to another function that is supposed to initialize it).
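A rough sketch of that hand-off case in C, assuming a hypothetical helper `read_name()` that is supposed to fill the caller's buffer but has an early-exit path that never writes to it. The names `read_name` and `name_length` are made up for illustration. The caller ignores the return value, which is exactly the kind of mistake that is hard to spot from the call site:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: supposed to initialize buf, but bails out early
 * when there is no source string, leaving buf untouched. */
static int read_name(char *buf, size_t len, const char *src)
{
    if (src == NULL || len == 0)
        return -1;                      /* buf is never written on this path */
    snprintf(buf, len, "%s", src);      /* always NUL-terminates when len > 0 */
    return 0;
}

/* The caller forgets to check the return value. Without the "= {0}",
 * the failure path would hand indeterminate bytes to strlen(), and the
 * compiler typically cannot warn because the (missing) write lives in
 * another function. */
size_t name_length(const char *src)
{
    char name[32] = {0};                /* zero fill keeps the failure path defined */
    (void)read_name(name, sizeof name, src);
    return strlen(name);
}
```

With the zero fill, `name_length(NULL)` is simply 0 rather than whatever happened to be on the stack.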

But those complex cases are also the ones that humans have most problems with, and that aren't really generally worth optimizing for, so zero-filling those cases automatically is probably a good idea, unless you have some odd code that doesn't care about security at all, and cares deeply about performance. And these days, that really should be seen as fairly unusual code.

That means that with a modern competent compiler, the "initialize local variables to zero" is most often a complete no-op - because the compiler obviously only needs to do so when it doesn't see the real initialization.
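As an illustration of why this is usually free, here is a small sketch with two made-up functions, `triple_plain` and `triple_zeroed`. The second one spells out by hand the zero store that an auto-initializing compiler would insert (Clang, and more recent GCC releases, offer this as `-ftrivial-auto-var-init=zero`). Since every path overwrites the zero before it is read, an optimizing compiler can be expected to drop the store as dead, though the exact output depends on compiler and flags:

```c
/* No initializer: the first write is the real value. */
int triple_plain(int x)
{
    int result;
    result = x * 3;
    return result;
}

/* Same function with the zero store written out explicitly.
 * The store is dead: it is overwritten before any read. */
int triple_zeroed(int x)
{
    int result = 0;
    result = x * 3;
    return result;
}
```

Comparing the generated assembly of the two at `-O2` is an easy way to check this on your own toolchain.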

So zeroing uninitialized local variables used to be expensive, but really isn't all that expensive any more, most of the time.

And the "but but debug" argument really is pure and utter garbage. Not clearing the stack variables just means that now you have unexpected behavior, and your debug builds may well have very different behavior from the ones you ship, or the ones you do regression testing on with extra code or with special flags. Those are the unexplained cases, where you as a developer can't see the bug (because maybe you have a compiler version or option that happens to just re-use a stack slot that was zero anyway - zero being one of the most common values in memory).

So if you care about debuggability, not initializing variables is just about the worst thing you can do. It's worse than zeroing them, but it's most definitely also worse than initializing them with some other random pattern.

Using a special pattern to initialize things is probably the best option if all you really care about is debugging.
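A minimal sketch of the pattern approach done by hand, assuming a made-up `struct request` and an arbitrarily chosen fill byte of 0xAA. The point is only that a recognizable value shows up the same way on every run, so a field nobody wrote to is easy to spot in a debugger or an assertion:

```c
#include <stdint.h>
#include <string.h>

#define POISON_BYTE 0xAA               /* arbitrary pattern chosen for this sketch */
#define POISON_WORD 0xAAAAAAAAu        /* the same byte repeated across 32 bits */

/* Hypothetical structure used only for illustration. */
struct request {
    uint32_t id;
    uint32_t flags;
};

/* Fill the whole object with the pattern before anyone initializes it. */
void request_poison(struct request *r)
{
    memset(r, POISON_BYTE, sizeof *r);
}

/* A field still holding the pattern was most likely never written. */
int field_is_poisoned(uint32_t value)
{
    return value == POISON_WORD;
}
```

Compilers that support automatic initialization tend to offer a pattern mode alongside the zero mode, which gets the same effect without the manual `memset`.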

At the same time, zero-initializing does have real advantages too. It can make the code simpler and more legible, simply because the programmer doesn't need to spell it out. That, in turn, can help avoid the bugs in the first place. But that only works when the language specifies the behavior, of course (like C does for static allocations, for example).
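The static case looks like this in practice. The names `call_count`, `last_tag`, `next_call` and `current_tag` are invented for the example; the guarantee itself comes from the C standard, which zero-initializes objects with static storage duration that have no explicit initializer:

```c
/* Static storage duration: guaranteed to start at zero, so nobody
 * has to write "= 0" or "= {0}" here. */
static int call_count;
static char last_tag[8];

/* Counts calls, relying on call_count starting at 0. */
int next_call(void)
{
    return ++call_count;
}

/* Returns the tag buffer, which starts out as an empty string. */
const char *current_tag(void)
{
    return last_tag;
}
```

The code is shorter and there is no initializer to get wrong, which is the legibility argument made above.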

Again, the argument that zeroing variables hides bugs is complete garbage. Leaving random contents makes debugging harder, makes writing code harder, and most definitely does not help debugging at all. It only helps debugging if you have the "real men should always initialize everything by hand, and anything else is a bug that you deserve" kind of mentality, and think that debugging random behavior makes your chest hairs grow.

And that mentality makes no sense.

So I do believe that not clearing local variables by default is a bug in the C standard, but I also happen to believe that it's one that makes lots of historical sense, and so it's one of those things that you just have to live with.

In a perfect world, I think you'd have clearing by default, and then the option to manually override it for specific cases if you find that it's a performance problem and you can say "Yes, I pinky promise that I will initialize this myself" by annotating the declaration (or the allocation in case of malloc and friends).
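Something close to that opt-out can be approximated today. The sketch below is only an illustration: the macro `NO_AUTO_INIT` and the functions are made up, and the macro expands to the `uninitialized` variable attribute only when the compiler reports support for it (the attribute matters only when automatic initialization is enabled in the first place). For heap memory, `calloc` plays the role of "clear by default" and `malloc` the role of "I promise to fill it myself":

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Expand to the opt-out attribute where available, else to nothing. */
#if defined(__has_attribute)
#  if __has_attribute(uninitialized)
#    define NO_AUTO_INIT __attribute__((uninitialized))
#  endif
#endif
#ifndef NO_AUTO_INIT
#  define NO_AUTO_INIT
#endif

/* Default path: the allocation comes back cleared. */
int *alloc_zeroed(size_t n)
{
    return calloc(n, sizeof(int));
}

/* Opt-out path: skip the clearing because every element is written here. */
int *alloc_filled(size_t n, int value)
{
    int *p;
    if (n > SIZE_MAX / sizeof *p)
        return NULL;                    /* avoid overflow in the size computation */
    p = malloc(n * sizeof *p);
    if (p != NULL)
        for (size_t i = 0; i < n; i++)
            p[i] = value;
    return p;
}

/* Stack opt-out: scratch is annotated, and only the bytes that memcpy
 * has written are ever read back. */
unsigned sum_bytes(const unsigned char *src, size_t len)
{
    unsigned char scratch[64] NO_AUTO_INIT;
    unsigned total = 0;
    if (len > sizeof scratch)
        len = sizeof scratch;
    memcpy(scratch, src, len);
    for (size_t i = 0; i < len; i++)
        total += scratch[i];
    return total;
}
```

The annotation sits on the one declaration where the cost was measured to matter, and everything else keeps the safe default.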

Linus