Non-deterministic Compilers?


Compilers are supposed to be deterministic, and they generally are. However, when a compiler has memory safety bugs (usually use of freed memory) and runs on an OS with ASLR (address space layout randomization) enabled, it can behave non-deterministically. Compile a file once and the result is correct; compile the exact same file again and now the executable crashes or generates the wrong answer. This may sound like a silly case that happens only in theory, but while hunting for compiler bugs I’ve seen it happen a number of times.

The scary scenario is this:

  1. Safety critical embedded code is compiled for testing, and happens to be compiled correctly.
  2. Testing proceeds and finds no problems.
  3. For whatever reason, the system is compiled again and this time the wrong code is generated.
  4. Wrong code is deployed.

Unlikely? Sure! But so are a lot of other things people developing safety critical systems have to worry about. If I were developing safety critical code I’d turn off ASLR for development tools: it’s cheap insurance.
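
As a rough sketch of what turning off ASLR for a single tool could look like on a Linux build machine: the util-linux `setarch` utility has an `-R` (`--addr-no-randomize`) option that disables address randomization for the command it launches. The helper names below (`no_aslr_command`, `compile_without_aslr`) and the example `gcc` invocation are made up for illustration; other operating systems need a different mechanism.

```python
import platform
import subprocess


def no_aslr_command(compiler_argv, arch=None):
    """Prefix a compiler command so that it runs with ASLR disabled.

    Assumes Linux with util-linux's `setarch` on the PATH.
    """
    # Older setarch versions require the architecture argument, so pass it explicitly.
    arch = arch or platform.machine()
    # -R asks setarch to disable address space randomization for the child process.
    return ["setarch", arch, "-R", *compiler_argv]


def compile_without_aslr(compiler_argv):
    """Run the compiler under setarch and return its exit status."""
    if not compiler_argv:
        raise ValueError("no compiler command given")
    return subprocess.run(no_aslr_command(compiler_argv)).returncode


# Hypothetical usage:
#   compile_without_aslr(["gcc", "-O2", "-c", "foo.c", "-o", "foo.o"])
```

This only changes how one process is launched. It does not fix the underlying compiler bug; it just makes the compiler's behavior on a given input more repeatable.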


7 responses to “Non-deterministic Compilers?”

  1. You’re definitely right in saying ASLR is a feature that should be turned off. It’s certainly a feature that is disallowed in spirit when you follow a safety-critical software development guideline such as DO-178B or IEC 61508 Part 3.

    In most safety-critical code, the address map is fixed at design time.

    A good example of that is IMA (Integrated Modular Avionics) OSes that follow ARINC-653. With that standard, you have a system blueprint which is basically a static configuration of all resources used in the system, including the memory map of all binaries. Everything must be pre-configured explicitly.

    The memory map is static and the linker scripts are configured for that address map at each binary linking. Code may be compiled position-independent, but it will always end up in the same place. This is required because of determinism issues at the system level:
    – If you know the address map, you know the cache lines and memory banks that will be used, thereby facilitating WCET and hardware failure-mode analysis.
    – When an error is logged, all addresses logged have a direct 1:1 mapping to a specific item, thereby allowing easier determination of error causes.
    – When debugging with probes, there is no problem related to different virtual address contexts. In many cases where an MMU is used, the virtual address is equal to the physical address, as the MMU is only used for memory protection, not address space extension.

  2. How many times would you need to compile it before a simple diff (excluding timestamps, etc.) would give you reasonable confidence your code isn’t tripping one of these? Are we talking different every time or 1 in 1000?

  3. Hi BCS- It depends on the bug. Most of the ones I’ve noticed have been not far from 50/50, but that’s what we’d expect since we never look for these bugs specifically, so the rare ones are going to slip past. It stands to reason that there are cases where ASLR only rarely exposes a problem, or only rarely hides it, due to quirks of the memory layout. I’m not sure how to look for these. I think turning off ASLR or (in the long run) implementing compilers in a memory-safe language is the right fix. GCC already uses garbage collection, so maybe this is not so far off.

  4. Hi Tennessee- Let’s be careful to distinguish between the build platform and the embedded platform. Definitely the memory map for an embedded platform wants to be fixed (unless it’s a cell phone or similar where ASLR might add value). But in this post I’m talking about ASLR on the build platform — the workstation running the development tools. I don’t know 178B and the related standards that well, but I don’t believe they would lead us to abandon ASLR on the build platform, right?

  5. John- Oh. I guess the scenario you outlined threw me off (i.e., it sounded like we should not use ASLR in embedded code, even though you mention the compiler’s OS in your first sentence). It did not click that the problem was that ASLR on the BUILD platform caused non-deterministic compiles! That is actually pretty scary.

  6. Are compilers required to be deterministic? Since there are many correct outputs, is the compiler free to choose one? Furthermore, may the compiler use genuine randomness, or is it required to use pseudorandomness with a user-specified seed?

  7. Hi David- As far as I know, there’s nothing in the C standard that requires a compiler to be deterministic. However, most compiler developers would want determinism as a quality of implementation issue, in the same way that they want to provide useful diagnostics to developers.
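
The repeated-build check asked about in the comments could be sketched roughly as follows: run the same build several times, hash the output, and count how many distinct results appear. This is only an illustration. The function names are invented, the build command is assumed to take the output path as its final argument, and the output is assumed to contain no timestamps or other deliberately varying data.

```python
import hashlib
import subprocess
import tempfile
from collections import Counter
from pathlib import Path


def output_digests(build_argv, output_name, runs=20):
    """Run the same build `runs` times and count the distinct output digests.

    Assumption: the build command accepts the output path as its last argument.
    """
    digests = Counter()
    with tempfile.TemporaryDirectory() as tmp:
        # Reuse one output path so embedded paths cannot differ between runs.
        out = Path(tmp) / output_name
        for _ in range(runs):
            out.unlink(missing_ok=True)
            subprocess.run([*build_argv, str(out)], check=True)
            digests[hashlib.sha256(out.read_bytes()).hexdigest()] += 1
    return digests


def is_reproducible(digests):
    """True when every run produced byte-identical output."""
    return len(digests) == 1


# Hypothetical usage:
#   counts = output_digests(["gcc", "-O2", "-c", "foo.c", "-o"], "foo.o")
#   print(is_reproducible(counts), counts)
```

If a bug changed the output on about half of the runs, the chance of 20 builds all agreeing would be about 2 in a million, so a check like this would very likely catch it. A bug that ASLR exposes only rarely could easily slip through, and identical outputs never prove that the compiler is free of such bugs.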