Proposal for a Friendly Dialect of C

[This post is jointly authored by Pascal Cuoq, Matthew Flatt, and John Regehr.]

In this post, we will assume that you are comfortable with the material in all three parts of John’s undefined behavior writeup and also with all three parts of Chris Lattner’s writeup about undefined behavior. Additionally, this paper is excellent background reading.

C compilers generate faster and smaller code by assuming that the compiled program will never execute an undefined behavior (UB). Each time a compiler exploits a new kind of UB or increases the optimizer’s reach in other ways, some code that previously worked will break. Thus, compiler upgrades cause consistent low-grade headaches for developers and maintainers of large bodies of C code. It’s not hard to find reasonable objections to this kind of thing as well as irate bug reports. The code that gets broken by the UB-aware compiler is incorrect according to the standard, but following all of the rules in the C standard in a large code base is brutally difficult and, practically speaking, few programmers are capable of it. For example, a sufficiently advanced compiler can break six of the nine well-worn C programs in SPEC CINT 2006 by only exploiting integer undefined behaviors. The problem is that the ostensible user base for C — people implementing low-level systems code — is not necessarily well served by creeping undefined-behavior exploitation. In short, modern C is not a friendly programming language.

When developers are not 100% certain that their code is free of undefined behaviors, one thing they do is add compiler-specific flags that disable certain UB-based optimizations. For example, PostgreSQL uses the -fwrapv option, which tells GCC to implement two’s complement wrapping behavior for signed integer overflows. For analogous reasons, the Linux kernel uses -fno-strict-aliasing and -fno-delete-null-pointer-checks. The problem with these sorts of flags is that they are compiler-specific, the flags don’t necessarily mean the same thing across compiler versions, the flags individually don’t provide much protection against UB exploitation, and developers must watch out for new kinds of breakage and new flags to add to configuration scripts.

As Chris Lattner says at the end of his third post on this topic, using various -f flags amounts to selecting a different dialect of C. Instead of having programmers learn, mix, match, and track various -f flags, we propose defining a friendly dialect of C that trades some optimization capability for ease of reasoning. This friendly dialect might be supported through a -std=friendly-c flag (if you’ll indulge the idea that the friendly dialect could be a standard) that merely implies a group of -f flags for a given version of GCC or LLVM. The flag would be otherwise orthogonal to code generation options, such as -O2. Our goal is to combine

  • minimal additional effort for compiler developers by — as much as possible — simply requiring that they provide behaviors that are already present or are at least easily available; with

  • minimal slowdown when compared to maximum UB-aware compiler optimizations by (1) not requiring any UBSan-like dynamic checks to be added and (2) disabling only optimizations that provide a bad value proposition in terms of performance vs. friendliness.

As a starting point, we imagine that friendly C is like the current C standard, but replacing many occurrences of “X has undefined behavior” with “X results in an unspecified value”. That adjustment alone can produce a much friendlier language. In other cases, we may be forced to refer to machine-specific details that are not features of the C abstract machine, and we are OK with that.

Here are some features we propose for friendly C:

  1. The value of a pointer to an object whose lifetime has ended remains the same as it was when the object was alive.
  2. Signed integer overflow results in two’s complement wrapping behavior at the bitwidth of the promoted type.
  3. Shift by negative or shift-past-bitwidth produces an unspecified result.
  4. Reading from an invalid pointer either traps or produces an unspecified value. In particular, all but the most arcane hardware platforms can produce a trap when dereferencing a null pointer, and the compiler should preserve this behavior.
  5. Division-related overflows either produce an unspecified result or else a machine-specific trap occurs.
  6. If possible, we want math- and memory-related traps to be treated as externally visible side-effects that must not be reordered with respect to other externally visible side-effects (much less be assumed to be impossible), but we recognize this may result in significant runtime overhead in some cases.
  7. The result of any signed left-shift is the same as if the left-hand shift argument was cast to unsigned, the shift performed, and the result cast back to the signed type.
  8. A read from uninitialized storage returns an unspecified value.
  9. It is permissible to compute out-of-bounds pointer values including performing pointer arithmetic on the null pointer. This works as if the pointers had been cast to uintptr_t. However, the translation from pointer math to integer math is not completely straightforward since incrementing a pointer by one is equivalent to incrementing the integer-typed variable by the size of the pointed-to type.
  10. The strict aliasing rules simply do not exist: the representations of integers, floating-point values and pointers can be accessed with different types.
  11. A data race results in unspecified behavior. Informally, we expect that the result of a data race is the same as in C99: threads are compiled independently and then data races have a result that is dictated by the details of the underlying scheduler and memory system. Sequentially consistent behavior may not be assumed when data races occur.
  12. memcpy() is implemented by memmove(). Additionally, both functions are no-ops when asked to copy zero bytes, regardless of the validity of their pointer arguments.
  13. The compiler is granted no additional optimization power when it is able to infer that a pointer is invalid. In other words, the compiler is obligated to assume that any pointer might be valid at any time, and to generate code accordingly. The compiler retains the ability to optimize away pointer dereferences that it can prove are redundant or otherwise useless.
  14. When a non-void function returns without returning a value, an unspecified result is returned to the caller.

In the interest of creating a readable blog post, we have kept this discussion informal and have not made any attempt to be comprehensive. If this proposal gains traction, we will work towards an implementable specification that addresses all 203 items listed in Annex J of the C11 standard. A friendly C++ could also be defined but that is a bigger job.

We are not trying to fix the deficiencies of the C language, nor are we making an argument for or against C. Rather, we are trying to rescue the predictable little language that we all know is hiding within the C standard. This language generates tight code and doesn’t make you feel like the compiler is your enemy. We want to decrease the rate of bit rot in existing C code and also to reduce the auditing overhead for safety-critical and security-critical C code. The intended audience for -std=friendly-c is people writing low-level systems such as operating systems, embedded systems, and programming language runtimes. These people typically have a good guess about what instructions the compiler will emit for each line of C code they write, and they simply do not want the compiler silently throwing out code. If they need code to be faster, they’ll change how it is written.

We appreciate feedback.

Atomic Accidents

Although I was six years old when the Three Mile Island accident happened, I clearly remember grownups talking about it and being worried: the house my family lived in was only about 60 miles away from the meltdown. In those days there was also plenty of free-floating nuclear angst due to the cold war; this would occasionally condense into something like The Day After or Edge of Darkness. The latter remains one of the best things ever to be shown on television; I re-watch it every couple of years (the 1985 version, not the 2010 one).

James Mahaffey’s Atomic Accidents covers not only Three Mile Island, Chernobyl, and Fukushima, but also pretty much everything else that has gone wrong when humans tried to exploit nuclear fission or fusion. It’s a fascinating book as well as being — perhaps oddly — quite funny, and I had trouble putting it down.

I was surprised to learn how many nuclear reactors have been destroyed on purpose, and I was also surprised to learn how many nuclear weapons were temporarily lost by the US military: something like 60 in total. That’s really scary. But perhaps the most chilling image painted in Atomic Accidents is the criticality accident where a small nuclear reactor is accidentally created, usually by someone working in a fuel processing facility. Imagine doing something innocuous like turning on a stirrer or pouring a liquid into a different container, seeing a bright blue flash, and realizing that you’re dead on your feet. This fascinating report contains a lot of details.

The accidents in large reactor facilities have some depressing common elements. First, the situation is inherently dangerous due to this large system that, under certain conditions, will get into a runaway positive feedback loop. Second, the thing can’t just be shut down to zero power: residual radioactive decay generates heat that has to be gotten rid of, necessitating extraordinarily complex cooling systems and backup power systems behind those. Third, visibility into the operating reactor is often poor: in one early accident, a reactor core had been on fire for several days before this was realized. Finally, humans, caught in between all of these factors, don’t seem to reliably do the right thing at the right instant.

A lot of pop science is written by people whose understanding of the issues seems to be shallow, but that is not the case here: Mahaffey is clearly a real expert on the subject matter. On the other hand, he is not unbiased. For example, on page XIX:

To keep the industry alive, thriving, and growing, it is imperative that the general population not feel threatened by it.

On page XXI:

The purpose of this book is not to convince you that nuclear power is unsafe beyond reason, or that it will lead to the destruction of civilization. On the contrary, I hope to demonstrate that nuclear power is even safer than transportation by steam and may be one of the key things that will allow life on Earth to keep progressing…

The best we can say is that it’s nice that he is up-front about this. Mahaffey’s slanted point of view caused me real stomach trouble only once: by page 33 he has twice asked the question: “Could we eventually evolve into a race that can withstand high levels of radiation?” What? For the human race to evolve in such a fashion, those who cannot withstand high levels of radiation must die — or be sterilized — before they can reproduce, repeatedly, over a period of hundreds or thousands of years. This is what might happen if the entire surface of the earth became dangerously radioactive. What was going on in Mahaffey’s mind that made this disturbing idea seem so appealing that he had to mention it more than once before the end of the first chapter?

Non-Transparent Memory Safety

[This paper contains more detail about the work described in this post.]

Instrumenting C/C++ programs to trap memory safety bugs is a popular and important research topic. In general, a memory safety solution has three goals:

  • efficiency,
  • transparency, and
  • compatibility.

Efficiency is obvious. Transparency means that we can turn on memory safety with a switch; we don’t have to change anything at the program level. Compatibility means that safe and unsafe code can freely interact, especially when linking against libraries. Compatibility is tricky because it severely limits the ways in which we can change the layout of memory objects, as we might hope to do in order to store the length of an array along with its data.

One of my favorite memory safety solutions for C — the Deputy project from Berkeley — is distinct from most other work in this space because it does not have transparency as a goal. While this initially seems like a bad idea, and it will obviously limit the amount of legacy code that we can run under Deputy, I eventually came to realize that non-transparency can be a good thing. The goal of this piece is to explain why.

When you write a C or C++ program, you usually intend it to be memory safe. And in fact, a large proportion of C/C++ code in the wild is memory safe, meaning that for all valid inputs it fails to access out-of-bounds or unallocated storage (or it might mean something else, but let’s not worry about that). The problem, of course, is that a small fraction of C/C++ code is not memory safe and some of these errors have serious consequences.

For the sake of argument, let’s say that you have written a piece of C code that is memory safe. With some effort you can do this for a small and perhaps for a medium-sized program. Now we might ask: Why is the program memory safe? Where does the memory safety live? Well, the memory safety resides in the logic of the program and perhaps also in the input domain. Unless we’ve used some sort of formal methods tool, the reasoning behind memory safety isn’t written down anywhere, so it’s impossible to verify.

Let’s take your memory safe C program and run it under a transparent memory safety solution like perhaps SoftBound + CETS. What we have now are two totally separate implementations of memory safety: one of them implicit and hard to get right, the other explicitly enforced by the compiler and runtime system.

Deputy is based on the premise that we don’t need two separate implementations of memory safety. Rather, Deputy is designed in such a way that the C programmer can tell the system just enough about her memory safety implementation that it can be checked. Let’s look at an example:

int lookup (int *array, int index) {
  return array[index];
}
If we don’t trust the developer to get memory safety right, we need to change the code to something like this:

int lookup (int *array, int index) {
  assert(index >= 0 && index < array.length);
  return array[index];
}
In the C programmer’s implementation of memory safety, the assertion is guaranteed not to fire by the surrounding program logic and by restrictions on the input domain. In a compatible memory safe C, the assertion must be statically or dynamically checked, meaning that we need to know how many int-typed variables are stored in the memory region starting at array. This is not so easy because C has no runtime representation for array lengths. The typical solution is to maintain some sort of fast lookup structure that maps pointers to lengths. A significant complication is that array might point into the middle of some other array. The code that actually executes would look something like this:

int lookup (int *array, int index) {
  check_read_ok(array + index, sizeof (int));
  return array[index];
}

Getting back to Deputy, the question is: How can the programmer communicate her memory safety argument to the system? It is done like this:

int lookup (int *COUNT(array.length) array, int index) {
  return array[index];
}
COUNT() is an annotation that tells Deputy what it needs to know in order to do a fast bounds check — no global lookup structure is necessary.

When I first saw the example above, I was not very impressed: it looks like Deputy is just being lazy and punting the problem back to me. But after using Deputy for a while, its genius became apparent. First, whenever I needed to tell Deputy something, the information was always available either in my head or in a convenient program variable. This is not a coincidence: if the information that Deputy requires is not available, then the code is probably not memory safe. Second, the annotations become incredibly useful documentation: they take memory safety information that is normally implicit and put it out in the open in a nice readable format. In contrast, a transparent memory safety solution is highly valuable at runtime but does not contribute to the understandability and maintainability of our code.

There are a number of other Deputy annotations, most notably NTS which is used to tell the system about a null-terminated string and NONNULL which of course indicates a non-null pointer. The Deputy Quick Reference shows the complete set of annotations and the Deputy Manual explains everything in more detail and has code examples. The Deputy paper focuses on more academic concerns and unfortunately contains only a single short example of Deputized C code.

Although the preceding example didn’t make this clear, applying Deputy to C code is pretty easy because the Deputy compiler uses type inference to figure out annotations within each function. Thus, many simple functions can be annotated at the prototype and the compiler takes care of the rest. In more involved situations, annotations are also necessary inside functions. The process for applying Deputy to legacy C code is to compile the code at which point Deputy says where annotations are missing. So you add them and repeat. It’s a nice process where you end up learning a lot about the code that you are annotating. In general, an incorrect annotation cannot lead to memory-unsafe behavior, but it can cause a memory safety violation to be incorrectly reported. (You can write truly unsafe code in Deputy using its UNSAFE annotation, but at least the unsafe code is obvious, as it is in Rust.) My guess is that people who enjoy using assertions would also enjoy Deputy; people who hate assertions may well have a different opinion.

Is Deputy perfect? Certainly not. Most seriously, it is only a partial memory safety solution and does not address use-after-free errors. Its memory safety guarantee does not hold if there are data races. One time I ran into a case where Deputy wouldn’t let me tell it the information that it needed to know; I believe it was when the size of an array was in a struct field. Finally, since it is based on CIL, Deputy supports C but not C++.

My group used Deputy as the basis for our Safe TinyOS project. TinyOS was a nice match for Deputy: the extremely lightweight runtime was suitable for embedded chips with 4 KB of RAM and the lack of use-after-free checking wasn’t a problem since TinyOS doesn’t have malloc/free. We found that in many cases it was sufficient to annotate the TinyOS interface files — which serve much the same role as C header files — and then Deputy didn’t need additional annotations. Here’s an example of an annotated interface:

/**
 * @param  'message_t* ONE msg'        the received packet
 * @param  'void* COUNT(len) payload'  a pointer to the packet's payload
 * @param  len                         the length of the data region pointed to by payload
 * @return 'message_t* ONE'            a packet buffer for the stack to use for the next
 *                                     received packet.
 */
event message_t* receive(message_t* msg, void* payload, uint8_t len);

There are minor differences from standard Deputy, such as ONE pointers (they “point to one object”) instead of SAFE NONNULL, and we put the annotations into the comments, so they automatically get added to the interface documentation, instead of putting them directly into the function prototypes. There were also some changes under the hood. We found that Deputy was generally a pleasure to use and it caught some nasty bugs in various TinyOS programs.

The current status is that Deputy has not been supported for some time, so it would not be a good choice for a new project. The Deputy ESOP paper has been well cited (114 times according to Google Scholar) but the basic idea of memory safe C/C++ via annotations and type inference has not caught on, which is kind of a shame since I thought it was a nice design point. On the other hand, even if an updated implementation were available, in 2014 I would perhaps not use Deputy for a new safe low-level project, but would give Rust a try instead, since it has a good story not only for out-of-bounds pointers but also use-after-free errors.

Reviewing Research Papers Efficiently

The conference system that we use in computer science guarantees that several times a year, each of us will need to review a lot of papers, sometimes more than 20, in a fairly short amount of time. In order to focus reviewing energy where it matters most, it helps to review efficiently. Here are some ideas on how to do that.

Significant efficiency can come from recognizing papers that are not deserving of a full review. A paper might fall into this category if it is:

  • way too long
  • obviously outside the scope of the conference
  • significantly incomplete, such as an experimental paper that lacks results
  • a duplicate of a paper submitted or published by the same or different authors
  • an aged resubmission of a many-times-rejected paper that, for example, has not been updated to reference any work done in the last 15 years

These papers can be marked as “reject” and the review then contains a brief, friendly explanation of the problem. If there is controversy about the paper it will be discussed, but the most common outcome is for each reviewer to independently reach the same conclusion, causing the paper to be dropped from consideration early. Certain program committee members actively bid on these papers in order to minimize their amount of reviewing work.

Every paper that passes the quick smoke test has to be read in its entirety. Or perhaps not… I usually skip the abstract of a paper while reviewing it (you would read the abstract when deciding whether or not to read the paper — but here that decision has already been made). Rather, I start out by reading the conclusion. This is helpful for a couple of reasons. First, the conclusion generally lacks the motivational part of the paper which can be superfluous when one is closely familiar with the research area. Second — and there’s no nice way to say this — I’ve found that authors are more truthful when writing conclusions than they are when writing introductions. Perhaps the problem is that the introduction is often written early on, in the hopeful phase of a research project. The conclusion, on the other hand, is generally written during the grim final days — or hours — of paper preparation when the machines have wound down to an idle and the graphs are all plotted. Also, I appreciate the tone of a conclusion, which usually includes some text like: “it has been shown that 41% of hoovulators can be subsumed by frambulators.” This gives us something specific to look for while reading the rest of the paper: evidence supporting that claim. In contrast, the introduction probably spends about a page waxing eloquent on the number of lives that are put at risk every day by the ad hoc and perhaps unsound nature of the hoovulator.

Alas, other than the abstract trick, there aren’t really any good shortcuts during the “reading the paper” phase of reviewing a paper. The next place to save time is on writing the review. The first way to do this is to keep good notes while reading, either in ink on the paper or in a text file. Generally, each such comment will turn into a sentence or two in the final review. Therefore, once you finish reading the paper, your main jobs are (1) to make up your mind about the recommendation and (2) to massage the notes into a legible and useful form. The second way to save time is to decide what kind of review you are writing. If the paper is strong then your review is a persuasive essay with the goal of getting the rest of the committee to accept it. In this case it is also useful to give detailed comments on the presentation: which graphs need to be tweaked, which sentences are awkward, etc. If the paper needs to be rejected, then the purpose of the review is to convince the committee of this and also to help the authors understand where they took a wrong turn. In this case, detailed feedback about the presentation is probably not that useful. Alternatively, many papers at top conferences seem to be a bit borderline, and in this case the job of the reviewer is to provide as much actionable advice as possible to the authors about how to improve the work — this will be useful regardless of whether the paper is accepted or rejected.

I hope it is clear that I am not trying to help reviewers spend less total time reviewing. Rather, by adopting efficient reviewing practices, we can spend our time where it matters most. My observation is that the amount of time that computer scientists spend writing paper reviews varies tremendously. Some people spend almost no time at all whereas others produce reviews that resemble novellas. The amazing people who produce these reviews should embarrass all of us into doing a better job.

Update: Also see Shriram’s excellent notes about reviewing papers.