Happy Canyon

I’ve been doing a poor job of taking pictures in Europe. On the other hand, I’ve had a trip report on the back burner since last spring, so let’s look at a few pictures from that.

Happy Canyon, in a remote part of southeast Utah, has a scenic and non-technical narrow section that would be famous if it were easier to get to. There are about five ways to get there, but each has a catch: a very long hike including a rappel, a multi-day hike with poor access to water, a backcountry airplane landing, a float trip on an intermittent river, or a difficult drive. The last option was the only one that made sense for us.

We left Hanksville UT before sunrise and had about a 20-minute drive on pavement before turning off at Poison Spring Canyon where the track follows the bottom of the canyon in and out of the waterway, through mud and sand and pools of water. This canyon is frequently impassable, but it had been bladed since the last flash flood and was mostly lots of fun, with only a few sections of real 4WD. It took us about 40 minutes to drive 11 miles to where the Black Jump road turns off (#1 on the map below). This next road follows a bench between cliffs; it was put in during the 1950s for uranium exploration and, as far as I know, hasn’t been maintained since then. This track had caused me a lot of stress during trip planning and indeed it was a bit exciting: it is partially blocked by rocks, goes right next to cliff edges, has sinkholes in the clay that could eat a wheel, and has some sections of real high-clearance 4WD. It took us about an hour to drive five miles to where the track is finally blocked for good by a bus-sized rock that fell from the cliffs above (#3).

happymap (Map credit: USGS with annotations by rockgremlin.)

So there we are — an 8 year old, a ten year old, and me — parked on a ledge halfway down the 1400-foot deep Dirty Devil River gorge, probably 10 miles from the nearest human being. We continued along the deteriorating mining road on foot; there’s a lot of petrified wood including some entire logs, which are really fun to see. After a while (well past #4 — the folks who made that map dropped down to the river too early) there’s a nice break in the cliffs and we picked our way down to the river, which was flowing in the 80-90 cfs range. We all took off our shoes; the younger boy crossed holding my hand and the older one crossed on his own. The mud was nasty and there was a bit of quicksand, but nothing too hard to avoid. At this point we were at the mouth of Happy Canyon (#5) and we had lunch on the river bank.

005

010

Happy Canyon rapidly narrows down and remains narrow for most of a mile, and while it isn’t actually a slot canyon (where you can consistently touch both walls) it is deep and convoluted.

025

028

041

062

086

118

We could have stayed in the narrows for hours, but we had a long (and warm, even in March) hike out and I didn’t want to drive the Black Jump road in the dark. We cooked dinner at the junction with the main Poison Spring road, and then we made it back to Hanksville by dusk.

130

158

173

The next day was less eventful: we visited a little-visited mesa top and found a place where wind or floods had created a perfect little beach along the Fremont River.

203

215

243

On the final day of this quick trip I wanted to visit yet another out-of-the way spot. The boys endured a breakfast of beef jerky and gatorade, a routefinding debacle, an extremely muddy river crossing, and a longish and not-inspiring hike. As a reward, we got to spend an hour or two on the moon before heading home.

268

275

306

Overall this was a successful trip, though we did run into one person while hiking.

SQLite with a Fine-Toothed Comb

One of the things I’ve been doing at Trust-in-Soft is looking for defects in open-source software. The program I’ve spent the most time with is SQLite: an extremely widely deployed lightweight database. At ~113 KSLOC of pointer-intensive code, SQLite is too big for easy static verification. On the other hand, it has an incredibly impressive test suite that has already been used to drive dynamic tools such as Valgrind, ASan, and UBSan. I’ve been using these tests with tis-interpreter.

Here I try to depict the various categories of undefined behaviors (UBs) where this post is mainly about the blue-shaded area:

This figure leaves out many UBs (there are a few hundred in all) and is not to scale.

For honesty’s sake I should add that using tis-interpreter isn’t painless (it wasn’t designed as a from-scratch C interpreter, but rather is an adaptation of a sound formal methods tool). It is slower even than Valgrind. It has trouble with separately compiled libraries and with code that interacts with the system, such as mmap() calls. As I tried to show in the figure above, when run on code that is clean with respect to ASan and UBSan, it tends to find fiddly problems that a lot of people aren’t going to want to hear about. In any case, one of the reasons that I’m pushing SQLite through tis-interpreter is to help us find and fix the pain points.

Next let’s look at some bugs. I’ve been reporting issues as I go, and a number of things that I’ve reported have already been fixed in SQLite. I’ll also discuss how these bugs relate to the idea of a Friendly C dialect.

Values of Dangling Pointers

SQLite likes to use — but not dereference — pointers to heap blocks that have been freed. It did this at quite a few locations. For example, at a time when zHdr is dangling:

if( (zHdr>=zEndHdr && (zHdr>zEndHdr 
  || offset64!=pC->payloadSize))
 || (offset64 > pC->payloadSize)
){
  rc = SQLITE_CORRUPT_BKPT;
  goto abort_due_to_error;
}

These uses are undefined behavior, but are they compiled harmfully by current C compilers? I have no evidence that they are, but in other situations the compiler will take advantage of the fact that a pointer is dangling; see this post and also Winner #2 here. You play with dangling pointers at your own risk. Valgrind and ASan make no attempt to catch these uses, as far as I know.

Using the value of a dangling pointer is a nettlesome UB, causing inconvenience while — as far as I can tell — giving the optimizer almost no added power in realistic situations. Eliminating it is a no-brainer for Friendly C.

Uses of Uninitialized Storage

I found several reads of uninitialized storage. This is somewhere between unspecified and undefined behavior.

One idiom was something like this:

int dummy;
some sort of loop {
  ...
  // we don't care about function()'s return value
  // (but its other callers might)
  dummy += function();
  ...
}
// dummy is not used again

Here the intent is to avoid a compiler warning about an ignored return value. Of course a better alternative is to initialize dummy; the compiler can still optimize away the unwanted bookkeeping if function() is inlined or otherwise specialized.

At least one uninitialized read that we found was potentially harmful, though we couldn’t make it behave unpredictably. Also, it had not been found by Valgrind. Both of these facts — the predictability and the lack of an alarm — might be explained by the compiler reusing a stack slot that had previously contained a different local variable. Of course we must not count on such coincidences working out well.

A Friendly C could ignore reads of uninitialized storage based on the idea that tool support for detecting this class of error is good enough. This is the solution I would advocate. A heavier-handed alternative would be compiler-enforced zeroing of heap blocks and automatic variables.

Out-of-Bounds Pointers

In C you are not allowed to compute — much less use or dereference — a pointer that isn’t inside an object or one element past its end.

SQLite’s vdbe struct has a member called aMem that uses 1-based array indexing. To avoid wasting an element, this array is initialized like this:

p->aMem = allocSpace(...);
p->aMem--;

I’ve elided a bunch of code, the full version is in sqlite3VdbeMakeReady() in this file. The real situation is more complicated since allocSpace() isn’t just a wrapper for malloc(): UB only happens when aMem points to the beginning of a block returned by malloc(). This could be fixed by avoiding the 1-based addressing, by allocating a zero element that would never be used, or by reordering the struct fields.

Other out-of-bounds pointers in SQLite were computed when pointers into character arrays went past the end. This particular class of UB is commonly seen even in hardened C code. It is probably only a problem when either the pointer goes far out of bounds or else when the object in question is allocated near the end of the address space. In these cases, the OOB pointer can wrap, causing a bounds check to fail to trigger, potentially causing a security problem. The full situation involves an undesirable UB-based optimization and is pretty interesting. A good solution, as the LWN article suggests, is to move the bounds checks into the integer domain. The Friendly C solution would be to legitimize creation of, and comparison between, OOB pointers. Alternately, we could beef up the sanitizers to complain about these things.

Illegal Arguments

It is UB to call memset(), memcpy(), and other library functions with invalid or null pointer arguments. GCC actively exploits this UB to optimize code. SQLite had a place where it called memset() with an invalid pointer and another calling memcpy() with a null pointer. In both cases the length argument was zero, so the calls were otherwise harmless. A Friendly C dialect would only require each pointer to refer to as much storage as the length argument implies is necessary, including none at all.

Comparisons of Pointers to Unrelated Objects

When the relational operators >, >=, <, <= are used to compare two pointers, the program has undefined behavior unless both pointers refer to the same object, or to the location one past its end. SQLite, like many other programs, wants to do these kinds of comparisons. The solution is to reduce undefined behavior to implementation defined behavior by casting to uintptr_t and comparing the resulting integers. For example, SQLite now has this method for checking if a pointer (that might be to another object) lies within a memory range:

# define SQLITE_WITHIN(P,S,E) \
    ((uintptr_t)(P)>=(uintptr_t)(S) && \
     (uintptr_t)(P)<(uintptr_t)(E))

Uses of this macro can be found in btree.c.

With respect to pointer comparisons, tis-interpreter’s (and Frama-C’s) intentions are stronger than just detecting undefined behavior: we want to ensure that execution is deterministic. Comparisons between unrelated objects destroy determinism because the allocator makes no guarantees about their relative locations. On the other hand, if the pair of comparisons in a SQLITE_WITHIN call is treated atomically, and if S and E point into the same object, there is no determinism violation. We added a “within” builtin to Frama-C that can be used without violating determinism and also a bare pointer comparison that — if used — breaks the guarantee that Frama-C’s results hold for all possible allocation orders. Sound analysis of programs that depend on the relative locations of different allocated objects is a research problem and something like a model checker would be required.

In Friendly C, comparisons of pointers to unrelated objects would act as if they had already been cast to uintptr_t. I don’t think this ties the hands of the optimizer at all.

Summary

SQLite is a carefully engineered and thoroughly tested piece of software. Even so, it contains undefined behaviors because, until recently, no good checker for these behaviors existed. If anything is going to save us from UB hell, it’s tools combined with developers who care to listen to them. Richard Hipp is a programming hero who always responds quickly and has been receptive to the fiddly kinds of problems I’m talking about here.

What to Do Now

Although the situation is much, much better than it was five years ago, C and C++ developers will not be on solid ground until every undefined behavior falls into one of these two categories:

  • Erroneous UBs for which reliable and ubiquitous (but perhaps optional and inefficient) detectors exist. These can work either at compile time or run time.
  • Benign behaviors for which compiler developers have provided a documented semantics.

The Strict Aliasing Situation is Pretty Bad

I’ll start with a quick review of the strict aliasing rules in C and C++ and then present some less well-known material.

Strict Aliasing

Compiler optimizations are often shot down by the potential for pointers to be aliases. For example, although we might naively expect a compiler to optimize this function to return zero, that cannot happen because x and y might refer to the same location:

int foo(int *x, int *y) {
  *x = 0;
  *y = 1;
  return *x;
}

Generated code typically looks like this:

foo:    movl    $0, (%rdi)
        movl    $1, (%rsi)
        movl    (%rdi), %eax
        ret

Failure to optimize is particularly frustrating when we know for sure that x and y are not aliases. The “restrict” keyword in C was introduced to help solve this sort of problem but we’re not going to talk about that today. Rather, we’re going to talk about an orthogonal solution, the aliasing rules in C and C++ that permit the compiler to assume that an object will not be aliased by a pointer to a different type. Often this is called “strict aliasing” although that term does not appear in the standards. Consider, for example, this variant of the program above:

int foo(int *x, long *y) {
  *x = 0;
  *y = 1;
  return *x;
}

Since a pointer-to-int and a pointer-to-long may be assumed to not alias each other, the function can be compiled to return zero:

foo2:   movl    $0, (%rdi)
        xorl    %eax, %eax
        movq    $1, (%rsi)
        ret

As we see here, the aliasing rules give C and C++ compilers leverage that they can use to generate better code. On the other hand, since C is a low-level language and C++ can be used as a low-level language, both permit casting between pointer types, which can end up creating aliases that violate the compiler’s assumptions. For example, we might naively write code like this to access the representation of a floating point value:

unsigned long bogus_conversion(double d) {
  unsigned long *lp = (unsigned long *)&d;
  return *lp;
}

This function is undefined under the aliasing rules and while it happens to be compiled into the same code that would be emitted without the strict aliasing rules, it is easy to write incorrect code that looks like it is getting broken by the optimizer:

#include <stdio.h>

long foo(int *x, long *y) {
  *x = 0;
  *y = 1;
  return *x;
}

int main(void) {
  long l;
  printf("%ld\n", foo((int *)&l, &l));
}

$ gcc-5 strict.c ; ./a.out
1
$ gcc-5 -O2 strict.c ; ./a.out
0
$ clang strict.c ; ./a.out
1
$ clang -O2 strict.c ; ./a.out
0

An exception to the strict aliasing rules is made for pointers to character types, so it is always OK to inspect an object’s representation via an array of chars. This is necessary to make memcpy-like functions work properly.

So far, this is very well known. Now let’s look at a few consequences of strict aliasing that are perhaps not as widely known.

Physical Subtyping is Broken

An old paper that I like uses the term “physical subtyping” to refer to the struct-based implementation of inheritance in C. Searching for “object oriented C” returns quite a few links. Additionally, many large C systems (the Linux kernel for example) implement OO-like idioms. Any time this kind of code casts between pointer types and dereferences the resulting pointers, it violates the aliasing rules. Many aliasing rule violations can be found in this book about object oriented C. Some build systems, such as Linux’s, invoke GCC with its -fno-strict-aliasing flag to avoid problems.

Update based on some comments from Josh Haberman and Sam Tobin-Hochstadt: It looks like the specific case where the struct representing the derived type includes its parent as its first member should not trigger UB. The language in this part of the standard is very hard to parse out.

This program from the Cerberus project illustrates the problem with changing the type of a pointer to struct:

#include <stdio.h>

typedef struct { int i1; } s1;
typedef struct { int i2; } s2;

void f(s1 *s1p, s2 *s2p) {
  s1p->i1 = 2;
  s2p->i2 = 3;
  printf("%i\n", s1p->i1);
}

int main() {
  s1 s = {.i1 = 1};
  f(&s, (s2 *)&s);
}

$ gcc-5 effective.c ; ./a.out
3
$ gcc-5 -O2 effective.c ; ./a.out
2
$ clang-3.8 effective.c ; ./a.out
3
$ clang-3.8 -O2 effective.c ; ./a.out
3

Chunking Optimizations Are Broken

Code that processes bytes one at a time tends to be slow. While an optimizing compiler can sometimes make a naive character-processing loop much faster, in practice we often need to help the compiler out by explicitly processing word-sized chunks of bytes at a time. Since the data reinterpretation is generally done by casting to a non-character-typed pointer, the resulting accesses are undefined. Search the web for “fast memcpy” or “fast memset”: many of the hits will return erroneous code. Example 1, example 2, example 3. Although I have no evidence that it is being miscompiled, OpenSSL’s AES implementation uses chunking and is undefined.

One way to get chunking optimizations without UB is to use GCC’s may_alias attribute, as seen here in Musl. This isn’t supported even by Clang, as far as I know.

Offset Overlap is Bad

Here is a devilish little program by Richard Biener and Robbert Krebbers that I found via the Cerberus report:

#include <stdio.h>
#include <stdlib.h>

struct X {
  int i;
  int j;
};

int foo(struct X *p, struct X *q) {
  q->j = 1;
  p->i = 0;
  return q->j;
}

int main() {
  unsigned char *p = malloc(3 * sizeof(int));
  printf("%i\n", foo((struct X *)(p + sizeof(int)),
                     (struct X *)p));
}

It is ill-formed according to LLVM and GCC:

$ clang-3.8 krebbers.c ; ./a.out
0
$ clang-3.8 -O2 krebbers.c ; ./a.out
1
$ gcc-5 krebbers.c ; ./a.out
0
$ gcc-5 -O2 krebbers.c ; ./a.out
1

int8_t and uint8_t Are Not Necessarily Character Types

This bug (and some linked discussions) indicate that compiler developers don’t necessarily consider int8_t and uint8_t to be character types for aliasing purposes. Wholesale replacement of character types with standard integer types — as advocated here, for example — would almost certainly lead to interesting strict aliasing violations when the resulting code was run through a compiler that doesn’t think int8_t and uint8_t are character types. Happily, no compiler has done this yet (that I know of).

Summary

A lot of C code is broken under strict aliasing. Separate compilation is probably what protects us from broader compiler exploitation of the brokenness, but it is a very poor kind of protection. Static and dynamic checking tools are needed. If I were writing correctness-oriented C that relied on these casts I wouldn’t even consider building it without -fno-strict-aliasing.

Pascal Cuoq provided feedback on a draft of this piece.

A Variable Argument Hazard

Variable argument functions in C and C++ can be tricky to use correctly, and they typically only get compiler-based type checking for special cases such as printf(). Recently I ran across an entertaining variable argument bug using tis-interpreter.

Although I will use code from SQLite to illustrate this bug, this bug has not been found in SQLite itself. Rather, it was in some code used to test that database.

SQLite’s API has a configuration function that takes variable arguments:

int sqlite3_config(int op, ...);

Several of its configuration commands, such as this one, expect pointer-typed arguments:

case SQLITE_CONFIG_LOG: {
  typedef void(*LOGFUNC_t)(void*,int,const char*);
  sqlite3GlobalConfig.xLog = va_arg(ap, LOGFUNC_t);
  sqlite3GlobalConfig.pLogArg = va_arg(ap, void*);
  break;
}

We’re only allowed to execute this code when the first two variable arguments are respectively compatible with LOGFUNC_t and void *, otherwise the behavior is undefined.

The test code contains calls that look like this:

sqlite3_config(SQLITE_CONFIG_LOG, 0, pLog); 

At first glance there’s no problem — normally it is fine to pass a 0 to a function that expects a pointer argument: the 0 is turned into an appropriately-typed null pointer. On the other hand, when a function takes variable arguments (or when it lacks a prototype) the compiler cannot do this kind of caller-side type conversion and instead a different set of rules, the “default argument promotions” are applied, resulting in a zero-valued int being passed to sqlite3_config() in the first variable argument position. Another fun aspect of the default argument promotions is that they force all float arguments to be promoted to double. These rules seem to be the same in C++ though they would fire less often due to C++’s stricter rules about function prototypes.

This problem was not noticed because the code happens to be compiled benignly on both x86 and x64 platforms. On x64 Linux, pointers and ints are not the same size, but the calling convention says that the first six integer and pointer arguments are passed in 64-bit registers — so the problem gets covered up by an implicit promotion to 64 bits. The ABI does not guarantee that the higher bits are zeroed in this case, but in this example they happen to be cleared by GCC and LLVM.

On x86 Linux, all arguments are passed in memory but since pointers and integers are the same size, the problem is again covered up. An ABI that represents a null pointer differently than a zero-valued long integer would expose this bug right away, as would an x64 program that passes at least six variable arguments. In any case, the fix is easy:

sqlite3_config(SQLITE_CONFIG_LOG, (LOGFUNC_t)0, pLog); 

In summary, be careful with the default argument promotions that are done when calling variable argument functions.

Thanks to Pascal Cuoq who provided comments on this piece.

Also see this article by Rich Felker and this one by Jens Gustedt.

Python Exercises for Kids

For the last year or so I’ve been giving Python exercises to my 11 year old. I thought I’d share some of them. If any of you have been doing similar things, I’d love to hear what worked for you. I think it is helpful that I’m not much of a Python programmer, this forces him to read the documentation. The other day he said “Wow– Stack Overflow is great!”

Fibonacci

Print the Fibonacci series, useful for teaching basics of looping.

Number Guessing Game

The user thinks of a number between 1 and 100. The computer tries to guess it based on feedback about whether the previous guess was too high or low. This one was great for learning about the kinds of off-by-one errors that one customarily runs into while implementing a binary search.

Binary Printer

Print the non-negative binary integers in increasing order. This one forces some representation choices.

Palindrome Recognizer

Recognizing “A man, a plan, a canal — Panama!” as a palindrome requires some string manipulation.

Door Code Recognizer

Our Paris apartment building will let you in if you enter the correct five-digit sequence regardless of how many incorrect digits you previously entered. I thought replicating this behavior in Python would be slightly tricky for the kid but he saw that a FIFO could easily be created from a list.

Monte Carlo Pi Estimator

If you generate random points in the square between -1,-1 and 1,1, the fraction of points that lie inside the unit circle will approach pi/4. The kid got a kick out of seeing this happen. Of course I had to explain the derivation of pi/4 and the formula for a circle since he hasn’t seen this stuff in school yet. Also of course we could have simply worked in quadrant one but that seemed to make the explanations more complicated.

I should perhaps add that I explained this exercise to my kid very differently than I’m explaining it here. We started by drawing a box containing a circle and then pretended to throw darts into the box. The concepts are not hard: all we need to understand is random sampling and the area of a circle. Math gets made to seem a lot harder than it is, in many cases, and also we tend to conflate the difficulty with the order in which the concepts are presented in the school system.

Turtle Graphics Interpreter

The current project is implementing a turtle that takes four commands:

  • forward n
  • right-turn n
  • left-turn n
  • color c

So far we’ve been restricting angles to 0, 90, 180, 270 (and no radians yet…) but I don’t think sine and cosine will be hard concepts to explain in this context. Also, so far the turtle commands are just a series of function calls, the parser will come later. I wonder what would be the simplest loop construct for this turtle language? Perhaps “repeat n” followed by some sort of block construct.