Embedded in Academia

Category: Compilers

Taming Undefined Behavior in LLVM

Earlier I wrote that Undefined Behavior != Unsafe Programming, a piece intended to convince you that there’s nothing inherently wrong with undefined behavior as long as it isn’t in developer-facing parts of the system. Today I want to talk about a new paper about undefined behavior in LLVM that’s going to be presented in June…

April 14, 2017
Do Expressive Programming Languages Always Have Undefined Behavior?

In the Hacker News comments on one of my previous posts about undefined behavior, someone said this: AFAIK Gödel’s incompleteness theorems imply that _any_ language will have at least some undefined behaviour. Let’s take a quick look at this statement, keeping in mind that incompleteness and undecidability can be remarkably tricky topics. Some years ago…

March 6, 2017
Undefined Behavior != Unsafe Programming

Undefined behavior (UB) in C and C++ is a clear and present danger to developers, especially when they are writing code that will execute near a trust boundary. A less well-known kind of undefined behavior exists in the intermediate representation (IR) for most optimizing, ahead-of-time compilers. For example, LLVM IR has undef and poison in…

February 14, 2017
Detecting Strict Aliasing Violations in the Wild

Type-based alias analysis, where pointers to different types are assumed to point to distinct objects, gives compilers a simple and effective way to disambiguate memory references in order to generate better code. Unfortunately, C and C++ make it easy for programmers to violate the assumptions upon which type-based alias analysis is built. “Strict aliasing” refers…

February 1, 2017
Testing LLVM

[This piece is loosely a followup to this one.] Background Once a piece of software reaches a certain size, it is guaranteed to be loosely specified and not completely understood by any individual. It gets committed to many times per day by people who are only loosely aware of each others’ work. It has many…

January 13, 2017
A Tourist’s Guide to the LLVM Source Code

In my Advanced Compilers course last fall we spent some time poking around in the LLVM source tree. A million lines of C++ is pretty daunting but I found this to be an interesting exercise and at least some of the students agreed, so I thought I’d try to write up something similar. We’ll be…

January 5, 2017
Principles for Undefined Behavior in Programming Language Design

I’ve had a post with this title on the back burner for years but I was never quite convinced that it would say anything I haven’t said before. Last night I watched Chandler Carruth’s talk about undefined behavior at CppCon 2016 and it is good material and he says it better than I think I…

October 10, 2016
Advanced Compilers Weeks 3-5

This continues a previous post. We went through the lattice theory and introduction to dataflow analysis parts of SPA. I consider this extremely good and important material, but I’m afraid that the students looked pretty bored. It may be the case that this material is best approached by first looking at practical aspects and only…

September 28, 2016
Advanced Compilers Weeks 1 and 2

This post will be of somewhat narrow interest; it’s a quick attempt to take my lecture notes for the first weeks of an advanced compilers course and turn them into something a bit more readable. I’m not using slides for this class. Motivation The great thing about an advanced course (on any topic) is that…

September 5, 2016
Solutions to Integer Overflow

Humans are typically not very good at reasoning about integers with limited range, whereas computers fundamentally work with limited-range numbers. This impedance mismatch has been the source of a lot of bugs over the last 50 years. The solution comes in multiple parts. In most programming languages, the default integer type should be a bignum:…

September 2, 2016