Embedded in Academia

Category: Software Correctness

Formal-Methods-Based Bugfinding for LLVM’s AArch64 Backend

[This piece is co-authored by Ryan Berger and Stefan Mada (both Utah CS undergrads), by Nader Boushehri, and by John Regehr.] An optimizing compiler traditionally has three main parts: a frontend that translates a source language into an intermediate representation (IR), a “middle end” that rewrites IR into better IR, and then a backend that…

June 6, 2022
Responsible and Effective Bugfinding

NB: This piece is not about responsible disclosure of security issues. For almost as long as people have written code, we have also worked to create methods for finding software defects. Much more recently, it has become common to treat “external bug finding” — looking for defects in other people’s software — as an activity…

August 17, 2020
Alive2 Part 3: Things You Can and Can’t Do with Undef in LLVM

[Also see Part 1 and Part 2 in this series.] Let’s talk about these functions: unsigned add(unsigned x) { return x + x; } unsigned shift(unsigned x) { return x << 1; } From the point of view of the C and C++ abstract machines, their behavior is equivalent: in a program you’re writing, you…

July 31, 2020
The Saturation Effect in Fuzzing

This piece is co-authored by Alex Groce and John Regehr. Here’s something we’ve seen happen many times: We apply a fuzzer to some non-trivial system under test (SUT), and initially it finds a lot of bugs. As these bugs are fixed, the SUT sort of becomes immune to this fuzzer: the number of new bugs…

June 17, 2020
Alive2 Part 2: Tracking miscompilations in LLVM using its own unit tests

[This piece is co-authored by Nuno P. Lopes and John Regehr.] Alive2 is a formal verification framework for LLVM optimizations. It includes multiple tools, including a plugin for `opt’ to verify whether the optimizations just run are correct or not. We gave an introduction to Alive2 in a previous post. A few years ago we…

June 3, 2020
Alive2 Part 1: Introduction

[This piece is co-authored by Nuno P. Lopes and John Regehr.] Compiler bugs threaten the correctness of almost any computer system that uses compiled code. Translation validation is a path towards reliably correct compilation that works by checking that an individual execution of the compiler did the right thing. We created a tool, Alive2, that…

March 12, 2020
Testing Dataflow Analyses for Precision and Soundness

[This piece is co-authored by Jubi Taneja, Zhengyang Liu, and John Regehr; it’s a summary of some of the findings from a paper that we just recently completed the camera ready copy for, that is going to be published at CGO (Code Generation and Optimization) 2020.] Update from Jan 12 2020: Looks like there’s a…

January 10, 2020
Helping Generative Fuzzers Avoid Looking Only Where the Light is Good, Part 1

Let’s take a second to recall this old joke: A policeman sees a drunk man searching for something under a streetlight and asks what the drunk has lost. He says he lost his keys and they both look under the streetlight together. After a few minutes the policeman asks if he is sure he lost…

November 4, 2019
Write Fuzzable Code

Fuzzing is sort of a superpower for locating vulnerabilities and other software defects, but it is often used to find problems baked deeply into already-deployed code. Fuzzing should be done earlier, and moreover developers should spend some effort making their code more amenable to being fuzzed. This post is a non-comprehensive, non-orthogonal list of ways…

August 19, 2019
Verifying Popcount

Popcount is the function that returns the number of set bits in its argument. Showing that a popcount implementation does what it claims to do has become one of my favorite examples to use when I need to quickly show students how we can reason about programs mathematically. Something like a selection sort is probably…

April 24, 2019