Embedded in Academia

Category: Computer Science

Formal-Methods-Based Bugfinding for LLVM’s AArch64 Backend

[This piece is co-authored by Ryan Berger and Stefan Mada (both Utah CS undergrads), by Nader Boushehri, and by John Regehr.] An optimizing compiler traditionally has three main parts: a frontend that translates a source language into an intermediate representation (IR), a “middle end” that rewrites IR into better IR, and then a backend that…

June 6, 2022
Responsible and Effective Bugfinding

NB: This piece is not about responsible disclosure of security issues. For almost as long as people have written code, we have also worked to create methods for finding software defects. Much more recently, it has become common to treat “external bug finding” — looking for defects in other people’s software — as an activity…

August 17, 2020
Alive2 Part 3: Things You Can and Can’t Do with Undef in LLVM

[Also see Part 1 and Part 2 in this series.] Let’s talk about these functions: unsigned add(unsigned x) { return x + x; } unsigned shift(unsigned x) { return x << 1; } From the point of view of the C and C++ abstract machines, their behavior is equivalent: in a program you’re writing, you…

July 31, 2020
You Might as Well Be a Great Copy Editor

An early draft of a paper, blog post, grant proposal, or other piece of technical writing typically has many problems. Some of these are high-level issues, such as weak motivation, sections in the wrong order, or a key description that is difficult to understand because it lacks an accompanying figure. These problems need to be…

June 24, 2020
The Saturation Effect in Fuzzing

This piece is co-authored by Alex Groce and John Regehr. Here’s something we’ve seen happen many times: We apply a fuzzer to some non-trivial system under test (SUT), and initially it finds a lot of bugs. As these bugs are fixed, the SUT sort of becomes immune to this fuzzer: the number of new bugs…

June 17, 2020
Alive2 Part 2: Tracking miscompilations in LLVM using its own unit tests

[This piece is co-authored by Nuno P. Lopes and John Regehr.] Alive2 is a formal verification framework for LLVM optimizations. It includes multiple tools, including a plugin for `opt’ to verify whether the optimizations just run are correct or not. We gave an introduction to Alive2 in a previous post. A few years ago we…

June 3, 2020
Alive2 Part 1: Introduction

[This piece is co-authored by Nuno P. Lopes and John Regehr.] Compiler bugs threaten the correctness of almost any computer system that uses compiled code. Translation validation is a path towards reliably correct compilation that works by checking that an individual execution of the compiler did the right thing. We created a tool, Alive2, that…

March 12, 2020
Precision Opportunities for Demanded Bits in LLVM

[Although this post was written to stand by itself, it builds on the previous one. It is authored by Jubi Taneja, Zhengyang Liu, and John Regehr.] When designing computer systems, it can be useful to avoid specifying behaviors too tightly. For example, we might specify that a math library function only needs to return a…

January 22, 2020
Testing Dataflow Analyses for Precision and Soundness

[This piece is co-authored by Jubi Taneja, Zhengyang Liu, and John Regehr; it’s a summary of some of the findings from a paper that we just recently completed the camera ready copy for, that is going to be published at CGO (Code Generation and Optimization) 2020.] Update from Jan 12 2020: Looks like there’s a…

January 10, 2020
Helping Generative Fuzzers Avoid Looking Only Where the Light is Good, Part 1

Let’s take a second to recall this old joke: A policeman sees a drunk man searching for something under a streetlight and asks what the drunk has lost. He says he lost his keys and they both look under the streetlight together. After a few minutes the policeman asks if he is sure he lost…

November 4, 2019