Embedded in Academia

Author: regehr

Looking for Missed Alarm Bugs in a Formal Verification Tool

[This piece is co-authored with Vsevolod Livinskii.] Formal verification isn’t some sort of magic pixie dust that we sprinkle over a computer system to make it better. Real formal verification involves a lot of the same kind of difficult, nasty, grungy engineering work that any other systems-level job involves. Furthermore, the verification tools themselves are…

September 4, 2024
Dataflow Analyses and Compiler Optimizations that Use Them, for Free

Compilers can be improved over time, but this is a slow process. “Proebsting’s Law” is an old joke which suggested that advances in compiler optimization will double the speed of a computation every 18 years — but if anything this is optimistic. Slow compiler evolution is never a good thing, but this is particularly problematic…

April 20, 2024
Why Do Peephole Optimizations Work?

In its original form, a peephole optimization applied to a collection of instructions located close together in a program. For example, in a register transfer language we might find this sequence of instructions: r0 = xor r8, -1 r1 = xor r9, -1 r0 = and r0, r1 Here, assuming the two’s complement representation, -1…

November 1, 2023
Formal-Methods-Based Bugfinding for LLVM’s AArch64 Backend

[This piece is co-authored by Ryan Berger and Stefan Mada (both Utah CS undergrads), by Nader Boushehri, and by John Regehr.] An optimizing compiler traditionally has three main parts: a frontend that translates a source language into an intermediate representation (IR), a “middle end” that rewrites IR into better IR, and then a backend that…

June 6, 2022
High-Throughput, Formal-Methods-Assisted Fuzzing for LLVM

[This piece is coauthored by Yuyou Fan and John Regehr] Mutation-based fuzzing is based on the idea that new, bug-triggering inputs can often be created by randomly modifying existing, non-bug-triggering inputs. For example, if we wanted to find bugs in a PDF reader, we could grab a bunch of PDF files off the web, mutate…

May 31, 2022
A Close Look at a Spinlock

The spinlock is the most basic mutual exclusion primitive provided by a multiprocessor operating system. Spinlocks need to protect against preemption on the current CPU (typically by disabling interrupts, but we’ll ignore that aspect in this post) and also against attempts by other cores to concurrently access the critical section (by using atomic memory operations).…

November 6, 2021
llvm-reduce

Test-case reduction is more or less a necessity when debugging failures of complex programs such as compilers. Automated test-case reduction is useful not only because it allows developers to avoid wasting time reducing inputs by hand, but also because it supports new techniques such as automatically triaging bulk failures seen in the field or during…

May 13, 2021
Responsible and Effective Bugfinding

NB: This piece is not about responsible disclosure of security issues. For almost as long as people have written code, we have also worked to create methods for finding software defects. Much more recently, it has become common to treat “external bug finding” — looking for defects in other people’s software — as an activity…

August 17, 2020
Alive2 Part 3: Things You Can and Can’t Do with Undef in LLVM

[Also see Part 1 and Part 2 in this series.] Let’s talk about these functions: unsigned add(unsigned x) { return x + x; } unsigned shift(unsigned x) { return x << 1; } From the point of view of the C and C++ abstract machines, their behavior is equivalent: in a program you’re writing, you…

July 31, 2020
The Gods Pocket Peak Trail

Years ago my friend Derek heard that the Jarbidge Wilderness — a remote, mountainous area along the Idaho-Nevada border — is one of the least-visited wilderness areas in the USA, and we’ve wanted to visit since then. There’s little good information about this area online, but a book I had sitting around listed The Gods…

July 23, 2020