Embedded in Academia

Category: Computer Science

Write Fuzzable Code

Fuzzing is sort of a superpower for locating vulnerabilities and other software defects, but it is often used to find problems baked deeply into already-deployed code. Fuzzing should be done earlier, and moreover developers should spend some effort making their code more amenable to being fuzzed. This post is a non-comprehensive, non-orthogonal list of ways…

August 19, 2019
Design and Evolution of C-Reduce (Part 2)

Part 1 of this series introduced C-Reduce and showed how it combines a domain-independent core with a large collection of domain-specific passes in order to create a highly effective test-case reducer for C and C++ code. This part tells the rest of the story and concludes. Parallel Test-Case Reduction C-Reduce’s second research contribution is to…

August 12, 2019
Design and Evolution of C-Reduce (Part 1)

[This piece is posted in parallel on the IEEE Software blog. Karim Ali copyedited.] Since 2008, my colleagues and I have developed and maintained C-Reduce, a tool for programmatically reducing the size of C and C++ files that trigger compiler bugs. C-Reduce also usually does a credible job reducing test cases in languages other than…

July 9, 2019
It’s Time for a Modern Synthesis Kernel

Alexia Massalin’s 1992 PhD thesis has long been one of my favorites. It promotes the view that operating systems can be much more efficient than then-current operating systems via runtime code generation, lock-free synchronization, and fine-grained scheduling. In this piece we’ll only look at runtime code generation, which can be cleanly separated from the other…

July 2, 2019
Verifying Popcount

Popcount is the function that returns the number of set bits in its argument. Showing that a popcount implementation does what it claims to do has become one of my favorite examples to use when I need to quickly show students how we can reason about programs mathematically. Something like a selection sort is probably…

April 24, 2019
Walking or Biking to NSF

Since the National Science Foundation funds a large fraction of academic computer science research in the USA, we often end up traveling to Washington to visit the NSF. This post is just to say that if you are traveling light, if you need some exercise, if you have a bit of free time, and if the…

December 13, 2018
Synthesizing Constants

(See this blog post for a short introduction to synthesis, or this paper for a long one.) In this piece I want to discuss an aspect of program synthesis that sounds like it should be easy, but isn’t: synthesizing constant values. For example, consider trying to synthesize an optimized x86-64 implementation of this code: The…

November 7, 2018
Learning When Values are Changed by Implicit Integer Casts

C and C++ perform implicit casts when, for example, you pass an integer-typed variable to a function that expects a different type. When the target type is wider, there’s no problem, but when the target type is narrower or when it is the same size and the other signedness, integer values may silently change when…

November 2, 2018
What’s the difference between an integer and a pointer?

(This piece is an alternate introduction and advertisement for a soon-to-be-published research paper.) In an assembly language we typically don’t have to worry very much about the distinction between pointers and integers. Some instructions happen to generate addresses whereas others behave arithmetically, but underneath there’s a single data type: bitvectors. At the opposite end of…

September 18, 2018
Future Directions for Optimizing Compilers

I wanted to write a manifesto-ish sort of piece about what compilers are supposed to look like in the future. Nuno was the obvious coauthor since I’ve talked to him about this topic so much that I’m overall not really sure which parts started out as his ideas and which were mine. The article didn’t…

September 9, 2018