Crashy Compiler Flags

This post is for fun; no deep thoughts will be presented. For a research project that’s not yet ready to write up, I needed a bunch of real programs (as opposed to, for example, programs generated by Csmith) that cause compilers to crash. So I built a few hundred randomly chosen revisions of GCC and LLVM/Clang and then built about 1,700 packages from source on a 64-bit Ubuntu box using a “compiler interceptor” — a collection of aliases such as “gcc” and “g++” that lead to a script that eventually runs the intended compiler but first invokes a few of the random compiler versions. If one of the random compilers crashes, the interceptor saves enough information to reproduce the problem: basically the compiler command line and the preprocessed source file. Not counting a large number of trivial invocations by autotools, the compiler interceptor was called about 274,000 times, resulting in about 4,000 reproducible compiler crashes (about 2,000 crashes were not considered usefully reproducible by my scripts because, for example, gcc was being used as a linker or assembler).
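To make the interceptor idea concrete, here is a minimal sketch in C. The real interceptor was a shell script sitting behind “gcc” and “g++” aliases; the compiler paths, the save_reproducer() helper, and the crash test below are invented for illustration, and details such as saving the preprocessed source and redirecting -o output are omitted.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run a compiler with the caller's argument vector; return 1 if it died on a
   signal, which is the proxy used here for "the compiler crashed". */
static int run_and_check(const char *compiler, char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        execvp(compiler, argv);
        _exit(127); /* exec itself failed */
    }
    int status;
    waitpid(pid, &status, 0);
    return WIFSIGNALED(status);
}

/* Record enough information to reproduce the crash; a real interceptor would
   also save the preprocessed source, e.g. by re-running with -E. */
static void save_reproducer(const char *compiler, char *const argv[])
{
    FILE *f = fopen("/tmp/compiler-crashes.log", "a");
    if (!f)
        return;
    fprintf(f, "%s", compiler);
    for (int i = 1; argv[i]; i++)
        fprintf(f, " %s", argv[i]);
    fprintf(f, "\n");
    fclose(f);
}

int main(int argc, char *argv[])
{
    (void)argc;
    /* hypothetical paths to a few randomly chosen compiler builds */
    static const char *const random_cc[] = {
        "/opt/random-compilers/gcc-r160000/bin/gcc",
        "/opt/random-compilers/clang-r98000/bin/clang",
    };
    for (size_t i = 0; i < sizeof(random_cc) / sizeof(random_cc[0]); i++)
        if (run_and_check(random_cc[i], argv))
            save_reproducer(random_cc[i], argv);
    /* finally run the compiler the build system actually asked for */
    execvp("/usr/bin/gcc.real", argv); /* hypothetical path to the real gcc */
    return 127;
}
```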

Modern build systems are not very pretty and consequently compiler command lines are fat and redundant: during my compiling party, the compiler was invoked with command lines up to about 20 KB long. To find out which arguments were leading to compiler crashes, I wrote a little delta debugger that iteratively attempted to remove arguments until a fixpoint was reached, under the rule that we can only remove arguments that don’t make the crash go away. Additionally, this program attempted to downgrade the optimizer flags, for example turning -Os and -O3 into -O2, -O2 into -O1, etc. The tables below show the results, ranked by number of occurrences in the ~4,000 reproducible crashes.
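Here is a hedged sketch of the reduction loop, again in C: greedily drop one argument at a time, keep the removal only if the crash persists, and iterate to a fixpoint. The crashes() test (the compiler dying on a signal) and the command-line handling are simplifications of what the real script did, and the optimization-flag downgrading step is left out.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the candidate command line; return 1 if the compiler dies on a signal. */
static int crashes(char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127); /* exec itself failed */
    }
    int status;
    waitpid(pid, &status, 0);
    return WIFSIGNALED(status);
}

/* Greedy reduction: try deleting each argument (never argv[0], the compiler
   itself); keep a deletion only if the crash persists; repeat to a fixpoint.
   argv must be NULL-terminated; returns the new argument count. */
static int reduce(char *argv[], int argc)
{
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 1; i < argc; i++) {
            char *saved = argv[i];
            /* tentatively drop argument i (the NULL terminator shifts too) */
            memmove(&argv[i], &argv[i + 1], (argc - i) * sizeof(char *));
            argc--;
            if (crashes(argv)) {
                changed = 1; /* still crashes: keep the shorter command line */
            } else {
                /* the crash went away, so this argument is needed: restore it */
                memmove(&argv[i + 1], &argv[i], (argc - i + 1) * sizeof(char *));
                argv[i] = saved;
                argc++;
            }
        }
    }
    return argc;
}

/* Example use: ./reduce /path/to/crashing-compiler <args...> prints the
   reduced command line. */
int main(int argc, char *argv[])
{
    if (argc < 2)
        return 1;
    int n = reduce(&argv[1], argc - 1);
    for (int i = 0; i < n; i++)
        printf("%s ", argv[1 + i]);
    printf("\n");
    return 0;
}
```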

When reading the results, keep a few things in mind:

  • Every option that is listed is actually needed to make the compiler crash. Why would -Wall induce a crash? I didn’t look into it but there must be a bug in the code emitting warnings.
  • These results don’t reflect the behavior of the current versions of GCC and LLVM/Clang, but rather the range of revisions that I tested. These are roughly GCC 4.0-current and LLVM/Clang 2.6-current. As the numbers show, clang++ was an extremely immature compiler early on; this of course doesn’t imply anything about the current quality of this compiler.

gcc

# crashes options
12 -O1
9 -O3
8 -Wall -O2
6 -Os -g -fno-asynchronous-unwind-tables
4 -O2
1 -std=gnu99 -O1
1 -std=gnu99 -O3
1 (none)

g++

# crashes options
147 (none)
13 -O1
3 -std=c++0x
2 -O3

clang

# crashes options
644 -mavx -O2
206 (none)
152 -mavx
63 -mavx -O1
58 -fexceptions
5 -fgnu89-inline -fexceptions
2 -O1 -fPIC
2 -fgnu89-inline -fPIC
1 -fgnu89-inline -g -fPIC
1 -fPIC
1 -O1
1 -O3
1 -Wall

clang++

# crashes options
2829 (none)
8 -std=c++0x
4 -std=c++0x -g
1 -Wall -std=c++0x
1 -g -O1
1 -g
1 -O1

One thing that comes to mind while looking at these results is that I’d have expected -O3 to show up more often.

MSCS

A master of science degree in computer science can mean two very different things:

  1. The research MS where the student works closely with an advisor on a research project that culminates in a thesis and ideally a few papers. This kind of student is generally paid as a TA or RA and can be expected to be a significantly stronger writer, researcher, and public speaker after finishing the program. Most of these students have a bachelor’s degree in CS. It is understood that any strong research MS student will be encouraged to pursue a PhD.
  2. The coursework-only MS is best viewed as an extension of the undergraduate degree. The student pays tuition and does not work closely with a professor. A significant number of these students do not have a BS degree in CS, but rather are using the MS as an opportunity to get up to speed and to become credentialed in a field that has strong job opportunities. These students emerge from the program knowing more than they did before, but they do not generally get much training in public speaking or writing and they do not generally leave the program with the added degree of maturity that we see in MS students who have finished a thesis. It is not expected that these students will continue on for a PhD.

Some CS departments specialize in one of these, others (such as mine) support both.

Anyway, here’s the thing, folks. The large number of coursework-only MS degrees that we are collectively granting is eroding whatever prestige and credentialing value was previously associated with an MS. To put it bluntly, people have caught on. For example, see this short essay by Aline Lerner (or, see it at Forbes if Quora’s stupid modal popup annoys you). The punchline is:

Whereas MS degrees used to be a means for departments to begin vetting future PhD students, I believe that the purpose has, in some ways, shifted to be a cash cow for the university in question.

The thesis is that job applicants with MS degrees are often weaker than those with BS degrees. I see similar effects here at Utah where I’m often the instructor for a mixed graduate and undergraduate operating systems course. The best MS students and the best undergrads are extremely strong. However, the median-quality MS student is weaker than the median-quality undergrad. A lot of this is caused by the MS students who don’t have a CS background: they simply are not ready for a serious upper-division CS class. I used to really hate that we mix the grads and undergrads in a single course but at some level it’s not that bad since it avoids the problem Aline alludes to where the grad-level course can easily end up being weaker than the undergrad version because it (often mistakenly) assumes that the students already understand the fundamentals.

The “cash cow” comment refers to programs where the revenue from an MS program exceeds the costs. University finances are hard to untangle but it sounds perfectly plausible that many course-based MS programs are indeed being used as cash cows. Of course the extra money does not turn into profits; rather, it is used to shore up financial weaknesses elsewhere. The problem with the cash cow model is that it creates an incentive to maximize the number of students in the program, and this is clearly not always in the students’ best interests.

Last week, while talking to an undergrad who intends to pursue an MS, I realized that probably none of what we’re talking about here is apparent to the average student applying to MS programs. So I advised her to use the following strategy:

  1. Get some solid research experience as an undergrad.
  2. Apply to a large number of MS programs.
  3. Throw away every acceptance letter that doesn’t offer funding, and consider throwing away each one that offers just a TA.
  4. Choose an appropriate school from the remaining acceptance letters.

This will, I think, give her good odds of landing at a place where she can get some real advising rather than just taking more classes.

Of course I can’t foresee how this all will play out, but it is certainly interesting that Udacity and Georgia Tech are making a grab for a much larger share of the course-only MS market with their recently-announced $7000 online MSCS. If this works out for them, it’s going to cause a real thinning out of the cash cows.

UPDATE:

There’s lots of good discussion on HN and I want to respond to, and add, a few points:

  • A number of people have mentioned situations in which a course-based MS makes sense: (1) your employer values it and (2) the classes are useful. Of course these are absolutely valid reasons to get a course-based MS, provided that the value exceeds the costs in terms of your time and money.
  • Wilfra mentions “disdain for people who don’t have a BS CS who want to get an MS CS.” It was certainly not my intent to express disdain for people who follow this path! Many people pull it off admirably. However, it is a difficult path to follow because it skips all of the introductory programming classes. This ends up getting some people in over their heads. There are many ways to avoid that problem such as programming on your own, taking MOOCs, etc.
  • At many institutions, the responsibility for not getting in over one’s head is placed largely on the students. While this is in some ways admirable, it does lead to some amount of unnecessary suffering.
  • One way to understand an MS program is to see how carefully it is firewalled off from the BS and PhD programs. The more isolated it is, the more likely it is to be a cash cow program. Also, the more degrees are granted each year, the more likely it is to be a cash cow.
  • Matt Welsh notes in comment 12 that international students often use an MS program as a gateway to getting a PhD in the USA. Also, it is used as a step towards working here.

Thanks for the discussion.

Computer Science Culture Clash

It’s not uncommon for an empirical CS researcher to get a review saying something like “Sure, these results look good, but we need to reject the paper since the authors never proved anything about the worst case.” Similarly, when I interviewed for faculty jobs ten years ago, a moderately famous professor spent a while grilling me about the worst-case performance of a static analysis tool that I had written. This was, to me, an extremely uninteresting topic but luckily there’s an easy answer for that particular class of tool. I recall noticing that he did not seem particularly interested in what the tool did, or if it was actually useful.

Problems like these are yet another consequence of the deep and wide divide between the math and engineering sides of computer science. In caricature form, the two approaches work like this:

  • An engineer sees that characterizing the worst-case performance of a lawnmower is a fool’s errand, and focuses instead on work that will enable the creation of more effective and less expensive mowers.
  • A mathematician also sees that there’s no way to make any guarantees about the worst-case performance of a lawnmower, and responds by creating an abstract model of mowing that is easier to do proofs about: “Assume an infinite supply of gasoline, a frictionless engine, and a circular lawn of unit radius…”

The problem occurs when these people need to evaluate each others’ work. Both are likely to turn up their noses.

My own view is that guarantees are a wonderful thing. For example, the purpose of the static analysis tool that I was grilled about while interviewing was to make guarantees about embedded systems (just to be clear: I am very interested in guarantees about embedded systems, and wholly uninterested in guarantees about tools that analyze embedded systems). However, we need to keep in mind that guarantees (and their friends, impossibility results) are mathematical statements that abstract away a lot of real-world detail. Of course, abstraction is at the core of computer science and even the most hard-nosed engineers among us must abstract away details in order to accomplish anything. But some abstractions are good — they capture important details while dropping irrelevant clutter — whereas others drop something essential.

One of my favorite stories where the mathematical side of CS rescued the engineers comes from the early development of optimizing compilers. The basic idea behind optimization is to repeatedly replace part of the program with a faster piece of code that is functionally equivalent. The problem is that without a good definition of “functionally equivalent” some optimizations are going to break some programs and when this happens we have no good basis for determining whether it is the program or the compiler that is at fault. As an example, we might ask if it is OK for an optimizing compiler to remove a delay loop from a program. Clearly this will break programs that rely on the delay. Anyhow, as I understand the history, throughout the 1950s and 1960s this basic ambiguity did not deter the engineers from creating some nice optimizing compilers but there were persistent low-grade problems in figuring out which transformations were legal. Subsequently, people were able to develop theories — dataflow analysis and abstract interpretation — relating program semantics and compiler optimizations in such a way that it became possible to figure out whether any particular optimization is valid or not. These theories are concrete and pervasive enough that, for example, the string “lattice” appears about 250 times in the LLVM source code.
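To make the delay-loop example concrete, here is a tiny C illustration; exactly which compilers remove the loop at which optimization levels varies, but the language rules permit the removal because the loop has no observable effect.

```c
/* A delay loop with no side effects: an optimizer is allowed to delete it,
   which breaks any code that relied on the delay actually taking time. */
void delay(void)
{
    for (unsigned long i = 0; i < 100000000UL; i++)
        ; /* empty body, nothing observable happens */
}

/* Making the counter volatile (or, better, using a real timer) is one way to
   force the compiler to keep the loop. */
void delay_kept(void)
{
    for (volatile unsigned long i = 0; i < 100000000UL; i++)
        ;
}
```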

The point is that both the engineering and math sides can benefit from keeping the dialog open. Thus, I’ve tried to come up with a few recommendations for different situations we might find ourselves in.

When submitting papers:

  • Is there anything to worry about? If you’re an engineer submitting to SOSP, or a mathematician submitting to STOC, all reviews will be written by your own kind of people, so there’s no problem.
  • Engineers: Your first impression will likely be that the mathematical side has nothing to say about your work because you are solving real-world problems. This may be true or it may not be, but in any case don’t be ignorant. Read and understand and cite relevant work and describe how it relates to your research.
  • Mathematicians: The engineers want something that is, or could be, useful. If your work advances the state of the art in a way that could be transitioned into practice, say so and provide some evidence. In other words, help the reviewers understand why your work is of interest to people who are not mathematicians.
  • Make your paper’s contributions extremely clear, and moreover make it clear what kind of contributions they are. The second part is difficult and important. Then, regardless of how clear you have been, be ready for the engineers / mathematicians to fail to see the value in the work.
  • Be honest about your contributions. If you’re really an engineer, you’re probably on very shaky ground when proposing a fundamental new model of computation. If you’re really a mathematician, don’t start your paper with a page about making life better for users and then spend the remaining nine pages proving lemmas.
  • Know when to admit defeat. There are some communities that are simply not going to accept your papers reliably enough that you can make a home there. If you don’t want to change your style of work, you’ll need to take your business elsewhere. It’s their loss.

When reviewing papers:

  • Don’t dismiss the work out of hand just because it fails to attack any problem that you care about. Try to understand what the authors are trying to accomplish, and try to evaluate the paper on those terms. This is hard because they will often not state the goals or contributions in a way that makes sense or seems interesting to you.
  • Mathematicians: If your criticism is going to be something like “but you failed to prove a lower bound,” think carefully about whether the authors’ stated contributions would be strengthened by a lower bound, or rather if it is simply you who would like to work on the lower bound.
  • Engineers: If your criticism is going to be something like “you failed to evaluate your work on 10,000 machines,” think carefully about whether this makes sense to say. If the authors’ model has rough edges, can they be fixed or are they fundamentally at odds with the real world? Are there any insights in this paper that you or another engineer could use in your own work?
  • Mathematicians: If your criticism is going to be something like “this is a trivial application of known techniques,” be careful because the gap between knowing something and making it work is often wide. Also, it is a success, rather than a failure, if the engineers were able to readily adapt a result from the mathematical side. Of course, if the adaptation really is trivial then call them on it.
  • Engineers: If your criticism is going to be something like “this seems needlessly complicated” try to keep in mind that the penalty for complexity in math is less (or at least it affects us differently) than in engineering. Often the added complexity is used to get at a result that would be otherwise unattainable. Of course, if the complexity really is useless, call them on it.
  • If you are truly not qualified to review the paper, try to wiggle out of it. Often an editor or PC chair will be able to swap out the paper — but only if they are asked early enough. At the very least, clearly indicate that your review is low-confidence.

I hope that I’ve made it clear that close interaction between the mathematicians and engineers is an opportunity rather than a problem, provided that each side respects and pays attention to the other. I should add that I’ve made mistakes where there was a piece of theoretical CS research that I thought was useless, that turned out not to be. Since then I’ve tried to be a little more humble about this sort of thing.

In this piece I’ve sort of pretended that engineers and mathematicians are separate people. Of course this is not true: most of the best people can operate in either mode, at least to a limited extent.

Foothill Sunset

I went for a hike last night to celebrate being out from under whatever virus made me more or less sick for most of the last month.

The foothill wildflowers are more subdued than the ones that will cover the big mountains in July and August.

Four mountain ranges and the Great Salt Lake.

Looking down at the university.

I stopped a little below 7000′ and sat on a rock for 45 minutes waiting for sunset.

It was full night by the time I got home; should have brought a headlamp!

Procedural Decomposition

While teaching a CS class I spend quite a bit of time looking over the shoulders of students whose code doesn’t work. Sometimes they have a simple mistake and I’ll either point it out or ask a question that will lead them to the problem. However, other times the code is just generally not very good and I’ve always sort of struggled to find the right things to say; it’s hard because it’s usually a matter of judgement and taste.

Recently I’ve started to notice a pattern where students who are having trouble getting their code to work often have not done enough procedural decomposition. That is, they have failed to appropriately break up a large, difficult programming problem into a collection of smaller ones. You are probably thinking to yourself that this is pretty basic stuff, and perhaps it is when we’re solving easy problems. On the other hand, I’ve come to believe that procedural decomposition for hard problems is probably one of those skills that we can keep getting better at for as long as we care to keep writing code.

An Example

In my operating systems class this spring, I asked the students to do a bit of hacking on Linux’s driver for the Minix filesystem. The Minix FS is a reasonable starting point because it’s about as simple as realistic filesystems get. The assignment was to change the code so that instead of representing a file as a tree of disk block numbers, a file is represented as a list of extents. An extent is a collection of contiguous disk blocks that can be represented as a (base, length) pair. For purposes of this assignment, we make the unrealistic assumption that indirect extents are not needed.

The crux of this assignment was rewriting the Minix FS driver’s get_block() function, which finds a block of a file on the disk and then maps it into the block cache; this is nontrivial because the requested block must be created if it doesn’t already exist (and a “create” flag is set). My first attempt at get_block() — as well as many students’ first attempts — was a monolithic function. However, the logic turns out to be just hairy enough that this is difficult to get right. On the other hand, the job can be broken down into several pieces:

  • tiny utility functions for packing and unpacking the (base, length) pairs that are used to represent extents on disk
  • a lookup function that finds a block of a file in the extent list if it exists, and fails otherwise
  • an “extend file” function that makes a file bigger either by one block or by a specified number of blocks

Once these are available, get_block() needs only about a dozen lines of code; a sketch of this decomposition follows. The tricky thing isn’t figuring out this particular decomposition — which is kind of obvious — but rather recognizing that it’s time to think about breaking up the problem rather than attacking it all at once. The temptation to keep banging on a function that seems to be almost correct is strong.
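Here is a rough sketch of what that decomposition might look like, written as a standalone C illustration rather than actual kernel code: the types and names (struct xfile, the 24/8-bit extent packing, the trivial block allocator) are all invented, and error handling is minimal.

```c
#include <stdint.h>

#define MAX_EXTENTS 7

struct extent { uint32_t base; uint32_t length; };

struct xfile {
    struct extent extents[MAX_EXTENTS];
    int nr_extents;
    uint32_t next_free_block; /* stand-in for a real block allocator; must
                                 start nonzero, since 0 means "no block" */
};

/* tiny helpers for an (invented) on-disk encoding: 24-bit base, 8-bit length */
uint32_t pack_extent(struct extent e) { return (e.base << 8) | (e.length & 0xffu); }
struct extent unpack_extent(uint32_t raw)
{
    struct extent e = { raw >> 8, raw & 0xffu };
    return e;
}

/* find the disk block holding logical block n of the file, or 0 if the file
   does not yet contain block n */
uint32_t lookup_block(const struct xfile *f, uint32_t n)
{
    for (int i = 0; i < f->nr_extents; i++) {
        const struct extent *e = &f->extents[i];
        if (n < e->length)
            return e->base + n;
        n -= e->length;
    }
    return 0;
}

/* grow the file by one block, extending the last extent when the new block is
   contiguous with it; returns 0 on success, -1 when out of extent slots */
int extend_file(struct xfile *f)
{
    uint32_t blk = f->next_free_block++;
    if (f->nr_extents > 0) {
        struct extent *last = &f->extents[f->nr_extents - 1];
        if (last->base + last->length == blk) {
            last->length++;
            return 0;
        }
    }
    if (f->nr_extents == MAX_EXTENTS)
        return -1;
    f->extents[f->nr_extents].base = blk;
    f->extents[f->nr_extents].length = 1;
    f->nr_extents++;
    return 0;
}

/* with the helpers in place, get_block() itself stays short and checkable */
int get_block(struct xfile *f, uint32_t n, int create, uint32_t *out)
{
    uint32_t blk = lookup_block(f, n);
    while (blk == 0) {
        if (!create || extend_file(f) != 0)
            return -1;
        blk = lookup_block(f, n);
    }
    *out = blk;
    return 0;
}
```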

What’s Going On?

I asked some colleagues: Are we teaching procedural decomposition? And are we doing a good job? Well, of course we are teaching this and in fact one of the people who do that here, Matthew Flatt, is an author of How To Design Programs, one of the best books I know of for teaching abstraction and decomposition (and it’s free). But even so, the fact remains that in upper-division classes I’m getting a good number of students who are not performing procedural decomposition even in a situation where it is clearly necessary.

Since I’ve started thinking about procedural decomposition more, I’ve noticed that I don’t do enough of it either. When I catch myself writing poorly abstracted code, I try to analyze what I’m thinking and usually it boils down to a mixture of laziness (“if I can just get this stupid function to work, I won’t have to figure out how to break it up into pieces”) and overconfidence in my ability to deal with a complex case analysis all at once. I’m guessing that similar things are going on in students’ heads.

The purpose of the rest of this piece is to take a closer look at procedural decomposition and how we might be able to do a better job teaching it. But first, a quick note about terminology: the thing I’m calling procedural decomposition is also sometimes referred to as procedural abstraction, functional abstraction, or functional decomposition.

Reuse Isn’t the Main Thing

I think a lot of us were taught that the main reason we break code into functions is to facilitate reuse. In this view, we primarily write quicksort as a separate function so that we can call it from multiple locations in our program. It’s usually not too hard to notice when this kind of decomposition is necessary: we just get tired of writing quicksort from scratch after the second or third time. In contrast, procedural decomposition of a single problem is more difficult because it’s not as obvious where to introduce abstraction.

Decomposition and Problem Difficulty

I sometimes notice myself doing a much nicer job of procedural decomposition when solving simple problems than when solving complex ones. Of course if I were programming rationally it would be the other way around. Perhaps this effect can be understood by saying that I have a fixed amount of brain power to devote to a programming problem. In this case, when I’m totally overwhelmed by complexity I have no resources left for the added burden of figuring out how to best perform procedural decomposition. Of course, once I step back, delete some code, and find a good decomposition, I can surmount some sort of complexity barrier and probably find a nice landscape of simple code on the other side.

Years ago I heard Randy Pausch give a talk describing how people who tried to fly a virtual magic carpet were overwhelmed by the many degrees of freedom offered by a prototype version of the controls. In order to simplify the situation, they left the carpet’s throttle at the maximum setting. This reduced the degrees of freedom by one, but with the unfortunate side effect of causing them to careen stupidly through the game, running into everything. It seems like this might be a good analogy for how many of us approach a difficult programming problem.

Decomposition and Language Choice

There’s clearly a link between procedural decomposition and programming language choice. For example, I learned to program in early microcomputer BASIC dialects (TI BASIC, GW-BASIC, and Applesoft BASIC) where the support for procedural abstraction was not even as good as it is in assembly language. I sometimes fear that spending a few formative years writing BASIC may have stunted my capacity for abstraction.

One of the things that made the old BASIC implementations such turds is that they had no support for creating multiple namespaces. C is better, but the lack of nested functions and multiple return values can make it hard to perform elegant procedural decomposition there. C++ and Java don’t support nested functions either, but the object model seems to approximate them well enough for practical purposes. Modern languages with tuples, nested functions, and first-class functions seem to offer the best support for procedural decomposition.

Historically, we were discouraged from performing too much procedural decomposition because function calls added overhead. C/C++ programmers have been known to write horrific macros in order to get decomposition without increasing runtime. The modern wisdom — regardless of programming language — is that any performance-oriented compiler will do a pretty good job inlining functions that are small and that don’t have a ton of call sites. When the compiler has good dynamic information, such as a JIT compiler or an AOT compiler with access to profile data, it can do a considerably better job. However, even an AOT compiler without profile data can often do pretty well given good heuristics and perhaps some hints from developers. In any case, the days of allowing performance considerations to influence our procedural decomposition are mostly over.

Teaching Procedural Decomposition

I think that too often, our programming assignments are so canned that a solid grasp of procedural abstraction isn’t necessary to solve them. Moreover, we tend to not do a great job of grading on code quality — and of course the students are allowed to throw away their code after handing it in, so they have little incentive to do real software engineering. Of course there are plenty of students who write very good code because they can, and out of a general sense of pride in their work. But they’re not the ones I’m talking about trying to help here.

Maybe something we need to do is help students learn to recognize the danger signs where the complexity of a piece of code is starting to get out of control. Personally, I get sort of a panicky feeling where I become more and more certain that each fix that I add is breaking something else. I hate this so much that it’s making my skin prickle just writing about it. Usually, the best thing to do when this happens is step back and rethink the structure of the code, and probably also delete a bunch of what’s already written.

Finally, we probably need to do more code reading in classes. This is difficult or impossible for a large class, but probably can be done well with up to 10 or 15 students. As a random example, the Minix FS module for the Linux kernel contains a rather nice decomposition of the functionality for managing the block tree for each file.