The Citation Telephone Game


My kids often come home from school spouting crazy “facts” they’ve learned from classmates. It seems fundamentally human to repeat stories and, in the repeating, alter them—often unintentionally. Researchers do the same thing, and just this morning I was irritated to read an entirely inaccurate citation of one of my own papers. No doubt others have had similar feelings while reading my work.

The Leprechauns of Software Engineering, Laurent Bossavit’s in-progress ebook (or e-screed, perhaps), contains wonderfully detailed examples of how some well-known facts in the software engineering field, such as “bugs get more expensive to fix as time passes,” have pretty dubious origins. The pattern is that we start with an original source that makes certain claims, hopefully based on empirical evidence. Subsequent papers, however, tend to drop details or qualifications, using citations to support claims that, over time, diverge more and more from those in the original paper. In science, these details and qualifications matter: just because a fact is true under certain circumstances does not mean that it generalizes. Worse, the fact may not even be true in its original form, due to statistical issues, flaws in experiment design, and the like. Complicating matters further, Bossavit seems to be finding cases where the slant introduced during citation is self-serving.

One story in Leprechauns made me laugh out loud: Bossavit was lecturing his class on a particular piece of well-known software engineering lore and realized halfway through that he wasn’t sure whether what he was saying was true, why it would be true, or whether it made sense at all. Something similar has happened to me many times.

Although Leprechauns takes all of its examples from the software engineering field, I have no doubt that something similar could be written about any research area where empirical results are important. Bossavit’s overall message is that the standards for science need to be set higher. In particular, authors must read and understand a paper before citing it. Of course this should be done, but it’s not a total solution to the telephone game. As I think I’ve pointed out here before, the actual contribution of a research paper is often different from the claimed contribution. Or, to put it another way, we first need to understand what the authors intended to say (often this is not easy) and then we also need to understand what was left unsaid. A subtle reading of a paper may require a lot of background work, including reading books and papers that were not cited in the original.

Trilobite Day Trip

A great thing about kids is that they provide an excuse to read a book aloud, make popsicles, spend an afternoon skipping rocks, or hike up a random mountainside to look for fossils. So when Ben—a fountain of knowledge about remote and little-known Utah attractions—recently posted about visiting a trilobite-bearing outcrop of Spence shale in the Bear River Mountains, I knew what our next family road trip would be.

The Bear River Range is one of Utah’s 50 or so mountain ranges, and one where I’ve never spent much time; it sits to the east of Logan, which is itself about a 1.5-hour drive from Salt Lake City. The Spence shale outcrops are a half-hour hike up a ridge, starting from the end of the High Creek road, a popular trailhead and camping area. No trails go up the ridge, but it’s easy (if steep) cross-country travel.

Looking for fossils can be frustrating, but that was not at all the case here; there were tons of trilobites, though mostly fragments or pretty small ones. We’ve been fossil hunting enough times now to have a bit of an idea how to go about it. The basic process is to find a piece of rock and split it along natural fracture lines into thin pieces. Then, inspect the newly exposed surfaces and repeat. There is no way to do this without getting extremely dirty. Sunglasses are necessary since rock chips always head directly for the eyes, and gloves are good too. I have a regular mason’s hammer, but it’s pretty sharp, so I found some small ball-peen hammers for the kids to use.

Overall this was a great day outing. It’s a little awe-inspiring to unearth creatures that last saw the light half a billion years ago.

The Hidden Cost of Compiler Bugs

I have a hypothesis that compiler bugs impose a noticeable but hard-to-measure tax on software development. I’m not talking so much about compiler crashes, although they are annoying, but more about cases where an optimization or code generation bug causes a program to incorrectly segfault or generate a wrong result. Generally, when looking at a test case that triggers one of these bugs, I can’t find any reason why analogous code could not be embedded in some large application. As a random example, earlier this year a development version of GCC would miscompile this code when a and b have initial value 0:

b = (~a | 0 >= 0) & 0x98685255F;
printf ("%d\n", b < 0);
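
For anyone who wants to try this at home, here is a minimal complete program wrapping the snippet. The surrounding scaffolding is mine, and I’m assuming both variables are plain ints initialized to zero as described above:

#include <stdio.h>

int main (void)
{
  int a = 0;  /* both variables start out at 0, per the description above */
  int b = 0;
  b = (~a | 0 >= 0) & 0x98685255F;
  /* with 32-bit ints, the implementation-defined truncation in the
     assignment leaves b negative, so a correct build should print 1 */
  printf ("%d\n", b < 0);
  return 0;
}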

The code looks stupid, but such things can easily arise out of macro expansion, constant propagation, function inlining, or a hundred other common transformations. Compiler bugs that are exposed by inlining are pernicious because they may not be triggered during separate compilation, and therefore may not be detectable during unit testing.
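
To make that concrete, here is a contrived sketch of how an innocuous-looking macro can turn into the same sort of constant-laden expression once the preprocessor and constant propagation have done their work (the macro and constants are invented purely for illustration):

#include <stdio.h>

/* hypothetical helpers; the names are made up for this example */
#define BUF_SIZE 1
#define IN_RANGE(x, lo, hi) ((x) >= (lo) && (x) <= (hi))

static int buf[BUF_SIZE];

int main (void)
{
  int idx = 0;
  /* after preprocessing and constant propagation, the condition below is
     effectively "0 >= 0 && 0 <= 0"; nobody writes that by hand, but the
     optimizer sees this kind of thing all the time */
  if (IN_RANGE (idx, 0, BUF_SIZE - 1))
    buf[idx] = 42;
  printf ("%d\n", buf[0]);
  return 0;
}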

The question is: What happens when your code triggers a bug like this? Keep in mind that the symptom is simply that the code misbehaves when the optimization options are changed. In a C/C++ program, by far the most likely cause for this kind of effect is not a compiler bug, but rather one or more undefined behaviors being executed by the program being developed (in the presence of undefined behavior, the compiler is not obligated to behave consistently).
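
As a classic illustration of why undefined behavior is the usual suspect, consider the following small program (a sketch of my own, not taken from any particular bug report). It executes a signed overflow, which is undefined, so an optimizing compiler is free to assume that i + 1 > i always holds and loop forever, while an unoptimized build will typically wrap around and terminate:

#include <stdio.h>
#include <limits.h>

int main (void)
{
  int count = 0;
  /* signed overflow is undefined: a typical unoptimized build wraps and
     prints 2, while an optimizer may treat "i + 1 > i" as always true and
     never terminate; neither outcome is a compiler bug */
  for (int i = INT_MAX - 2; i + 1 > i; i++)
    count++;
  printf ("%d\n", count);
  return 0;
}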

Since tools for detecting undefined behaviors are sorely lacking, at this point debugging the problem usually degenerates to a depressing process of adding break/watchpoints, adding print statements, and messing with the code. My suspicion (and my own experience, both personal and gained by looking over the shoulders of many years’ worth of students) is that much of the time, the buggy behavior goes away as the result of an incidental change, without being fully understood. Thus, the compiler bug goes undetected and lives to bite us another day.

I’m not hopeful that we can estimate the true cost of compiler bugs, but we can reduce that cost through better compiler verification and validation and through better tools for detecting undefined behaviors.

Utah Eye Candy

There are several reasons that I sometimes post outdoor pictures here. First, I like pretty things and hope that other people do as well. Second, it seems reasonable to break up an otherwise monotonous flow of picture-free text about undefined behavior and compiler bugs. Third, I’m not above doing a bit of not-so-subtle PR work for Utah, which has an image problem. Generally speaking, I don’t care (in fact, the porcupine effect benefits all of us who live here), but it rubs me the wrong way when we have a hard time recruiting grad students and faculty. Think about it, folks: if you live here you can hike or ski in mountains like this every single day before or after work. Utah CS will be recruiting faculty (and, of course, grad students) this year. Everyone has to make their own work/life tradeoff, but for a lot of us who live here these mountains and deserts make a big difference in quality of life.

These pictures are from a not-quite-successful climb of White Baldy, considered to be one of the hardest 11,000-foot (3,350 m) peaks in the Wasatch. Not quite successful because I chickened out just under the summit due to scary scrambling. I probably would have been fine with it if I hadn’t been hiking alone and if I’d been in a bit better shape.

Core Question

[This post is about machines used by people. I realize things are different in the server room.]

We had one core per socket for a long time. When multi-core processors came along, dual core seemed pretty awkward: real concurrency was possible, but with speedup bounded above by two, there wasn’t much point in doing anything trickier than “make -j2”. Except in low-end machines, two cores seems to have been a passing phase. Now, several years later, four cores is the mainstream: even some inexpensive tablets are quad core, and while it is possible to buy desktop processors with six or eight cores, they do not seem to be very common or popular. Unlike a 2x speedup, a 4x speedup is something I will definitely spend some time working for, so stalling at four may not be such a shame. But are we just pausing at four cores for another year or two, or is this going to be a stable sweet spot? If we are stuck at four, there should be a reason. A few random guesses:

  • Desktop workloads seldom benefit much from more than four cores.
  • Going past four cores puts too much of a squeeze on the number of transistors available for cache memory.
  • Above four cores, DRAM becomes a significant bottleneck.
  • Above four cores, operating systems run into scalability problems.

None of these limitations is fundamental, so perhaps in a few years four cores will be low-end and most workstations will have 16 or 32 cores?
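
To put rough numbers on the first guess above, Amdahl’s law says that if a fraction s of a workload is inherently serial, the best possible speedup on n cores is 1 / (s + (1 - s) / n). The little program below is my own back-of-the-envelope sketch (the 25% serial fraction is an arbitrary assumption, not a measurement), and it shows how quickly the returns diminish:

#include <stdio.h>

/* Amdahl's law: ideal speedup on n cores for a workload with serial fraction s */
static double amdahl (double s, int n)
{
  return 1.0 / (s + (1.0 - s) / n);
}

int main (void)
{
  const double s = 0.25;  /* assume 25% of the work is inherently serial */
  const int cores[] = { 1, 2, 4, 8, 16, 32 };
  const int n_entries = (int) (sizeof cores / sizeof cores[0]);
  for (int i = 0; i < n_entries; i++)
    printf ("%2d cores: %.2fx\n", cores[i], amdahl (s, cores[i]));
  /* prints 1.00x, 1.60x, 2.29x, 2.91x, 3.37x, 3.66x: most of the
     achievable benefit has already arrived by four cores */
  return 0;
}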

Cyber War

I recently read Richard Clarke’s Cyber War. Although I didn’t learn anything new on the technical side, that isn’t the focus of the book. Clarke’s main agenda is to build awareness of the uniquely vulnerable position that the United States finds itself in, and to propose national policies that might lead to a more favorable position for the USA as well as a more stable situation for everyone. Although I know next to nothing about Clarke, over the course of the book I came to admire his blunt opinions and the broad perspective he has developed as a long-time Washington insider. The book is a couple of years old and therefore misses out on recent developments such as Stuxnet. Even so, I’m not aware of a better high-level introduction to the policy issues, and it’s worth reading as a way to understand some of the broader implications of computer (in)security.