Latency Numbers Every Professor Should Know

### Latency numbers every professor should know
    Email from student ............................ 20 sec
    Person at office door  ......................... 8 min
    Other interruption ............................ 20 min
    Twitter or something seems really important ... 45 min
    Anxiety about deadlines ........................ 1 hr
    A meeting ...................................... 2 hrs
    A meeting you forgot about ..................... 1 day
    A class to teach ............................... 2 days
    Request to review a paper ...................... 3 days
    Request to write evaluation letter ............. 6 days
    Stuff to grade ................................. 1 wk
    Unsolicited time management advice arrives ..... 2 wks
    Fire alarm clears building ..................... 3 wks
    Travel to conference ........................... 5 wks
    Paper deadline ................................. 6 wks
    Grades due .................................... 16 wks
    Grant proposals due ........................... 26 wks
    Summer ......................................... 1 yr
    Sabbatical ..................................... 7 yrs = 2.208e+17 ns

With apologies to the folks who published latency numbers every programmer should know.

Sabbatical at TrustInSoft

At the beginning of September I started at TrustInSoft, a Paris-based startup where I’ll be working for the next 10 months. I’ll post later about what I’m doing here; for now, a bit about the company. TrustInSoft was founded by Pascal Cuoq, Fabrice Derepas, and Benjamin Monate: computer science researchers who (along with others) created the Frama-C static analyzer for C code while working at CEA, the French Atomic Energy Commission. They spun off a company whose business is guaranteeing the absence of undefined behavior bugs in C code, often deeply embedded software that needs to just work. Of course this is a mission that I believe in, and I already knew Pascal quite well and had worked with him before.

The logistics of moving my family overseas for a year turned out to be more painful than I’d anticipated, but since we arrived it has been great. I’m super happy to be without a car, for example. Paris is amazing due to the density of bakeries and other shops, and a street right outside our apartment has a big open-air market three times a week. I’ve long enjoyed spending time in France and it has been fun to start brushing the dust off of my bad-but-sometimes-passable French, which was actually pretty good when I lived in Morocco in the 1980s. One of the funny things I’ve noticed is that even now, I have a much easier time understanding someone from North Africa than I do a random French person.

We spent August getting settled and being tourists, a few pics follow.

I know this bridge.

Hemingway talks about kids sailing these little boats in the Jardin du Luxembourg in the 1920s; I wonder how long it has been going on.

Thinnest building in Paris?

Dinner at one of the Ottolenghi restaurants in London with friends Edye and Alastair. I’ve been dreaming about eating at one of these for years!

Bletchley Park was super great to visit.

Small World and scotch is a good combination.

Back in Paris, one of the coolest art installations I’ve seen out at the Parc de la Villette.

I’m sort of obsessed with these spray-painted spiders that are all over the city.

Inexpensive CPU Monster

Rather than using the commercial cloud, my group tends to run day-to-day jobs on a tiny cluster of machines in my office and then to use Emulab when a serious amount of compute power is required. Recently I upgraded some nodes and thought I’d share the specs for the new machines on the off chance this will save others some time:

    processor       Intel Core i7-5820K                                     $380
    CPU cooler      Cooler Master Hyper 212 EVO                              $35
    mobo            MSI X99S SLI Plus                                       $180
    RAM             CORSAIR Vengeance LPX 16GB (4 x 4GB) DDR4 SDRAM 2400    $195
    SSD             SAMSUNG 850 Pro Series MZ-7KE256BW                      $140
    video           PNY Commercial Series VCG84DMS1D3SXPB-CG GeForce 8400    $49
    case            Corsair Carbide Series 200R                              $60
    power supply    Antec BP550 Plus 550W                                    $60

These machines are well-balanced for the kind of work I do; obviously YMMV. The total cost is about $1100. These have been working very well with Ubuntu 14.04. They do a full build of LLVM in about 18 minutes, as opposed to 28 minutes for my previously-fastest machines, which were based on the i7-4770. I’d be interested to hear where other research groups get their compute power — everything in the cloud? A mix of local and cloud resources? This is an area where I always feel like there’s plenty of room for improvement.

Inward vs. Outward Facing Research

One of the things I like to think about while watching research talks is whether the work faces inward or outward. Inward facing research is mostly concerned with itself. A paper that uses most of its length to prove a theorem would be an example, as would a paper about a new operating system that is mainly about the optimizations that permit the system to perform well. Outward facing research is less self-aware; it is more about how the piece of work fits into the world. For example, our mathematical paper could be made to face outwards by putting the proof into an appendix and instead discussing uses of the new result, or how it relates to previous work. The OS paper could demonstrate how users and applications will benefit from the new abstractions. Computer science tends to produce a mix of outward and inward facing research.

Next let’s turn to the question of whether a given paper or presentation should be inward or outward facing. This is subjective and contextual, so we’ll work through some examples. First, the mathematical paper. If the proof is the central result and it gives us new insights into the problem, then of course all is as it should be. Similarly, if the operating system’s use case is obvious but the optimizations are not, and if performance is the critical concern, then again no problem. On the other hand, researchers have a tendency to face inward even when this is not justified. This is natural: we know more about our research’s internal workings than anyone else, we find it fascinating (or else we wouldn’t be doing it), we invent some new terminology and notation that we like and want to show off, etc. — in short, we get caught up in the internal issues that we spend most of our time thinking about. It becomes easy to lose track of which of these issues other people need to know about and which ones should have stayed in our research notebooks.

Let’s say that we’re working on a new kind of higher-order term conflict analysis (just making this up, no offense to that community if it exists). One way to structure a paper about it would be to discuss the twists and turns we took while doing the work, to perform a detailed comparison of the five variants of the conflict analysis algorithm that we created, and to provide a proof that the analysis is sound. Alternatively, if the running time of the analysis isn’t actually that important, we could instead use some of that space demonstrating that a first-order analysis is wholly unsuitable for solving modern problems stemming from the big data revolution. Or, it might so happen that the analysis’s soundness is not the main concern, in which case we can put that space to better use.

I hope it is becoming clear that while some work is naturally inward facing and some outward facing, as researchers we can make choices about which direction our work faces. The point of this piece is that we should always at least consider making our work more outward facing. The cost would be that some of our inner research monologue never sees the light of day. The benefit is that perhaps we learn more about the world outside of our own work, helping others to understand its importance and helping ourselves choose more interesting and important problems to work on.

Reviewing Research Papers Efficiently

The conference system that we use in computer science guarantees that several times a year, each of us will need to review a lot of papers, sometimes more than 20, in a fairly short amount of time. In order to focus reviewing energy where it matters most, it helps to review efficiently. Here are some ideas on how to do that.

Significant efficiency can come from recognizing papers that are not deserving of a full review. A paper might fall into this category if it is:

  • way too long
  • obviously outside the scope of the conference
  • significantly incomplete, such as an experimental paper that lacks results
  • a duplicate of a paper submitted or published by the same or different authors
  • an aged resubmission of a many-times-rejected paper that, for example, has not been updated to reference any work done in the last 15 years

These papers can be marked as “reject” and the review then contains a brief, friendly explanation of the problem. If there is controversy about the paper it will be discussed, but the most common outcome is for each reviewer to independently reach the same conclusion, causing the paper to be dropped from consideration early. Certain program committee members actively bid on these papers in order to minimize their amount of reviewing work.

Every paper that passes the quick smoke test has to be read in its entirety. Or perhaps not… I usually skip the abstract of a paper while reviewing it (you would read the abstract when deciding whether or not to read the paper — but here that decision has already been made). Rather, I start out by reading the conclusion. This is helpful for a couple of reasons. First, the conclusion generally lacks the motivational part of the paper which can be superfluous when one is closely familiar with the research area. Second — and there’s no nice way to say this — I’ve found that authors are more truthful when writing conclusions than they are when writing introductions. Perhaps the problem is that the introduction is often written early on, in the hopeful phase of a research project. The conclusion, on the other hand, is generally written during the grim final days — or hours — of paper preparation when the machines have wound down to an idle and the graphs are all plotted. Also, I appreciate the tone of a conclusion, which usually includes some text like: “it has been shown that 41% of hoovulators can be subsumed by frambulators.” This gives us something specific to look for while reading the rest of the paper: evidence supporting that claim. In contrast, the introduction probably spends about a page waxing eloquent on the number of lives that are put at risk every day by the ad hoc and perhaps unsound nature of the hoovulator.

Alas, other than the abstract trick, there aren’t really any good shortcuts during the “reading the paper” phase of reviewing a paper. The next place to save time is on writing the review. The first way to do this is to keep good notes while reading, either in ink on the paper or in a text file. Generally, each such comment will turn into a sentence or two in the final review. Therefore, once you finish reading the paper, your main jobs are (1) to make up your mind about the recommendation and (2) to massage the notes into a legible and useful form. The second way to save time is to decide what kind of review you are writing. If the paper is strong then your review is a persuasive essay with the goal of getting the rest of the committee to accept it. In this case it is also useful to give detailed comments on the presentation: which graphs need to be tweaked, which sentences are awkward, etc. If the paper needs to be rejected, then the purpose of the review is to convince the committee of this and also to help the authors understand where they took a wrong turn. In this case, detailed feedback about the presentation is probably not that useful. Alternatively, many papers at top conferences seem to be a bit borderline, and in this case the job of the reviewer is to provide as much actionable advice as possible to the authors about how to improve the work — this will be useful regardless of whether the paper is accepted or rejected.

I hope it is clear that I am not trying to help reviewers spend less total time reviewing. Rather, by adopting efficient reviewing practices, we can spend our time where it matters most. My observation is that the amount of time that computer scientists spend writing paper reviews varies tremendously. Some people spend almost no time at all whereas others produce reviews that resemble novellas. The amazing people who produce these reviews should embarrass all of us into doing a better job.

Update: Also see Shriram’s excellent notes about reviewing papers.

A Guide to Better Scripty Code for Academics

[Suresh suggested that I write a piece about unit testing for scripty academic software, but the focus changed somewhat while I was writing it.]

Several kinds of software are produced at universities. At one extreme we have systems like Racket and ACL2 and HotCRP that are higher quality than most commercial software. Also see the ACM Software System Award winners (though not all of them came from academia). I wrote an earlier post about how hard it is to produce that kind of code.

This piece is about a different kind of code: the scripty stuff that supports research projects by running experiments, computing statistics, drawing graphs, and that sort of thing. Here are some common characteristics of this kind of code:

  • It is often written in several different programming languages; for example R for statistics, Matplotlib for pretty pictures, Perl for file processing, C/C++ for high performance, and plenty of shell scripts and makefiles to tie it all together. Code written in different languages may interact through the filesystem, or it may interact directly.
  • It seldom has users outside of the research group that produced it, and consequently it usually embeds assumptions about its operating environment: OS and OS version, installed packages, directory structure, GPU model, cluster machine names, etc.
  • It is not usually explicitly tested, but rather it is tested through use.

The problem is that when there aren’t any obvious errors in the output, we tend to believe that this kind of code is correct. This isn’t good, and it causes many of us to have some legitimate anxiety about publishing incorrect results. In fact, I believe that incorrect results are published frequently (though many of the errors are harmless). So what can we do? Here’s a non-orthogonal list.

Never Ignore Problems

Few things in research are worse than discovering a major error way too late and then finding out that someone else had noticed the problem months earlier but didn’t say anything. For example we’ll be tracking down an issue and will find a comment in the code like this:

  # dude why do I have to mask off the high bits or else this segfaults???

Or, worse, there’s no comment and we have to discover the offending commit the hard way — by understanding it. In any case, at this point we pull out our hair and grind our teeth because if the bug had been tracked down instead of hacked around, there would have been huge savings in terms of time, energy, and maybe face. As a result of this kind of problem, most of us have trained ourselves to be hyper-sensitive to little signs that the code is horked. But this only works if all members of the group are onboard.

Go Out of Your Way to Find Problems

Failing to ignore problems is a very low bar. We also have to actively look for bugs in the code. The problem is that because human beings don’t like being bothered with little details such as code that does not work, our computing environments tend to hide problems by default. It is not uncommon for dynamically and weakly typed programming languages to (effectively) just make up crap when you do something wrong, and of course these languages are the glue that makes everything work. To some extent this can be worked around by turning on flags such as -Wall in gcc and enabling use warnings; and use strict; in Perl. Bugs that occur when crossing layers of the system, such as calling into a different language or invoking a subprocess, can be particularly tricky. My bash scripts became a lot less buggy once I discovered the -e option. Many languages have a lint-like tool, and C/C++ have Valgrind and UBSan.

One really nice thing about scripty research code is that there’s usually no reason to recover from errors. Rather, all dials can be set to “fail early, fail fast” and then we interactively fix any problems that pop up.

The basic rule is that if your programming environment supports optional warnings and errors, turn them all on (and then maybe turn off the most annoying ones). This tends to have a gigantic payoff in terms of code quality relative to effort. Also, internal sanity checks and assertions are worth their weight in gold.
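
To make this a bit more concrete, here is a rough sketch of what “all dials set to fail early, fail fast” might look like in a small Python driver script. The commands and helper functions are invented for illustration, not taken from any particular project:

    # Hypothetical "fail early, fail fast" settings for a scripty Python driver.
    import subprocess
    import sys
    import warnings

    # Promote warnings to errors so that questionable numeric or API usage
    # cannot silently produce garbage.
    warnings.simplefilter("error")

    def run(cmd):
        # check=True makes a failing subprocess raise an exception immediately
        # instead of letting a bad exit status slip past unnoticed.
        return subprocess.run(cmd, check=True, capture_output=True, text=True)

    def mean(xs):
        # Internal sanity check: an empty input is a bug in the caller, not a
        # case to paper over with a default value.
        assert len(xs) > 0, "mean() called on an empty list"
        return sum(xs) / len(xs)

    if __name__ == "__main__":
        # Stand-in for running a benchmark and collecting its numeric output.
        result = run([sys.executable, "-c", "print(42.0)"])
        values = [float(line) for line in result.stdout.split()]
        print("mean of benchmark output:", mean(values))

The same idea applies at the shell level (set -e, set -u, set -o pipefail) and in Perl (use strict; use warnings;): every layer should be configured to stop loudly rather than to quietly keep going.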

Fight Confirmation Bias

When doing science, we formulate and test hypotheses. Although we are supposed to be objective, objectivity is difficult, and there’s even a term for this. According to Wikipedia:

Confirmation bias is the tendency of people to favor information that confirms their beliefs or hypotheses.

Why is this such a serious problem? For one thing, academia attracts very smart people who are accustomed to being correct. Academia also attracts people who prefer to work in an environment where bad ideas do not lead to negative economic consequences, if you see what I mean. Also, our careers depend on having good ideas that get good results. So we need our ideas to be good ones — the incentives point to confirmation bias.

How can we fight confirmation bias? Well, most of us who have been working in the field for more than a few years can easily bring to mind a few examples where we felt like fools after making a basic mistake. This is helpful in maintaining a sense of humility and mild research paranoia. Another useful technique is to assume that the people doing previous work were intelligent, reasonable people: if implementing their ideas does not give good results, then maybe we should figure out what we did wrong. In contrast, it is easy to get into the mindset that the previous work wasn’t very good. Evidence of this kind of thinking can be seen in the dismissive related work sections that one often sees.

Write Unit Tests

Modern programming languages come with good unit testing frameworks and I’ve noticed that the better students tend to instinctively write unit tests when they can. In contrast, us old fogies grew up as programmers long before the current testing culture developed and we have a harder time getting ourselves to do this.

But does unit testing even make sense for scripty code? In many cases it clearly doesn’t. On the other hand, Suresh gives the example where they are comparing various versions of an algorithm; in such a situation we might be able to run various data sets through all versions of the algorithm and make sure their results are consistent with each other. In other situations we’re forced to re-implement a statistical test or some other piece of fairly standard code; these can often be unit tested using easy cases. Mathematical functions often have properties that support straightforward smoke tests. For example, a function that computes the mean or median of a list should compute the same value when fed the same list twice.
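
As a sketch of what such tests might look like, here is a tiny file using Python’s built-in unittest module. The mean_v1 and mean_v2 functions are hypothetical stand-ins for two versions of an algorithm, and I’m reading “fed the same list twice” as duplicating the list’s elements:

    # Minimal sketch of unit tests for scripty statistics code, using Python's
    # built-in unittest; the function names are hypothetical stand-ins.
    import statistics
    import unittest

    def mean_v1(xs):
        return sum(xs) / len(xs)

    def mean_v2(xs):
        # "Second version of the algorithm" stand-in: defer to the library.
        return statistics.mean(xs)

    class ScriptySmokeTests(unittest.TestCase):
        def test_versions_agree(self):
            # Run the same data through both versions and check consistency.
            data = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]
            self.assertAlmostEqual(mean_v1(data), mean_v2(data))

        def test_duplicated_list(self):
            # Feeding the list twice (duplicating its elements) should not
            # change the mean or the median.
            data = [3.0, 1.0, 4.0, 1.0, 5.0]
            self.assertAlmostEqual(mean_v1(data), mean_v1(data + data))
            self.assertAlmostEqual(statistics.median(data),
                                   statistics.median(data + data))

    if __name__ == "__main__":
        unittest.main()

Even trivial tests like these pay off because they get re-run every time the scripts change, which is exactly when scripty code tends to break.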

Write Random Testers

It is often the case that an API that can be unit tested can also be fuzzed. Two things are required: a test-case generator and an oracle. The test-case generator can do something easy like randomly shuffling or subsetting existing data sets or it can make up new data sets from scratch. The oracle decides whether the code being tested is behaving correctly. Oracles can be weak (looking for crashes) or strong (looking for correct behavior). Many modern programming languages have a QuickCheck-like tool which can make it easier to create a fuzzer. This blog post and this one talk about random testing (as do plenty of others, this being one of my favorite subjects).
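
Here is a rough sketch of that generator-plus-oracle structure in plain Python; the data sets and invariants are invented for illustration, and a QuickCheck-like library such as Hypothesis would let you express the same thing more declaratively:

    # Rough sketch of a random tester: a generator that invents, subsets, and
    # shuffles data sets, plus a strong oracle checking a known invariant.
    import random
    import statistics

    def generate_case(rng):
        # Test-case generator: make up a data set, then take a random nonempty
        # subset and shuffle it.
        base = [rng.uniform(-1e6, 1e6) for _ in range(rng.randint(1, 200))]
        subset = rng.sample(base, rng.randint(1, len(base)))
        rng.shuffle(subset)
        return subset

    def oracle(xs):
        # Strong oracle: the median must be permutation-invariant and must lie
        # between the minimum and maximum of the data.
        m = statistics.median(xs)
        shuffled = xs[:]
        random.Random(0).shuffle(shuffled)
        assert statistics.median(shuffled) == m
        assert min(xs) <= m <= max(xs)

    if __name__ == "__main__":
        rng = random.Random(12345)  # fixed seed so failures are reproducible
        for _ in range(10000):
            oracle(generate_case(rng))
        print("10000 random cases passed")

Note the fixed seed: when the oracle does trip an assertion, you want to be able to reproduce the failing case.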

Clean Up and Document After the Deadline

As the deadline gets closer, the code gets crappier, including the 12 special cases that are necessary to produce those weird graphs that reviewer 2 wants. Cleaning this up and also documenting how the graphs for the paper were produced is surely one of the best investments we could make with our time.

Better Tooling

Let’s take it as a given that we’re doing code reviews, using modern revision control, unit testing frameworks, static and dynamic analysis tools, etc. What other tool improvements do we want to see? Phil Guo’s thesis has several examples showing how research programming could be improved by tool support. There’s a lot of potential for additional good work here.

Summary

There are plenty of easy ways to make scripty research code better. The important thing is that the people who are building the code — usually students — are actually doing this stuff and that they are receiving proper encouragement and supervision from their advisors.

Hints for Computer System Design

On the last day of my advanced OS course this spring we discussed one of my all-time favorite computer science papers: Butler Lampson’s Hints for Computer System Design. Why is it so good?

  • It’s hard-won advice. Designing systems is not easy — a person can spend a lifetime learning to do it well — and any head start we can get is useful.

  • The advice is broadly applicable and basically correct: there aren’t really any places where I need to tell students “Yeah… ignore Section 3.5.”

  • There are many, many examples illustrating the hints. Some of them require a bit of historical knowledge (the paper is 30 years old) but by and large the examples stand the test of time.

  • It seems to me that a lot of Lampson’s hints were routinely ignored by the developers of the large code bases that we use today. I think the reason is pretty obvious: the massive increases in throughput and storage capacity over the last 30 years have permitted a great deal of sloppy code to be created. It’s nice to read a collection of clear thinking about how things could be done better.

Something that bums me out is that it’s now impossible to publish a paper like this at a top conference such as SOSP.

Research Advice from Alan Adler

Although I am a happy French press user, I enjoyed reading an article about Alan Adler and the AeroPress that showed up recently on Hacker News. In particular, I love Adler’s advice to inventors:

  1. Learn all you can about the science behind your invention.
  2. Scrupulously study the existing state of your idea by looking at current products and patents.
  3. Be willing to try things even if you aren’t too confident they’ll work. Sometimes you’ll get lucky.
  4. Try to be objective about the value of your invention. People get carried away with the thrill of inventing and waste good money pursuing something that doesn’t work any better than what’s already out there.
  5. You don’t need a patent in order to sell an invention. A patent is not a business license; it’s permission to be the sole maker of a product (and even this is limited to 20 years).

Now notice that (disregarding the last suggestion) we can simply replace “invention” with “research project” and Adler’s suggestions become a great set of principles for doing research. I think #4 is particularly important: lacking the feedback that people in the private sector get from product sales (or not), us academics are particularly susceptible to falling in love with pretty ideas that don’t improve anything.

Reproducibility in Computer Systems Research

These results about reproducibility in CS have been the subject of lively discussion at Facebook and G+ lately. The question is, for 613 papers, can the associated software be located, compiled, and run? In contrast with something I often worry about — producing good software — the bar here is low, since even a system that runs might be without value and may not even support the results in the paper.

It is great that these researchers are doing this work. Probably everyone who has worked in applied CS research for any length of time has wanted to compare their results against previous research and been thwarted by crappy or unavailable software; certainly this has happened to me many times. The most entertaining part of the paper is Section 4.3, called “So, What Were Their Excuses? (Or, The Dog Ate My Program)”. The recommendations in Section 5 are also good, I think all CS grad students and faculty should read over this material. Anecdote 2 (at the very end of the paper) is also great.

On the other hand, it’s not too hard to quibble with the methods here. For example, one of my papers was placed in the irreproducible category for this reason:

could not install: style, flex, GNU indent, LLVM/clang 3.2


It’s not clear why installing these dependencies was difficult, but whatever. In a Facebook thread a number of other people reported learning that they have been doing irreproducible research for similar reasons. The G+ thread also contains some strong opinions about, for example, this build failure.

Another thing we might take issue with is the fact that “reproducible research” usually means something entirely different than successfully compiling some code. Usually, we would say that research is reproducible if an independent team of researchers can repeat the experiment using information from the paper and end up with similar (or at least non-conflicting) results. For a CS paper, you would use the description in the paper to implement the system yourself, run the experiments, and then see if the results agree with those reported.

Although it’s not too hard to put some code on the web, producing code that will compile in 10 or 20 years is not an easy problem. A stable platform is needed. The most obvious one is x86, meaning that we should distribute VM images with our papers, which seems clunky and heavyweight but perhaps it is the right solution. Another reasonable choice is the Linux system call API, meaning that we should distribute statically linked executables or equivalent. Yet another choice (for research areas where this works) might be a particular version of Matlab or some similar language.

UPDATE: You can help watch the watchmen here.

What I Accomplished in Grad School

I often talk to students who are thinking about grad school. The advice I generally give is a dressed-up version of “Just do whatever the hell will make you happy.” But if we all had solid ideas about what would make us happy then, well, we’d probably be a lot happier. Here’s a list of things that I actually accomplished in grad school. Most of these things did make me happy or at least were satisfying. Of course, I cannot know the extent to which these things would make other people happy, and I also cannot know whether I would have been happier with the things that I’d have accomplished if I hadn’t gone to grad school. Since I got a PhD 13 years ago and started the program 18.5 years ago (crap!) I have at least a modest amount of perspective at this point.

First, some work-related things.

  • I became pretty good at doing and evaluating research.

  • I started to become good at writing. When I arrived at grad school I was not a good writer. When I left, I was not good either, but at least I was on the way. Since 2001, every time I write something, I have been thankful that it’s not a PhD thesis.

  • I wrote a few pretty decent papers. None of them set the world afire, but none of them has been a source of embarrassment either.

  • I did some internships in industry and, along the way, learned a bit about how the real world works, if such a thing can be said to exist.

But really, the things in grad school that weren’t about work were better:

  • I read a lot of books, often several per week. I’m afraid that I’m going to have to get the kids out of the house and also retire if I want to reach that level again.

  • I found someone to spend the rest of my life with. This was the purest luck.

  • I made a number of friends who I am still close to, though we don’t talk nearly often enough. I doubt that I’ll ever have another group of friends as good as these.

  • I became quite good at disc golf.

  • I did a decent amount of programming for fun.

  • I avoided going into debt. In fact, the TA and RA stipends that I received in grad school felt like a lot of money compared to the ~$7000/year that I lived on as an undergrad.

There are a bunch of things that are important that I did not accomplish in grad school:

  • I failed to learn even rudimentary time management.

  • I did not develop good eating, drinking, sleeping, or exercise habits. When I graduated I was under the impression that my body could tolerate almost any sort of abuse.

  • I didn’t learn to choose good research topics; that took several more years.

  • I didn’t figure out what I wanted to do with my life.

I put this out there on the off chance that it might be useful for people who are thinking about grad school.