Squeezing a CS Research Idea

[Disclaimer: I wrote about this topic once before, but it was buried in a longer piece. Also, it was in one of the first posts on this blog and I don’t think anyone read it. Update: I checked the stats. Nobody read it.]

A hammer I like to use when reviewing papers and PhD proposals is one that (lacking a good name) I call the “squeeze technique” and it applies to research that optimizes something. To squeeze an idea you ask:

  1. How much of the benefit can be attained without the new idea?
  2. If the new idea succeeds wildly, how much benefit can be attained?
  3. How large is the gap between these two?

Though this is simple, people surprisingly often fail to work it out for themselves. Let’s go through an example and then get back to the details.

In 2004 Markus Mock wrote a clever paper which squeezes the idea of programmer-specified aliasing as supported by the C99 language. The “restrict” qualifier added in that version of the language serves as a hint from developers to the compiler; it promises that the qualified pointer is not aliased by other pointers, permitting better code generation in some situations.

Mock’s idea was to instrument a collection of benchmarks to record actual pointer behavior and then use this information to add the maximum possible number of “restrict” qualifiers. In other words, he used dynamic information to optimistically add a restrict qualifier in every location where this was permissible for the inputs used during benchmarking. He then re-ran the modified benchmarks and found less than 1% average speedup.

This result means that for the given compilers and benchmarks, no conceivable human intervention or static pointer analysis can achieve more than 1% average speedup through the addition of restrict qualifiers. The conclusion was that since “restrict” is dangerous in practice (misplaced qualifiers will cause incorrect code to be generated) programmer-specified aliasing is not generally a good idea.
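To make the mechanism concrete, here is a minimal C sketch (my illustration, not one of Mock’s benchmarks) of the kind of code generation win that “restrict” enables:

```c
/* Hypothetical illustration: with "restrict" the compiler may assume
   that dst and src never alias, so *src can be loaded into a register
   once and reused, instead of being reloaded after every store to
   dst[i] (which, absent the qualifier, might have modified *src). */
void scale(int *restrict dst, const int *restrict src, int n) {
  for (int i = 0; i < n; i++)
    dst[i] = *src * 2;
}
```

If a caller breaks the promise, for example by calling scale(p, p, n), the compiler is entitled to silently generate code that computes the wrong answer. That is exactly the danger Mock’s result weighs against the less-than-1% payoff.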

Squeezing a research idea has three elements. The first — left implicit in the list above — is to select a metric. In general this should be externally meaningful: something that people other than computer scientists would care about, perhaps because it has economic consequences. For example speedup, throughput, MTTF, or energy consumption might make a good metric (when coupled with good workloads, a good test platform, good experiment design, etc.). In contrast, the number of “restrict” qualifiers, the average size of an alias set, and the number of context switches are probably poor top-level metrics because they enjoy (at best) an indirect relationship with something a person on the street might be worried about.

The second important choice is the baseline: the amount of benefit achieved without the current idea. This can be tricky and there are many ways of going wrong (probably baseline selection deserves its own blog post). A pernicious mistake is to choose a straw man: an obviously primitive or defective baseline. For example, maybe I’ve created a highly parallelizable algorithm for coalescing hoovulators. Then, I choose the single-processor version of my algorithm as the baseline and go on to show that I can achieve near-linear scalability up to 512 processors. This is a great result unless I am ignoring Stickleback’s 1983 algorithm that is non-parallelizable but performs the same job 750 times faster than my single-processor version, on the same hardware.

The third important element of the squeeze technique is determining an upper bound on improvement. In many cases — as Mock showed — a little creativity is all that is required. Other examples:

  • Amdahl’s Law bounds parallel speedup
  • The power consumption of the disk, display, and memory often bound the increase in mobile device lifetime that can be attained by reducing processor power usage
  • Time spent copying data can bound throughput improvements in networking
  • Time spent traveling at 0.3c can bound latency improvements in networking
  • CPU, memory, or I/O saturation can bound improvements in throughput due to better resource scheduling
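The first bound in the list above is also the easiest to compute. A small C sketch (the function name is mine):

```c
/* Amdahl's Law: if a fraction p of the runtime parallelizes perfectly
   across n processors and the remaining (1 - p) is serial, overall
   speedup is bounded by 1 / ((1 - p) + p / n), which approaches
   1 / (1 - p) as n grows without bound. */
double amdahl_speedup(double p, double n) {
  return 1.0 / ((1.0 - p) + p / n);
}
```

Even a program that is 90% parallelizable tops out below a 10x speedup no matter how many processors are thrown at it.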

In summary, no matter what we are trying to optimize, there is probably a component that we can improve and a component that we cannot. Understanding the relative magnitudes of these components is a powerful tool for helping to select a research problem to work on. I find myself extremely reluctant to sign off on a document like a PhD proposal that does not display evidence of this sort of thinking.

Iterating Over The Full Range

Every couple of months I need to write code that runs for every value of, for example, a char. This is usually to exhaustively test something, so the code should be fast, which means I write it in C. At first I always want to write:

for (c = CHAR_MIN; c <= CHAR_MAX; c++) {
  use (c);
}
This is very wrong when c has type “char”: the condition c <= CHAR_MAX can never be false, so the loop never terminates. Declaring c as “int” is one solution, and here’s another:

do {
  use (c);
} while (c++ != CHAR_MAX);

There is no signed overflow because the char is promoted to int before being incremented. However, neither solution can iterate over all values of an int. It would be possible to cast to unsigned before incrementing (ugly) or use a larger type than int (not supported by all compilers, and not fast on some platforms). Thus I usually end up with something like this:

i = INT_MIN;
while (1) {
  use (i);
  if (i == INT_MAX) break;
  i++;
}
But this is super clunky. Anyone have a better idiom for this case?
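For reference, the cast-to-unsigned option mentioned above confines the ugliness to one line. A sketch (use and the visited counter are mine, and I show it for signed char so it runs in a blink; the same shape works for int with INT_MIN, INT_MAX, and unsigned):

```c
#include <limits.h>

static long visited;  /* demonstration payload: count the values seen */
static void use(signed char c) { (void)c; visited++; }

/* Visit every value of signed char. The increment is done in unsigned
   arithmetic, where wraparound is well defined; the conversion back to
   a signed type is implementation-defined for out-of-range values, but
   does what you'd expect on every compiler I know of. */
void iterate_all(void) {
  signed char c = SCHAR_MIN;
  while (1) {
    use(c);
    if (c == SCHAR_MAX)
      break;
    c = (signed char)((unsigned char)c + 1u);
  }
}
```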

Downtown SLC Without a Car

People who visit Salt Lake City for a meeting or something often end up staying downtown, and often don’t bother renting a car. The other day I was mailing a friend a list of things to do in this situation. Here it is, cleaned up a bit:

  • Sam Weller’s Bookstore is on Main St between 200 S and 300 S; it’s a great store (be sure to check out all three levels) and has a good coffee shop with free wireless
  • Ken Sanders bookstore at 268 South on 200 East is another of my favorites
  • The taco stand on State St just north of 800 S (in the Sears parking lot) is awesome; $4 gets you four carne asada tacos plus a coke; catty-corner from here is the Epic Brewery which makes some good products (not a brewpub — just fridges full of bottles)
  • City Creek Canyon is a very short walk from downtown; the paved road is a good place to walk; the total length of the canyon is 14.5 miles and there are plenty of side trails, so there’s a lot to do here; the upper part of this canyon is seldom visited and quite wild; even-numbered days are most pleasant for walking the road since bikes are not allowed
  • The main Salt Lake library is a neat building, it’s on 400 S between 200 E and 300 E; if you take the elevators to the top floor, the big sloping ramp that descends back to ground level has nice views of the city and mountains
  • Salt Lake Roasting Company on 400 S between 300 E and 400 E (just east of the library) has great coffee and pastries
  • The light rail system provides easy access to some other parts of the city
  • Lots of restaurants can be found along 300 South between Main St and 300 W
  • Some good bars are scattered throughout; the best beer selection is found at Bayou and Beerhive (but the food isn’t that great at either place)

This list is heavily biased towards what I would do given a free day in SLC. Hopefully it’s useful.

Size Matters

Not long ago I served on an NSF panel: a collection of researchers who look at some grant proposals and provide recommendations to NSF program officers who then make decisions about which, if any, of the proposals to fund. After the panel finished, the program manager asked us for feedback about various issues including the question: How large should NSF awards be? To put the question in context, in the programs that fund my kind of research, NSF offers “small” awards of up to $500,000 over up to three years, “medium” awards of up to $1.2M over up to four years, and “large” awards of up to $3M over up to five years. Few large awards are given and the bar for getting these is very high. Most grant proposals in a given size category ask for close to the maximum amount of money. For a given amount of grant money, should NSF (and other agencies) give more small awards or fewer large ones? I’ll present various arguments and give my opinion at the end.

  • Most grant money in computer science in the US pays PhD students around $20k per year to be “research assistants” working not only towards their degrees, but also towards larger research goals (however, in terms of grant money, the cost can be upwards of $50k per year due to university overhead and other factors). A problem with short-duration grants is that they cannot support PhD students for their entire lifetime in the system, requiring multiple grants to be pieced together.
  • Small grants cause significant administrative overhead at the NSF (running panels, negotiating budgets, etc.). Large grants reduce NSF overheads but push this burden out into the universities who now have to deal with more complicated accounting and reporting.
  • In Europe a lot of the money has moved towards very large grants. From what I hear, this necessitates a lot of jockeying for position, forming coalitions with the correct geographical distribution, etc. Nobody from Europe who I’ve talked to loves this or thinks it is very effective.
  • It can be difficult for small institutions to compete for large grants.
  • Small grants (unless accompanied by higher hit rates) force researchers to spend more time writing proposals, reducing the amount of real work they get done.
  • Large grants favor the status quo: they most often go to groups who have previously received large grants. Often, these groups belong to research centers and institutes that have evolved very sophisticated machinery for acquiring large grants and could not survive without them.
  • Some work is almost impossible to do using small grants. For example, expensive equipment or large software implementation efforts may be required to make a project succeed.
  • Large grants may engender collaboration, which can be beneficial. On the other hand, collaboration has overhead. Furthermore, some fraction of large grants support what I call “false collaboration” which happens when researchers get together and acquire a large grant, but then the individuals just go off and work on whatever they want, separately.

Clearly some sort of balance is needed. My guess is that most funding, maybe 80%, should go towards a large number of small awards, with the rest supporting carefully-chosen larger efforts.

Externally Relevant Open Problems in Computer Science

Most academic fields have some externally relevant problems: problems whose solutions are interesting or useful to people who are totally ignorant of, and uninterested in, the field itself. For example, even if I don’t want to know anything about virology, I would still find a cure for the common cold to be an excellent thing. Even if I failed calculus I will find it interesting when physicists invent a room-temperature superconductor or figure out why the universe exists.

This piece is about computer science’s externally relevant open problems. I have several reasons for exploring this topic. First, it’s part of my ongoing effort to figure out which problems matter most, so I can work on them. Second, I review a lot of proposals and papers and often have a strong gut reaction that a particular piece of work is either useful or useless. An idea I wanted to explore is that a piece of research is useless precisely when it has no transitive bearing on any externally relevant open problem.

A piece of work has “transitive bearing” on a problem if it may plausibly make the problem easier to solve. Thus, an esoteric bit of theoretical physics may be totally irrelevant or it may be the key to room temperature superconductivity. Of course, nobody thinks they’re doing irrelevant work. Nevertheless, it’s hard to escape the conclusion that a lot of CS (and other) research is in fact irrelevant. My third motivation for writing this piece is that I think everyone should do this sort of analysis on their own, rather than just believing the accepted wisdom about which research topics are worth working on. The analysis is important because the accepted wisdom is — at best — a lagging indicator.

Below is my list. I’ll be interested to see what people think I’ve missed (but note that I’m deliberately leaving out problems like P vs. NP which I don’t think are externally relevant).

Giving More People Meaningful Control Over Computation

There are plenty of people — scientists, analysts, managers, etc. — who are not programmers but who would benefit from gaining better control over computers. I’m using the phrase “meaningful control over computation” as a bit of a hedge because it’s clear that most of these people don’t have 2-5 years to spare in which to become competent programmers. The goal is to give people the power that they need to solve their problems without miring them in the Turing tarpit. A good example is the class of spreadsheet programming languages which expose a useful kind of computation without most of the problems traditionally associated with programming. Overall, this problem is maybe 1/3 programming language design, 1/3 user interface design, and 1/3 domain specific.

Trustworthy Automation

Can we create computer systems that do what we want them to do? This problem encompasses both security and reliability. It’s a good problem to work on because solutions not only have short-term economic benefit but in the long run they directly support getting humans off the rock, which as I’ve said before is something we need to be working very hard on.

The whole problem of specifying what we want systems to do is enormously hard, and in fact we generally have no precise ideas about this. Even highly mathematical objects like compilers are most commonly specified in human language, if at all. Moreover, the programming languages that we use to specify computations contain elements such as floating point computations, fixed-width integers, and excessive use of side effects — all of which seem designed to impede reasoning.
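The floating point problem is easy to demonstrate. This tiny C example (mine, not drawn from any particular system) shows that floating point addition is not even associative, so algebraic reasoning that is valid over the reals is invalid over doubles:

```c
/* At a magnitude of 1e16, adjacent doubles are 2.0 apart, so adding
   1.0 to 1e16 rounds back down to 1e16 (twice!), while adding 2.0
   lands on the next representable value. Reordering the sum changes
   the answer. */
double left_to_right(void) { return (1e16 + 1.0) + 1.0; }
double right_to_left(void) { return 1e16 + (1.0 + 1.0); }
```

A compiler that "optimizes" one form into the other changes the program's result, which is one reason these language features impede both human and mechanical reasoning.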

Intelligent Systems

Can computers be created that interface gracefully with humans? How can augmented cognition be used to sidestep limitations of humans and computers? Which sub-problems of AI are achievable and useful? Here I mean “intelligent” literally, not in the sense it is usually used, where it means “not as obviously bad as the previous thing.” Of course, the goal of AI may turn out to conflict with “trustworthy automation” but we’ll have to cross that bridge when we come to it.

Observing, Simulating, Modeling, and Predicting Everything

The universe produces a gigantic amount of data at scales from atoms to galaxies. Luckily, the theoretical bounds on the amount of computation that can be done using the available energy are very generous. The ugly little space heaters with which we currently compute have not even started to scratch the surface of what’s possible.

This one is pretty vague, but I’m not sure right now how to improve it.

Further Reading

Vernor Vinge and Hans Moravec have created some of the best unconstrained thinking about computers. Asimov was pretty good in his day, but Sladek nailed it. The transhumanist material on the web seems pretty bad, as is Kurzweil’s stuff. Dertouzos’s Unfinished Revolution was not inspiring. There must be more (good and bad) writing on this subject, please make suggestions.

Update from evening of 5/10: This list of external problems is of course subjective. I’d be very interested to see other people’s blog posts describing what they think CS has to offer the non-CS world.

What I Want From a Bibliography System

As a professor I spend a fair amount of time wrangling with references. Because it’s free and reasonably simple, I use BibTeX: an add-on tool for LaTeX that automates the construction of a bibliography by pulling references out of a separate text file, assigning them numbers (or other identifiers), and formatting the entries appropriately.

BibTeX entries are widely available on the web. In principle this is great, but in practice the quality of downloadable BibTeX entries is poor: they contain so many typos, omissions, misclassifications, and other problems that I no longer download them, but rather create a new entry from scratch, perhaps using the existing entry as a rough specification. Even BibTeX entries from the ACM’s Digital Library — a for-pay service provided by the premier professional society for computer science — are not directly usable; basically every one of them requires editing. Over time I’ve come more and more to blame BibTeX for these problems, rather than the people creating the entries. Here are the design points I’d like to see in a bibliography system.

Style neutrality: A bibliography entry should be a collection of factual information about a document. The “bibliography style” — a collection of formatting guidelines specific to a particular journal or similar — should be a separate concern. BibTeX does support bibliography styles but the implementation is incomplete, forcing entries to be tweaked when the style is changed.

Minimal redundancy: Any large BibTeX file contains a substantial amount of redundancy, because BibTeX poorly supports the kind of abstraction that would permit, for example, the common parts of two papers appearing at the same conference to be factored out. Duplication is bad because it consumes time and also makes it basically impossible to avoid inconsistencies.

Standalone entries: I should be able to download a bibliography entry from the net and use it right away. This goal is met by BibTeX but only at the expense of massive redundancy. Meeting both goals at the same time may be tricky. One solution would be to follow an entry’s dependency chain when exporting it, in order to rip a minimal, self-sufficient collection of information out of the database. This seems awkward. A better answer is probably to rely on conventions: if there is a globally recognized name for each conference and journal, a self-sufficient entry can simply refer to it.

Plain-text entries: Putting the bibliography in XML or some random binary format isn’t acceptable; the text-only database format is one of the good things about BibTeX.

Network friendliness: It should be easy to grab bibliography entries from well-known sources on the network, such as arXiv. Assuming that the problem of these entries sucking can be solved, I should not even need a local bibliography file at all.

User friendliness: BibTeX suffers from weak error checking (solved somewhat by add-on tools like bibclean) and formatting difficulties. For example, BibTeX’s propensity for taking capitalized letters from the bib entry and making them lowercase in the output causes numerous bibliographies I read to contain proper nouns that are not capitalized. Messing with BibTeX styles is quite painful.

BibTeX is a decent tool, it just hasn’t aged or scaled well. CrossTeX and BibLaTeX sound nice and solve some of the problems I’ve mentioned here. However, since I have about 37,000 lines of BibTeX sitting around, I’m not interested in migrating to a new system until a clear winner emerges.

Publishing for Impact

Choices made in the process of doing research can lead to a 5-10x difference in number of publications for exactly the same underlying work. For example, do I publish my idea in a workshop, then a conference, and then a journal? Or do I just do the conference version? Do I take the trouble to write up a paper about an MS thesis where the student has somehow managed to escape without giving me a paper? (The common tactic for accomplishing this is to get a great job offer with a short fuse, and then utterly disappear.) Do I incrementalize my work into a collection of neat little papers each based on a single idea, or do I glom it all together into monster papers that contain a half-dozen or more research contributions?

Today’s question is:

If my goal is simply to maximize my own research impact over my career, should I aim for the low end (1-3 papers per year) or the high end (ten times more) of this spectrum?

I’ve puzzled through this and have concluded that as long as we keep the goal simple — maximize impact — there are few arguments for publishing more.

The first problem with publishing more is that writing a paper requires significant effort over and above that of doing the research in the first place. I usually estimate the cost of writing a paper at one month, but it could be considerably more or less depending on a number of factors. Since teaching uses up about three months of my effort each year, grant proposals cost 1-3 months, and service and administration another three months, the loss of a month of research time is a serious matter. In a nutshell: since my time is finite, writing more papers means I get less research done. Over a 40-year career, the loss of productivity due to writing 5-10 extra papers per year would be huge.

The other big problem with publishing more is that my target audience consists of extremely busy people. Having impact means that people who matter need to read and understand the material that I produce. If I produce 20 papers per year, how many will be read carefully? Probably not that many. My own reading time is highly constrained, and therefore I have become strongly prejudiced against reading papers by authors who seem like paper machines, because all too often any individual paper is weak and/or it contains a lot of similar thinking to something I’ve already read by that author. On the other hand, there are other researchers who produce less material where I’ll drop everything to read their latest piece of work because it reliably teaches me something new. In a nutshell: producing watered-down content creates a disincentive for people to read my work.

The thing that surprised me, when I sat down and thought through these issues, was how one-sided the arguments against prolific publication seem to be. Of course I’m not unaware of the arguments in the other direction. First, a larger number of papers impresses people who evaluate researchers primarily by counting papers. Second, publishing more benefits students by giving them practice writing and by building up their vitae. Third, prolific publishing in a wide variety of venues probably exposes a larger number of people to one’s work — though I’m not sure this factor is very important given the social, searchable web as it exists today.