Random Testing Gets No Respect


A funny thing about random testing is that it gets little respect from most computer scientists. Here’s why:

  • Random testing is simple, and peer reviewers hate simple solutions.
  • Random testing does not lend itself to formalization and proofs, which peer reviewers love.
  • Random testing provides no guarantees. Peer reviewers love guarantees, regardless of whether they are actually useful.
  • Random testing is hard to do well, but the difficulty is more art than science: it requires intuition, good taste, and domain knowledge. These — like many other important things — can’t be neatly packaged up in a journal paper.
  • People dislike the probabilistic aspect, and would prefer a fixed test suite.
  • Most researchers don’t produce robust software, and therefore have trouble understanding why anyone would need testing techniques that can help produce robust software by uncovering difficult corner-case bugs.

Of course, random testing has plenty of well-known problems, but I often get the feeling that people are responding to prejudices rather than facts. Even so, the picture is far from bleak and there are plenty of good researchers who exploit or at least appreciate random testing. My favorite recent example is found in this paper by Fox and Myreen, who are working on a formal model of the ARM instruction set architecture in the HOL4 theorem prover. This kind of work is the most pedantic, hard-nosed possible kind of formal methods, and yet they validate their model (and find bugs in it) using humble randomized differential testing — against real ARM boards, no less.
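For readers who haven't seen it, randomized differential testing is simple to sketch: feed the same random inputs to two supposedly equivalent implementations and flag any disagreement. A minimal illustration, assuming two 32-bit adders as illustrative stand-ins (these are not the paper's ARM model or the HOL4 setup):

```python
import random

# Two supposedly equivalent implementations of 32-bit wraparound
# addition -- hypothetical stand-ins for a formal model and real hardware.
def reference_add(a, b):
    return (a + b) & 0xFFFFFFFF

def fast_add(a, b):
    return (a + b) % (1 << 32)

def differential_test(trials=10000, seed=0):
    """Run both implementations on the same random inputs and
    return the first input on which they disagree, if any."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.getrandbits(32)
        b = rng.getrandbits(32)
        if reference_add(a, b) != fast_add(a, b):
            return (a, b)  # counterexample: the two disagree here
    return None  # no disagreement observed in this run

print(differential_test())  # None -- these two really do agree
```

The appeal is that neither implementation needs to be trusted: each serves as an oracle for the other, which is exactly what makes the technique workable against real hardware.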


17 responses to “Random Testing Gets No Respect”

  1. Of course, over in theory, random testing gets plenty of respect, with formal proofs and probabilistic guarantees and all that. Different fields, different prejudices?

  2. Amen!

    We have a paper in the pipeline which uses random testing for some topological properties of isosurfacing algorithms, and we had a really rough time convincing people how damned useful randomized property testing is. “But where’s the theorem?” Nevermind that it caught bugs that had been lurking in published code, from algorithms which had been proven correct 🙂

  3. Thanks for the story Carlos! Each of my random testing papers has had about one reviewer who made a serious attempt to kill the paper.

  4. Chung-chieh, maybe you can send me a few pointers to good papers?

    I’m pretty sure that plain old random testing can be bootstrapped into useful guarantees. I just don’t know how to do it yet :).

  5. But wait a second. I remember you complaining to me a while back, insisting that randomized methods can never catch all the bugs in a program.

  6. Nope, I’m saying in many cases reviewers have a generalized prejudice against random testing. Usually there isn’t any way to catch all bugs.

  7. I think what I really like to get from a paper is some insight – but, unless the author does a good job, a paper that uses random testing, genetic algorithms, brute force, etc. can be very effective at solving a problem but give little or no insight. And even if there is insight, it may be so narrow that I don’t care. I think it takes a lot of work to write a paper on random testing that makes people care.

    Which is not to say that random testing is a bad technique. To help us experiment with a big experimental extension to the ARM instruction set, we’ve been generating assemblers, disassemblers, LLVM instruction selection, simulators, etc., and we’ve caught lots of bugs and inconsistencies between the tools using random testing. And, of course, there’s lots of random testing (and directed testing) of ARM processors.

  8. So basically a good random testing paper is one where we learn something, and where that something is (at least somewhat) generally applicable. Sounds right!

  9. I think Alastair’s point is a good one and applies generally as well. Papers should yield insight as well as solutions. There’s a famous note by Bill Thurston (I think) to this effect about proofs in mathematics as well.

  10. Thanks for the references Chung-chieh and Ranjit!

    Ranjit, in this post I’m reacting to a lot of conversations I’ve had with intelligent computer scientists who simply do not accept that random testing can or should be part of the quality assurance package for software systems. I’ve heard the same weak objections over and over, but certainly this attitude is not universal. QuickCheck is an excellent piece of work and we need more tools like it.

  11. I’d feel more comfortable with random testing if the random test cases produced were always automatically added to a growing regression suite. Thus random tests from the past could be repeated with the current version of the software. Otherwise the possibility for Heisenbugs (and the reverse, Heisenpasses) is just too great.
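The last commenter's suggestion can be sketched concretely: persist every failing random case and replay the stored cases before generating fresh ones, so a bug caught once can never silently vanish from the suite. A minimal sketch, assuming a JSON file as the stored suite and a deliberately buggy sort as the system under test (all names here are hypothetical):

```python
import json
import os
import random

SUITE_FILE = "regression_cases.json"  # hypothetical storage location

def buggy_dedup_sort(xs):
    # Deliberately buggy system under test: sorting drops duplicates.
    return sorted(set(xs))

def passes(xs):
    return buggy_dedup_sort(xs) == sorted(xs)

def load_suite():
    if os.path.exists(SUITE_FILE):
        with open(SUITE_FILE) as f:
            return json.load(f)
    return []

def record_failure(case, suite):
    suite.append(case)
    with open(SUITE_FILE, "w") as f:
        json.dump(suite, f)

def run_tests(trials=1000, seed=1):
    suite = load_suite()
    # Replay previously failing random cases first, so an old bug
    # cannot reappear (or "Heisen-pass") unnoticed.
    still_failing = [c for c in suite if not passes(c)]
    # Then generate fresh random cases, recording any new failure
    # in the growing regression suite.
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(0, 5) for _ in range(rng.randint(0, 8))]
        if not passes(xs):
            record_failure(xs, suite)
            return xs
    return None

new_failure = run_tests()
```

Small inputs drawn from a narrow range make duplicates likely, so the bug surfaces quickly; recording the seed alongside each case would be a natural extension for exact reproducibility.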