Random Testing Gets No Respect

A funny thing about random testing is that it gets little respect from most computer scientists. Here’s why:

Random testing is simple, and peer reviewers hate simple solutions.
Random testing does not lend itself to formalization and proofs, which peer reviewers love.
Random testing provides no guarantees. Peer reviewers love guarantees, regardless of whether they are actually useful.
Random testing is hard to do well, but the difficulty is more art than science: it requires intuition, good taste, and domain knowledge. These — like many other important things — can’t be neatly packaged up in a journal paper.
People dislike the probabilistic aspect, and would prefer a fixed test suite.
Most researchers don’t produce robust software, and therefore have trouble understanding why anyone would need testing techniques that can help produce robust software by uncovering difficult corner-case bugs.

Of course, random testing has plenty of well-known problems, but I often get the feeling that people are responding to prejudices rather than facts. Even so, the picture is far from bleak and there are plenty of good researchers who exploit or at least appreciate random testing. My favorite recent example is found in this paper by Fox and Myreen, who are working on a formal model of the ARM instruction set architecture in the HOL4 theorem prover. This kind of work is the most pedantic, hard-nosed possible kind of formal methods, and yet they validate their model (and find bugs in it) using humble randomized differential testing — against real ARM boards, no less.
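
To make "randomized differential testing" concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not Fox and Myreen's setup: the three popcount functions and the `differential_test` helper are hypothetical stand-ins for "a model" and "the real thing", chosen only because they fit in a few lines. The idea is simply to feed the same random inputs to two implementations and report the first disagreement.

```python
import random


def reference_popcount(x: int) -> int:
    """Slow but obviously correct reference: count '1' digits in binary."""
    return bin(x).count("1")


def fast_popcount(x: int) -> int:
    """Implementation under test: repeatedly clear the lowest set bit."""
    count = 0
    while x:
        x &= x - 1  # clears the lowest set bit
        count += 1
    return count


def buggy_popcount(x: int) -> int:
    """Deliberately broken variant: silently ignores the upper 16 bits."""
    return bin(x & 0xFFFF).count("1")


def differential_test(impl, reference, trials=10_000, seed=0):
    """Return the first random 32-bit input where impl and reference
    disagree, or None if no disagreement is found in `trials` attempts."""
    rng = random.Random(seed)  # fixed seed so any failure can be replayed
    for _ in range(trials):
        x = rng.getrandbits(32)
        if impl(x) != reference(x):
            return x
    return None


if __name__ == "__main__":
    print("fast vs. reference:", differential_test(fast_popcount, reference_popcount))
    print("buggy vs. reference:", differential_test(buggy_popcount, reference_popcount))
```

A result of `None` only means that no disagreement turned up in the inputs tried; it is not a proof of equivalence. Seeding the generator is one simple way to make a failing run repeatable.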

February 12, 2011

regehr

Computer Science, Software Correctness

17 responses to “Random Testing Gets No Respect”

Chung-chieh Shan says:

February 12, 2011 at 2:51 pm

Of course, over in theory, random testing gets plenty of respect, with formal proofs and probabilistic guarantees and all that. Different fields, different prejudices?
Carlos Scheidegger says:

February 12, 2011 at 2:52 pm

Amen!

We have a paper in the pipeline which uses random testing for some topological properties of isosurfacing algorithms, and we had a really rough time convincing people how damned useful randomized property testing is. “But where’s the theorem?” Never mind that it caught bugs that had been lurking in published code, from algorithms which had been proven correct 🙂
regehr says:

February 12, 2011 at 3:47 pm

Thanks for the story Carlos! Each of my random testing papers has had about one reviewer who made a serious attempt to kill the paper.
regehr says:

February 12, 2011 at 3:49 pm

Chung-chieh, maybe you can send me a few pointers to good papers?

I’m pretty sure that plain old random testing can be bootstrapped into useful guarantees. I just don’t know how to do it yet :).
Suresh says:

February 12, 2011 at 10:16 pm

But wait a second. I remember you constantly complaining at me a while back insisting that randomized methods can never catch all the bugs in a program.
regehr says:

February 12, 2011 at 10:41 pm

Suresh, that’s true, but I don’t see what it has to do with the post?
Suresh says:

February 13, 2011 at 3:03 am

Then I’m confused. Isn’t that what you’re saying reviewers complain about?
regehr says:

February 13, 2011 at 7:46 am

Nope, I’m saying in many cases reviewers have a generalized prejudice against random testing. Usually there isn’t any way to catch all bugs.
Alastair Reid says:

February 13, 2011 at 1:59 pm

I think what I really like to get from a paper is some insight – but, unless the author does a good job, a paper that uses random testing, genetic algorithms, brute force, etc. can be very effective at solving a problem but give little or no insight. And even if there is insight, it may be so narrow that I don’t care. I think it takes a lot of work to write a paper on random testing that makes people care.

Which is not to say that random testing is a bad technique. To help us experiment with an experimental (big) extension to the ARM instruction set, we’ve been generating assemblers, disassemblers, LLVM instruction selection, simulators, etc. and we’ve caught lots of bugs and inconsistencies between the tools using random testing. And, of course, there’s lots of random testing (and directed testing) of ARM processors.
regehr says:

February 13, 2011 at 3:23 pm

So basically a good random testing paper is one where we learn something, and where that something is (at least somewhat) generally applicable. Sounds right!
Suresh says:

February 14, 2011 at 12:09 am

I think Alastair’s point is a good one and applies generally as well. Papers should yield insight as well as solutions. There’s a famous note by Bill Thurston (I think) to this effect about proofs in mathematics as well.
Chung-chieh Shan says:

February 14, 2011 at 5:38 pm

I did a Google search for “focs OR stoc random testing” and was reminded of a nice explanation of a cool and potentially useful result http://rjlipton.wordpress.com/2010/06/03/an-amplification-trick-and-stoc-2010/ followed by a bunch of other references that looked relevant.
Ranjit says:

February 14, 2011 at 5:38 pm

Hi John,

I am rather surprised that you think Random Testing gets no respect.
QuickCheck, a really clever randomized testing technique (or perhaps
the word is methodology), won the “most influential paper” award for
ICFP 2000.

http://www.sigplan.org/award-icfp.htm

There are also papers on how one can, with a few random tests,
prove properties about programs:

http://www.springerlink.com/content/118266422372l6g2/

Best,

Ranjit.
regehr says:

February 14, 2011 at 10:49 pm

Thanks for the references Chung-chieh and Ranjit!

Ranjit, in this post I’m reacting to a lot of conversations I’ve had with intelligent computer scientists who simply do not accept that random testing can or should be part of the quality assurance package for software systems. I’ve heard the same weak objections over and over, but certainly this attitude is not universal. QuickCheck is an excellent piece of work and we need more tools like it.
Ranjit says:

February 15, 2011 at 6:52 pm

Hi John, I bet those computer scientists don’t write a lot of code 🙂 Ranjit.
Ben L. Titzer says:

February 16, 2011 at 1:07 pm

I’d feel more comfortable with random testing if the random test cases produced were always automatically added to a growing regression suite. Thus random tests from the past could be repeated with the current version of the software. Otherwise the possibility for Heisenbugs (and the reverse, Heisenpasses) is just too great.