Software Testing Using Function/Inverse Pairs

One of the hardest problems in software testing is finding convenient, strong oracles—programs that can tell us whether the software under test worked correctly. Weak oracles, such as checking to make sure a program doesn’t crash, are always available, but not always of much use. An example of a strong oracle would be a reference implementation.

Sometimes software comes in function/inverse pairs, meaning that we get a program implementing some function f and another that implements f^-1. When this happens we can (ideally) make up an input x and then check to make sure that f^-1(f(x))=x. Often this works only in one direction. For example, let’s consider an assembler/disassembler pair. If I take some arbitrary assembly language, assemble it, and then disassemble the resulting object code, I’m unlikely to get exactly what I started with. On the other hand, if I take some arbitrary object code, disassemble it, and then assemble the resulting assembly code, I should get back the same object code. The question is whether or not f or f^-1 is prepared to accept inputs in non-canonical forms. This problem can usually be solved by running a second cycle through the system. For example, Jesse Ruderman of jsfunfuzz fame reports taking JavaScript code in non-canonical form, compiling and decompiling it, and then doing the same thing again. If the second and third versions of the program differ, a bug has been found. I don’t know how many of the impressive 1500+ bugs found by jsfunfuzz come from this method.

Here’s a list of common function/inverse pairs that I came up with:

pickle/unpickle, or save/load, of in-memory data
checkpoint/restore of process or virtual machine state
assemble/disassemble and compile/decompile
encode/decode of data for transmission
encrypt/decrypt
compress/decompress, either lossless or lossy

The thing I would like to learn from you readers is what other function/inverse pairs you can think of. I feel like I must be missing many.

This topic was suggested by Alastair Reid who additionally suggested that an assembler/disassembler pair for an architecture such as ARM could be exhaustively tested in this fashion since there are only 2³² possible instruction words. Very cool!

May 17, 2012

regehr

Computer Science, Software Correctness

12 responses to “Software Testing Using Function/Inverse Pairs”

ja says:

May 17, 2012 at 7:59 pm

Parse/pretty print
Bill Harris says:

May 17, 2012 at 10:07 pm

Suppose that you’re testing a static analysis tool. For some abstract domains, you might be able to test that the concretization function inverts the abstraction function.

Similarly, for some semantics, you might be able to test implementations of predicate transformers, in particular weakest precondition and strongest postcondition.
Martin Jambon says:

May 18, 2012 at 12:25 am

The following come to mind:

invert/invert: matrices and other numerical structures

send/receive: testing client-server systems

translate/translate back: natural languages (lossy!)

wrap/unwrap: packets, furniture, eBay merchandise, etc.

build/clean up: making sure that all the products of some operation are under control (make/make clean)

shuffle/sort

distribute/merge

capture/model: build 3D models from photos and such
Derek Jones says:

May 18, 2012 at 8:03 am

The sounds-like tool works by converting letter sequences to phonemes and back to letter sequences again. The phoneme to letter sequence conversion is often not unique, so there is no fixed point.

phonetisaurus does text to phoneme conversion using a Markov model and I had an email discussion with the author who automatically inverted the model to produce a phoneme to text conversion model.
Dominik says:

May 18, 2012 at 9:29 am

alloc/dealloc (or release)
create/destroy
kill/resurrect
fork/join
regehr says:

May 18, 2012 at 12:37 pm

Thanks folks, this is good stuff.
Alex says:

May 18, 2012 at 1:29 pm

Any signal processing or packet processing is likely to have an inverse.

These are bit tricky because forward error correction has a tendency to hide bugs!
Alastair Reid says:

May 21, 2012 at 1:11 am

There’s a general mathematical framework for this called a (Galois) reflection which describes two languages S,T with:
– An ordering on equivalent terms such that a<= b if a is more canonical than b
– Transformations f from S to T and g from T to S such that f(g(t)) = t and g(f(s)) <= s

I've used this in papers on visual tracking which involves functions from images to models (i.e., take a camera image and locate a target image within it) and from models to images (i.e., scale, rotate and translate the target image to generate a prediction of what the camera image should look like). We mostly used the reflection model for guidance when designing new trackers because it turns out that the moment you put a video camera or a robot in the loop, you get more fuzz testing than you can deal with!

The best description I've seen for reflections is by Phil Wadler (section 3 of http://homepages.inf.ed.ac.uk/wadler/topics/call-by-need.html#reflection) who related it to decompilation.
Alastair Reid says:

May 21, 2012 at 1:13 am

John,

The idea of exhaustively testing all 2^32 opcodes is more than just a theoretical – it is part of overnight testing of our assembler/disassembler here at ARM.

We also exhaustively test LLVM’s MC layer against our disassembler. See the MC Hammer slides (http://llvm.org/devmtg/2012-04-12/Slides/Richard_Barton.pdf) from the European LLVM conference.
Slide 27 is especially interesting: 18% of ARM instructions were being incorrectly assembled by LLVM’s MC layer.
Magnus says:

May 21, 2012 at 9:15 pm

Financial institutions often calculate rates or volatilities of assets from the prices paid in the market for derivatives based on the assets.

E.g. price of an option = X:
X –> volatility calculator –> volatility = V
Then with any luck
volatility = V –> pricing model –> price = X
Flash Sheridan says:

May 22, 2012 at 10:28 pm

Inverses are useful, of course, for testing the math part of a tool chain, as are more general identities, though you may need a fudge factor. I found a bunch of bugs via random identity-based testing in the PalmSource ARM compiler, e.g., in our variant of the FDLIBM hypotf. I mention this briefly in my _Software: Practice And Experience_ article, though without any of the useful details. (Which were pretty straightforward, as I recallâ€”just set up the obvious identities, find an idle machine(s), and turn it loose.)
Lex Spoon says:

May 28, 2012 at 8:49 am

View update in OLAP applications

The user interface displays a value (e.g. total sales for all stores in 2011) that is calculated out of base data (sales for each individual store). The user can specify a new target value, and the system is expected to update the base data so that recomputing the formula will yield the value the user specified.