Value Loss Coverage


The ugly reality of computer arithmetic on fixed-length bit vectors is that many operations are not equivalent to the mathematical operators they superficially resemble. For example, only a tiny fraction of the possible outcomes of a 32-bit multiplication can be represented in a 32-bit destination register. Although bignum and rational packages exist — and have existed for many years — there are unsolved problems (highly unpredictable runtime and memory allocation behavior) that preclude their use in important classes of applications such as embedded systems that must execute in a predictable fashion, probably on constrained hardware.
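
As a concrete illustration (my own example, not from the original post), here is a 32-bit multiplication whose mathematical result needs about 36 bits, so the stored result silently wraps:

#include <inttypes.h>
#include <stdio.h>

int main(void) {
  uint32_t a = 200000, b = 300000;
  uint32_t p = a * b;          /* true product is 60,000,000,000 (about 36 bits) */
  printf("%" PRIu32 "\n", p);  /* prints 4165425152, the product reduced mod 2^32 */
  return 0;
}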

The canonical example of harmful value loss is the conversion of a floating point value representing horizontal velocity into a 16-bit integer onboard Ariane 5 flight 501. This value loss initiated a chain of events that resulted in the destruction of the rocket.

Modern compilers warn about potential value loss, which is nice because it makes these code sites easy to inspect. Even so, it is often extremely hard to figure out which of these sites can lead to actual value loss at runtime. For example, the conversion in the Ariane software could not lose a value on the Ariane 4, but it did when the same software module was reused on the Ariane 5. Clearly we cannot expect the compiler to understand the maximum horizontal velocity of a rocket.
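
For instance, both GCC and Clang emit such warnings under -Wconversion. A minimal illustration (my own, with a hypothetical function name): the narrowing below is flagged, but whether it actually loses a value depends on the range of inputs that reach it at runtime.

#include <stdint.h>

/* -Wconversion flags the implicit int32_t -> int16_t narrowing here;
   a value is actually lost only when v falls outside the range of int16_t */
int16_t pack_velocity(int32_t v) {
  return v;
}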

The idea behind this piece is that for purposes of test coverage, every operation that potentially results in value loss should be tested in both its lossy and non-lossy modes. Consider these two C functions:

/* value loss when x falls outside the range of char */
char foo1 (int x) {
  return x;
}

/* value loss when x exceeds SCHAR_MAX, even though both types have the same width */
signed char foo2 (unsigned char x) {
  return x;
}

foo1() has value loss when x is outside the range of values representable in a char. We can fully test it by passing it the values 0 and 128 (assuming CHAR_MAX is 127). foo2() does not narrow the bitwidth as foo1() does, but it can still lose values because the ranges representable in a signed char and an unsigned char do not fully overlap. foo2() can also be fully covered by passing it the values 0 and 128.
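
A small driver (my sketch, assuming an implementation where plain char is signed and CHAR_MAX is 127) that exercises both the non-lossy and the lossy mode of each function:

#include <stdio.h>

int main(void) {
  /* non-lossy mode: 0 is representable in both char and signed char */
  printf("foo1(0)   = %d\n", foo1(0));
  printf("foo2(0)   = %d\n", foo2(0));
  /* lossy mode: 128 is not representable, so each conversion loses the value */
  printf("foo1(128) = %d\n", foo1(128));
  printf("foo2(128) = %d\n", foo2(128));
  return 0;
}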

As a software testing goal, high value-loss coverage may only make sense when a whitebox test case generator like Klee is available. My student Peng Li implemented a patch to LLVM’s Clang frontend that takes a C program and compiles it into a form where the normal and lossy paths are distinct. These instrumented codes can be passed to Klee, which attempts to generate inputs that induce path coverage on the program. This works well for small test codes. However, we don’t yet have results from real applications showing that this extra coverage is useful for exposing bugs.
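
To make the idea concrete, here is a rough source-level sketch of what the instrumentation might look like for foo1() (hypothetical; the actual patch performs the transformation inside the Clang frontend). The two branches compute the same result, but splitting them forces a path-coverage tool like Klee to find inputs for both the lossy and the non-lossy case:

#include <limits.h>

char foo1_instrumented(int x) {
  if (x < CHAR_MIN || x > CHAR_MAX) {
    /* lossy path: the conversion cannot preserve x */
    return (char)x;
  } else {
    /* non-lossy path: the conversion preserves x */
    return (char)x;
  }
}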

3 responses to “Value Loss Coverage”

  1. Great idea! If it looks promising with larger applications, I hope you will release the code so others can benefit. That’d be a lovely public service!

    (Unfortunately, if it’s maintained as a patch to Clang, it will probably bitrot over time. That’s a shame. But I suppose that’s how research goes.)

  2. By the way, I can recommend the following papers, which explore a similar (or at least closely related) idea:

    P. Godefroid, M.Y. Levin, and D. Molnar. “Active Property Checking.” EMSOFT 2008.
    http://research.microsoft.com/en-us/um/people/pg/public_psfiles/emsoft2008.pdf

    D. Molnar, X.C. Li, and D.A. Wagner. “Dynamic Test Generation To Find Integer Bugs in x86 Binary Linux Programs.” USENIX Security 2009.
    http://www.usenix.org/events/sec09/tech/full_papers/molnar.pdf

  3. Hi SG, in fact Clang is evolving so rapidly that patches tend to rot quite quickly, but we can expect that to slow down. Anyway, our hope is that if this turns out to be useful, the patch will get sucked into mainline Clang.

    Thanks for the links; I’ve read related papers by these authors, but have not read either of those two papers.