API Fuzzing vs. File Fuzzing: A Cautionary Tale

Libraries that provide APIs should be rock solid, and so should file parsers. Although we can use fuzzing to ensure the solidity of both kinds of software, there are some big differences in how we do that.

File parser	API library
A file parser should be fully robust: it isn’t allowed to crash even if presented with a corrupted file or the wrong kind of file.	Although a library that implements an API usually does some validation of arguments, this is generally a debugging aid rather than an attempt to withstand hostility.
A file usually originates outside of a trust boundary.	There generally is not a trust boundary between a library and the program that uses the library.
A file parser can perform arbitrary checks over its inputs.	A C/C++ library providing an API cannot check some properties of its input. For example, it cannot check that a pointer refers to valid storage. Moreover, APIs are often so much in the critical path that full error checking would not be desirable even if it were possible.

For fuzzing campaigns the point is this:

The sole concern of a file fuzzer is to generate interesting files that expose bugs in the file parser or in subsequent logic.

An API fuzzer must balance two goals. Like the file fuzzer, it needs to expose bugs by using the API in interesting ways. However, at the same time, it must stay within the usage prescribed by the API — careful reading of the documentation is required, as well as a bit of finesse and good taste.

As a concrete example, let’s look at libpng, which provides both a file parser and an API. To fuzz the file parsing side, we generate evil pngish files using something like afl-fuzz and we wait for the library to do something wrong. But what about fuzzing the API? We’ll need to look at the documentation first, where we find text like this:

To write a PNG file using the simplified API:
1) Declare a 'png_image' structure on the stack and memset() it to all zero.
2) Initialize the members of the structure that describe the image, setting the 'format' member to the format of the image in memory.
3) Call the appropriate png_image_write... function with a pointer to the image to write the PNG data.

Hopefully it is immediately obvious that calling random API functions and passing random crap to them is not going to work. This will, of course, cause libpng to crash, but the crashes will not be interesting because they will not come from valid usage of the library.
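To make the contrast concrete, here is a minimal sketch of an API fuzzer that explores only legal usage. It does not touch libpng: ToyImageWriter is a hypothetical stand-in with a made-up contract (init, then exactly `height` rows of `width` bytes, then finish), and fuzz_legal_usage is an illustrative name, not part of any real tool.

```python
import random


class ToyImageWriter:
    """Hypothetical API with a documented contract (this is not libpng):
    call init() first, then write_row() exactly `height` times with
    `width` bytes each, then finish()."""

    def __init__(self):
        self.rows = None

    def init(self, width, height):
        self.width, self.height, self.rows = width, height, []

    def write_row(self, row):
        # A debugging aid only; valid callers never trip this.
        assert self.rows is not None and len(row) == self.width
        self.rows.append(bytes(row))

    def finish(self):
        assert len(self.rows) == self.height
        return b"".join(self.rows)


def fuzz_legal_usage(seed, iterations=200):
    """Vary sizes and pixel values, but always follow the call order
    that the (toy) documentation prescribes."""
    rng = random.Random(seed)
    for _ in range(iterations):
        width = rng.choice([1, 2, 3, 255, 256])
        height = rng.choice([1, 2, 7])
        writer = ToyImageWriter()
        writer.init(width, height)
        expected = b""
        for _ in range(height):
            row = bytes(rng.choice([0, 1, 127, 255]) for _ in range(width))
            writer.write_row(row)
            expected += row
        # The usage was valid, so any mismatch is a real bug in the API.
        assert writer.finish() == expected
    return iterations
```

The randomness goes into the parts the contract leaves open (dimensions, pixel data), while the call sequence is fixed by the documentation. A failure found this way cannot be dismissed as misuse.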

You might ask why anyone would care to fuzz an API. Aren’t trust boundaries the things that matter? The answer is easy: If you are looking to find or prevent exploitable vulnerabilities, you should always focus on fuzzing at trust boundaries. On the other hand, if you are more generally interested in reliable software, you should fuzz APIs as well. We might use API fuzzing to ensure that printf prints floats correctly and that the red-black tree you coded up late last night doesn’t spuriously drop elements.
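For the data structure case, one common approach is to drive the container and a trusted reference model with the same random operations and compare them after every step. The sketch below assumes a hypothetical SortedSet class standing in for the red-black tree, with Python's built-in set as the model; both the class and the function name are illustrative.

```python
import bisect
import random


class SortedSet:
    """Hypothetical container under test, standing in for the red-black tree."""

    def __init__(self):
        self._items = []

    def insert(self, x):
        i = bisect.bisect_left(self._items, x)
        if i == len(self._items) or self._items[i] != x:
            self._items.insert(i, x)

    def remove(self, x):
        i = bisect.bisect_left(self._items, x)
        if i < len(self._items) and self._items[i] == x:
            del self._items[i]

    def __contains__(self, x):
        i = bisect.bisect_left(self._items, x)
        return i < len(self._items) and self._items[i] == x

    def to_list(self):
        return list(self._items)


def fuzz_against_model(seed, steps=1000):
    """Apply the same random, legal operations to the container and to a
    built-in set, and check that they agree after every step."""
    rng = random.Random(seed)
    tree, model = SortedSet(), set()
    for _ in range(steps):
        key = rng.randrange(50)  # small key space forces repeated keys
        op = rng.choice(["insert", "remove", "lookup"])
        if op == "insert":
            tree.insert(key)
            model.add(key)
        elif op == "remove":
            tree.remove(key)
            model.discard(key)
        else:
            assert (key in tree) == (key in model)
        # A spuriously dropped element shows up here immediately.
        assert tree.to_list() == sorted(model)
    return len(model)
```

Every operation here is within the container's contract, so the fuzzer is testing correctness rather than robustness against hostile callers.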

Now let’s look at a blog post by GDS from last week about fuzzing mbed TLS. There’s something kind of interesting going on here: they are doing API fuzzing (of the mbed TLS library) but they are doing it using afl-fuzz: a file fuzzer. This works well because afl-fuzz provides the data that is “sent” between the client and server (sent is in quotes because this fuzzing effort gains speed and determinism by putting the client and server in the same process).
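The shape of such a harness can be sketched with a toy protocol. Nothing below is mbed TLS or the GDS harness: ToyServer, ToyClient, and run_one are hypothetical names, and the length-prefixed message format is an assumption chosen only to keep the example short.

```python
import struct


class ToyServer:
    """Hypothetical stand-in for one side of a protocol (not mbed TLS)."""

    def hello(self):
        payload = b"server-hello"
        # Two-byte big-endian length prefix, then the payload.
        return struct.pack(">H", len(payload)) + payload


class ToyClient:
    def handle(self, message):
        """Parse a length-prefixed message from the untrusted peer.
        Returns the payload, or None if the message is malformed."""
        if len(message) < 2:
            return None
        (length,) = struct.unpack(">H", message[:2])
        body = message[2:]
        if len(body) != length:
            return None
        return body


def run_one(fuzz_input=None):
    """One in-process exchange with no sockets involved. If fuzz_input is
    given, it replaces the bytes the server would have "sent"."""
    server, client = ToyServer(), ToyClient()
    wire = server.hello() if fuzz_input is None else fuzz_input
    return client.handle(wire)
```

In a real campaign, fuzz_input would be the contents of the file that the fuzzer hands to the harness. The harness itself still has to drive both endpoints the way a correct application would, and that is where the API contract comes back in.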

At the bottom of the post we read this:

Our fuzzing scans discovered client-side NULL pointer dereference vulnerabilities in mbed TLS which affect all maintained versions of mbed TLS. ARM fixed the issues in versions 2.1.1, 1.3.13 and PolarSSL version 1.2.16. See the release notes for details.

However, the corresponding text in the release notes is this:

Fabian Foerg of Gotham Digital Science found a possible client-side NULL pointer dereference, using the AFL Fuzzer. This dereference can only occur when misusing the API, although a fix has still been implemented.

In other words, the blog post author disagrees with the mbed TLS maintainers about whether a bug has been discovered or not. It is a matter of debate whether or not this kind of crash represents a bug to be fixed. I would tend to agree with the mbed TLS maintainers’ decision to err on the side of defensive programming here. But was there a vulnerability? That seems like a stretch.

In summary, API fuzzing is different from file fuzzing. Good API fuzzing requires exploration of every legal corner of the API and no illegal corners. Ambiguous situations will come up, necessitating judgement calls and perhaps even discussions with the providers of the API.

September 28, 2015

regehr

Computer Science, Software Correctness

8 responses to “API Fuzzing vs. File Fuzzing: A Cautionary Tale”

Manuel says:

September 28, 2015 at 2:38 pm

[Disclaimer: I’m a developer of mbed TLS, but this post only represents my technical and personal opinion, as opposed to any kind of official statement by my employer.]

As much as I agree with you about the important differences between API fuzzing and file fuzzing, I would tend to think that what GDS was doing here is file fuzzing, not API fuzzing. Indeed they are using afl-fuzz to generate invalid inputs to throw at mbed TLS’s handshake message handling functions. Those functions should be (and hopefully are) “fully robust” as per your definition in the left column of the first table, as their input comes from outside a trust boundary.

Where the API question comes into play is that users of the API are supposed to stop the handshake when the handshaking function returns an error code (except one that indicates that mbed TLS is waiting for I/O), but GDS Labs’ test application doesn’t do that. More specifically, the break statement on line 780 of selftls-2.0 [1] should not only break out of the current for loop, but also out of the larger do/while loop (lines 760-819), as should the break statement on line 806.

[1]: https://github.com/GDSSecurity/mbedtls-fuzz/blob/master/selftls-2.0.c#L780

So the NULL dereference is triggered by a combination of (1) the peer sending invalid data, and (2) the application not aborting the handshake on error as it’s supposed to. We chose to patch our code to avoid that possible NULL dereference, but applications that use the API correctly were never vulnerable to this particular issue.

Note that in the near future, we plan to kill this entire class of issues by remembering that an error happened, so that even if the application tries to continue the handshake, we will refuse to do so until the application resets the context to start a new handshake from a clean state.
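
A minimal sketch of that "remember the error" idea, using a toy context rather than the real mbed TLS API. ToyHandshake, HandshakeError, and the EXPECTED message list are all hypothetical names invented for illustration.

```python
EXPECTED = [b"hello", b"key", b"finished"]  # toy message sequence (assumption)


class HandshakeError(Exception):
    pass


class ToyHandshake:
    """Hypothetical context with a sticky error state (not the mbed TLS API)."""

    def __init__(self):
        self.reset()

    def reset(self):
        """Start a new handshake from a clean state."""
        self.step_index = 0
        self.failed = False

    def step(self, message):
        """Process one peer message; returns True once the handshake is done."""
        if self.failed:
            # Refuse to continue after an earlier error, even if the
            # application ignored it and called step() again.
            raise HandshakeError("handshake previously failed; call reset()")
        if self.step_index >= len(EXPECTED):
            raise HandshakeError("handshake already complete")
        if message != EXPECTED[self.step_index]:
            self.failed = True
            raise HandshakeError("unexpected message")
        self.step_index += 1
        return self.step_index == len(EXPECTED)
```

With the error remembered in the context, an application that keeps calling step() after a failure gets a clean error every time instead of running on half-initialized state.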
regehr says:

September 28, 2015 at 4:21 pm

Hi Manuel, thanks for commenting. I think your analysis agrees with mine. Remembering the error in the library state seems like a good idea.
Jesse Ruderman says:

September 28, 2015 at 4:28 pm

This post should be titled “Fuzzing within trust boundaries”.

Web browsers provide many examples of APIs that are exposed to malicious input. The clearest examples are the numerous DOM APIs. These are exposed directly to JavaScript except for a thin IDL layer that ensures arguments are of the correct type.

Many “internal” APIs have input heavily influenced by web content as well. The 2d graphics library, for example, ultimately gets most of its input values (and call order to some extent) from web content. It would be great if these libraries made it clear what is considered valid input and had separate test shells.

Meanwhile, browsers trust some files. A Firefox profile contains at least 5 sqlite databases and 3 NSS databases. The HTTP cache uses some kind of binary format. And then there’s a file called “startupCache” used in such a low-level way that it must be invalidated if you switch to an architecture with different word size or endianness.

It might make sense to fuzz some of these files, if only to ensure that corruption doesn’t result in Firefox being unable to launch, but I’ve been focusing on attack surface instead.
Magnus says:

September 28, 2015 at 9:06 pm

There’s a good blog post by Andrzej Krzemieński at https://akrzemi1.wordpress.com/2015/01/12/defensive-programming/ which links to an excellent talk by John Lakos about defensive programming. Slides for Lakos’s talk are at https://github.com/CppCon/CppCon2014/raw/master/Presentations/Defensive%20Programming%20Done%20Right/Defensive%20Programming%20Done%20Right%20-%20John%20Lakos%20-%20CppCon%202014.pdf

The tl;dr is that API implementations that check their input increase the scope of their contract. Debug versions can (and should) assert, but release versions should probably plead undefined behaviour.

A contract that promises to return ‘blah’ when ‘whatever’, and an error code if this pointer is null or that number is out of range, makes the job of the kind of analysis tools that interest you (John) much harder (many more valid code paths). Contracts like this are also hard to put right, because once the API is released code can be written that relies on errors being returned on bad input.
kme says:

September 28, 2015 at 9:14 pm

Another example of an API that’s at a trust boundary is an OS kernel system call API. Trinity, for example, fuzzes the Linux kernel in a nearly exhaustive fashion (only blacklisting a few things that cause the fuzzing processes to suicide too easily, like exit(2)).
regehr says:

September 29, 2015 at 2:26 am

Jesse, thanks — you are right about the title but I wanted to keep things concrete.
regehr says:

September 29, 2015 at 2:34 am

Magnus and kme, thanks!
Ben says:

September 29, 2015 at 7:24 am

This post got me thinking about the connection between API fuzzing and QuickCheck. In my experience many C/C++ programmers tend to use very imprecise types (void *) and then do lots of casting and punning and bit stealing and other nonsense (in the most affectionate sense of the word). If APIs used the type system to say more about the shape of their inputs and outputs, API fuzzers should be able to leverage that to be much more efficient.

It also reminded me of a project that I spent a very small amount of time sketching out a few years ago to use static bug finders on APIs. Instead of finding a concrete example of how a bug could arise, the idea was to take all the potential bugs at the API boundary and somehow figure out which represented bugs in the API implementation and which represented documentation of the API.