Today in my Writing Solid Code class we went through some of the 151 defects that Coverity Scan reports for OpenSSL. I can’t link to these results but take my word for it that they are a pleasure to read — the interface clearly explains each flaw and the reasoning that leads up to it, even across multiple function calls. Some of the problems were slightly alarming but we didn’t see anything that looks like the next heartbleed. Unfortunately, Coverity did not find the heartbleed bug itself. (UPDATE: See this followup post.) This is puzzling; here’s a bit of speculation about what might be going on. There are basically two ways to find heartbleed using static analysis (here I’ll assume you’re familiar with the bug; if not, this post is useful). First, a taint analysis should be able to observe that two bytes from an untrusted source find their way into the length argument of a memcpy() call. This is clearly undesirable. The Coverity documentation indicates that it taints the buffer stored to by a read() system call (unfortunately you will need to login to Coverity Scan before you can see this). So why don’t we get a defect report? One guess is that since the data buffer is behind two pointer dereferences (the access is via s->s3->rrec.data), the scanner’s alias analysis loses track of what is going on. Another guess is that two bytes of tainted data are not enough to trigger an alarm. Only someone familiar with the Coverity implementation can say for sure what is going on — the tool is highly sophisticated not just in its static analysis but also in its defect ranking system.
The other kind of static analysis that would find heartbleed is one that insists that the length argument to any memcpy() call does not exceed the size of either the source or destination buffer. Frama-C in value analysis mode is a tool that can do this. It is sound, meaning that it will not stop complaining until it can prove that certain defect classes are not present, and as such it requires far more handholding than does Coverity, which is designed to unsoundly analyze huge quantities of code. To use Frama-C, we would make sure that its own header files are included instead of the system headers. In one of those files we would find a model for memcpy():
/*@ requires \valid(((char*)dest)+(0..n - 1)); @ requires \valid_read(((char*)src)+(0..n - 1)); @ requires \separated(((char *)dest)+(0..n-1),((char *)src)+(0..n-1)); @ assigns ((char*)dest)<sup class='footnote'><a href='#fn-1125-1' id='fnref-1125-1' onclick='return fdfootnote_show(1125)'>1</a></sup> \from ((char*)src)<sup class='footnote'><a href='#fn-1125-2' id='fnref-1125-2' onclick='return fdfootnote_show(1125)'>2</a></sup>; @ assigns \result \from dest; @ ensures memcmp((char*)dest,(char*)src,n) == 0; @ ensures \result == dest; @*/ extern void *memcpy(void *restrict dest, const void *restrict src, size_t n);
The comments are consumed by Frama-C. Basically they say that src and dest are pointers to valid storage of at least the required size, that the buffers do not overlap (recall that memcpy() has undefined behavior when called with overlapping regions), that it moves data in the proper direction, that the return value is dest, and that a subsequent memcmp() of the two regions will return zero.
The Frama-C value analyzer tracks an integer variable using an interval: a representation of the smallest and largest value that the integer could contain at some program point. Upon reaching the problematic memcpy() call in t1_lib.c, the value of payload is in the interval 3. This interval comes from the n2s() macro which turns two arbitrary-valued bytes from the client into an unsigned int:
#define n2s(c,s) ((s=(((unsigned int)(c))<< 8)| \ (((unsigned int)(c)) )),c+=2)
The dest argument of this memcpy() turns out to be large enough. However, the source buffer is way too small. Frama-C would gripe about this and it would not shut up until the bug was really fixed.
How much effort would be required to push OpenSSL through Frama-C? I don’t know, but wouldn’t plan on getting this done in a week or three. Interestingly, a company spun off by Frama-C developers has recently used the tool to create a version of PolarSSL that they promise is immune to CWEs 119, 120, 121, 122, 123, 124, 125, 126, 127, 369, 415, 416, 457, 562, and 690. I think it would be reasonable for the open source security community to start thinking more seriously about what this kind of tool can do for us.
- In a comment below, Masklinn states that OpenSSL’s custom allocators would defeat the detection of the too-large argument to memcpy(). This is indeed a danger. To avoid it, as part of applying Frama-C to OpenSSL, the custom malloc/free functions would be marked as being malloc-like using the “allocates” and “frees” keywords supported by ACSL 1.8. Coverity lets you do the same thing, and so would any competent analyzer for C. Custom allocators are regrettably common in oldish C code.
- I’m interested to see what other tools have to say about heartbleed. If you have a link to results, please put it in a comment.
- I re-ran Coverity after disabling OpenSSL’s custom freelist and also hacking CRYPTO_malloc() and friends to just directly call the obvious function from the malloc family. This caused Coverity to report 173 new defects: mostly use-after-free and resource leaks. Heartbleed wasn’t in the list, however, so I stand by my guess (above) that perhaps something related to indirection caused this defect to not be ranked highly enough to be reported.
- HN has some discussion of PolarSSL and of this blog post. Also Reddit.