Memory safety errors in C/C++ code include null pointer dereferences, out-of-bounds array accesses, use after free, double free, etc. Taken together, these bugs are extremely costly since they are at the root of a substantial fraction of all security vulnerabilities. At the same time, lots of nice research has been done on memory safety solutions for C, some of which appear to not be incredibly expensive in terms of runtime overhead. For example, baggy bounds checking was shown to reduce Apache throughput by 8% and to increase SPECINT runtime by 60% — both effects are probably of a similar magnitude to compiling at -O0. Here I’m talking about strong versions of memory safety; weaker guarantees from sandboxing can be cheaper.
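For concreteness, here is a tiny, deliberately broken C fragment (illustrative only) containing each of these bug classes; a memory-safe implementation would trap every one of them instead of silently corrupting memory:

```c
#include <stdlib.h>
#include <string.h>

void memory_safety_bugs(void) {
    int *p = NULL;
    *p = 1;             /* null pointer dereference */

    int a[4];
    a[4] = 0;           /* out-of-bounds array access (valid indices are 0..3) */

    char *s = malloc(16);
    if (s) {
        strcpy(s, "hello");
        free(s);
        s[0] = 'x';     /* use after free */
        free(s);        /* double free */
    }
}
```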
Although we commonly use memory safety tools for debugging and testing, these are basically never used in production despite the fact that they would eliminate this broad class of security errors. Today’s topic is: Why not? Here are some guesses:
- The aggregate cost of all vulnerabilities due to memory unsafety is less than the cost of a 60% (or whatever) slowdown. Although this is clearly the case sometimes, I don’t buy that it’s always true. For example, wouldn’t it be nice to have Acrobat Reader be guaranteed to be free of all vulnerabilities due to memory safety violations, forever, even if it slows down a little? I would take that tradeoff.
- Memory safety solutions come with hidden costs not advertised in the papers such as toolchain headaches, instability, false positives, and higher overheads on real workloads. My guess is that these costs are real, but that people may also exaggerate them because we don’t like toolchain changes.
- People are not very good at estimating the aggregate costs that will be incurred by not-yet-discovered vulnerabilities. My guess is that this effect is also real.
Anyway, I’m interested in this kind of apparent paradox, where a technical problem seems solvable but the solutions aren’t used. There are lots of other examples, such as integer overflows. This seems related to Worse is Better.
26 responses to “Why Isn’t C Memory Safe Yet?”
A thought experiment: one day the GCC cabal decides to change the default setting from not-safe to safe, and this will ship in the next release.
What happens? How much of the GCC-verse adds the reverse switch to its build scripts? How many depart for LLVM?
If less than say 20% switch (in either sense), why is the GCC cabal not changing the default now, and what could get them to do so?
If GCC reverses course, what pressure caused them to do so?
My own best guess is that
1. less than 20% would switch,
2. the GCC cabal doesn’t want to do it because they would suffer under the slings & arrows of public outrage while personally gaining very little from the global increase in security,
3. and there would be significant pressure from GCC sponsors and programmers’ employers: companies whose hacks start to fail, who have to buy additional computing resources to pay for the safety, and who likewise gain little from the increased security.
It’s too bad that there’s no benevolent dictator of a major C compiler who might actually throw the switch and prove what would happen.
How about: you “don’t need” that kind of tool for small programs, and retrofitting a large legacy code base (not the build system/tool chain, just the code that’s being shipped) to work under it is “too much work”.
This BTW leads to an adoption path: 1) have a “differential” mode that can, with very high accuracy, mute any errors/warnings that were seen in the previous version, and 2) a heuristic for identifying which of the muted items are the most valuable to attack (either because they are the most likely to be real bugs or because they are blocking analysis of large sections of code). A sketch of such a differential filter is below.
Clearly, making this work would require careful attention to graceful degradation in the analysis package.
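A minimal sketch of that differential filter, as a standalone program rather than something built into the analysis tool (hypothetical: it matches diagnostics by exact text against a saved baseline, so changed line numbers in messages would defeat it; a real implementation would fingerprint warnings more robustly):

```c
/* new_warnings.c: print only the diagnostics (one per line on stdin)
 * that do not already appear in a saved baseline file. */
#include <stdio.h>
#include <string.h>

#define MAX_LINES 10000
#define MAX_LEN   512

static char baseline[MAX_LINES][MAX_LEN];

int main(int argc, char **argv) {
    size_t nbaseline = 0;
    char line[MAX_LEN];

    if (argc != 2) {
        fprintf(stderr, "usage: %s baseline.txt < current_warnings.txt\n", argv[0]);
        return 2;
    }

    /* Load the previous version's diagnostics. */
    FILE *f = fopen(argv[1], "r");
    if (f) {
        while (nbaseline < MAX_LINES && fgets(baseline[nbaseline], MAX_LEN, f))
            nbaseline++;
        fclose(f);
    }

    /* Mute anything already seen; emit only new diagnostics. */
    while (fgets(line, MAX_LEN, stdin)) {
        int seen = 0;
        for (size_t i = 0; i < nbaseline; i++)
            if (strcmp(line, baseline[i]) == 0) { seen = 1; break; }
        if (!seen)
            fputs(line, stdout);
    }
    return 0;
}
```

The muted baseline file is then exactly the worklist that step 2)’s heuristic would rank.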
Hi gwern, I like that analysis, but I think that most C doesn’t even want to be memory safe. For example, GCC itself would gain relatively little (even though I routinely find unsafety bugs in it). Rather, we need to focus on the code bases that have proven to be exploitable bug havens — such as certain Adobe products — but aren’t particularly performance critical.
bcs I think something like that could work, and the reason we haven’t done it yet is that people are endlessly optimistic that they’ve found the last bug!
bcs, how much effort do you think it would take to re-engineer a large code base for memory safe execution? I was assuming the effort would be low since any problem that’s not a real bug can be blamed on the safety tool.
Have you talked to the Microsoft people who work on Prefix/Prefast? When I worked there, we spent a significant amount of time addressing _every_ memory safety error reported by either tool. Often it required not fixing explicit bugs, but just switching to different APIs or control-flow structures that the tools were capable of handling. It was shockingly (to me, at the time) effective: once we rolled out the “prefix clean to ship” policy, it basically ended that sort of memory safety bug.
Except when loading third-party code. Those folks do the nastiest stuff you can imagine to your internal data structures, and there’s not much you can do to protect yourself when loading them in-process.
My context was Visual Studio; other organizations’ experiences may vary.
Hi Lars, if I were MS I’d try very hard to run all third-party code in sandboxes (especially drivers!).
Thanks for sharing the Prefix/Prefast experience, that’s really interesting. You definitely need buy-in from developers to get this all to work. I saw John Pincus give a Prefix talk in 1998 or 1999 and it was really mind-blowing.
Bounds checking overhead depends heavily on how you use arrays. If you dereference only once and use a local variable from then on, it should be cheap. If you dereference like crazy, the penalty will be severe. Bringing the overhead down is largely a matter of smarter compilers, I guess.
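As a toy illustration of that point (assuming, as in most checked-C schemes, that every pointer dereference pays a check unless the compiler can prove it redundant or hoist the load):

```c
struct point { int x, y; };

/* Dereference once, then work on a local copy: roughly one check total. */
int cheap(const struct point *p) {
    struct point tmp = *p;            /* single checked access */
    int acc = 0;
    for (int i = 0; i < 1000; i++)
        acc += tmp.x * i + tmp.y;     /* no checked memory accesses in the loop */
    return acc;
}

/* Dereference inside the loop: naively, two checks per iteration. */
int expensive(const struct point *p) {
    int acc = 0;
    for (int i = 0; i < 1000; i++)
        acc += p->x * i + p->y;       /* a smarter compiler would hoist these,
                                         which is exactly how the overhead comes down */
    return acc;
}
```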
Re re-engineering effort: I think the major cost would be from false positives. I suspect there is way more (10x or more?) “technically incorrect” code in the wild than “actual bugs”. At that rate you either quit trusting the tool at all (i.e. quit using it) or quit trusting yourself at all (and just fix everything).
OTOH, the code may work in all (currently) realizable scenarios. This gives another option of debating the point with the tool via hints and whatnot. But at that point, it may be simpler (and less brittle) to restructure the code than to coach the tool into noticing that.
Long term, blaming the tool doesn’t work. The furthest you can viably go is to tell it to shut up about things “for now”.
I live in the embedded world. Double-digit overhead is a showstopper here.
Hi Alex, I hear you about embedded, but on the other hand consider how many smart phone apps could stand a double-digit overhead — probably most of the ones that aren’t compute-heavy games.
When we ameliorate languages having fundamental and basically unfixable problems by devising ever-cleverer (but still unsound) bug-catching tools, are we not just diverting resources away from long-term solutions and prolonging our own suffering? With Industry (the blokes paying me) constantly clamouring for expediencies, we are always obliging — it pays well, and makes people happy; what more can a programmer ask for? But deep down, we know that we are doing them and ourselves a disservice. C does not need memory safety, it needs a gun.
In terms of gwern’s analysis, the thing I find especially compelling is to compare the GCC cabal throwing the memory-safety switch vs throwing the (un)signed-pointer switch. I remember when they did the latter. There was a whole lot of whinging and hand-wringing about it, and in the (very) short term many folks stuck with the old version of GCC for their legacy projects. But, in rather short order, developers cleaned up their code and started using the new GCC.
So, why was the GCC cabal willing to throw the switch on (un)signed pointer types in spite of the backlash they knew was coming?
Is the overhead of fixing memory safety really that much different than the overhead of introducing pointer signedness? I mean in terms of programmer time. There will always be complaints about performance; though, given the prevalence of Java, clearly these cannot be showstoppers for the majority of developers (though they will be for certain developers; e.g., on embedded systems).
If the overhead of fixing memory safety is really so much higher than that for introducing pointer signedness, then what sort of partial safety can we introduce which has a comparable overhead?
how many smart phone apps could stand a double-digit overhead — probably most of the ones that aren’t compute-heavy games.
Aren’t most Android apps, including games, written in Java, not C?
I don’t expect C to ever become memory safe unless it becomes something I would not feel comfortable calling “C”, and I would really hate it if people did start pretending it is “C”.
I’ve long been interested in dialects of C which attempt to make C safer in one way or another, and of course particularly if they are still applicable as systems (including embedded systems) programming languages.
My interest in fact goes all the way back to reading about the Tunis system, implemented in Concurrent Euclid as a provably correct operating system. (ISBN 0-201-10694-9)
The latest of the C dialects to catch my eye is: http://www.xmos.com/technology/xc
Unfortunately it is currently tied to one hardware target.
Cyclone is, I think, the more widely known “safe” dialect of C, but I don’t see much about it outside of some rather limited circles and it doesn’t seem to be growing well enough.
Mattias sounds like his life would be happier in the coddled worlds of Java or C# 😉
The C and C++ standards go into enough detail about what behaviour is undefined that I think compilers should have a mode where they tell you if you’re doing it. That would be great for development and testing, making secure(r) versions of critical apps, and you could even have evil companies like Adobe offering separate versions of their crapware (not that they ever would).
John I remember a post of yours which showed there could be seven or eight possible undefined behaviours in just a couple of lines of C code. So adding a “checked” mode to C and C++ would be a big job for sure, but I would think it would be pretty straightforward and not require the type of ju-ju that goes into optimising.
I would also like to see ‘assert’ made a reserved word in C/C++, meaning that it traps in a debug build, but in an optimised build it’s a hint to the compiler that the asserted condition always holds.
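Something close to the optimised-build half of this can already be approximated with compiler builtins. A sketch (GCC/Clang-specific; `ASSERT` is just an illustrative macro name, not a proposal for the standard library):

```c
#include <assert.h>

#ifdef NDEBUG
  /* Optimised build: promise the compiler that the condition holds,
     so it can optimise under that assumption. */
  #define ASSERT(cond) ((cond) ? (void)0 : __builtin_unreachable())
#else
  /* Debug build: trap if the condition is violated. */
  #define ASSERT(cond) assert(cond)
#endif

int div_positive(int a, int b) {
    ASSERT(b > 0);      /* the compiler may now drop the b == 0 path entirely */
    return a / b;
}
```

MSVC spells the same hint `__assume(cond)`.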
“The C and C++ standards go into enough detail about what behaviour is undefined that I think compilers should have a mode where they tell you if you’re doing it.”
That would defeat the purpose of undefined behavior, which is giving the compiler leeway to do what it wants *without knowing that it’s doing it*. You can write an optimization that works and not have to worry about what happens in the corner cases because “those are undefined behavior so anything goes”. Having to write the optimization in such a way that all corner cases are explicitly flagged as undefined behavior makes the existence of undefined behavior itself uncompelling. Not just that, some cases of UB are hideously expensive to detect, impossible to slipstream into optimizations. It’s easy to say “this is UB so the compiler need not worry”, far less so to say “this is UB and the compiler should detect that”. Otherwise there’d be no market for verification tools. 🙂
There’s a real case to make that undefined behavior is uncompelling in and of itself — that we’ve made enough progress in the field of optimization not to need escape hatches like this, and that it’s even counterproductive because it leads to pointless squabbles between developers and compiler writers, not to mention complex compiler bugs — but that’s another argument altogether.
C is a victim of the very things that built its success — once you’ve ransomed your soul to the devil, good luck getting it back.
Hello, John.
Just out of curiosity, what memory-safe C++ solution(s) are you thinking about? I do not follow C++ very closely. I know of plenty of memory-safe C solutions. It is an interesting question why these aren’t used more.
Hi Pascal, you’re right, most projects don’t support C++. I thought that Baggy bounds checking did (since they mention C++ several times in the paper) but I checked again and they explicitly say they don’t.
AddressSanitizer supports C++, but it is not a full memory safety solution; it is more of a best-effort bug finder, like Valgrind.
My guess is that supporting C++ is probably not a huge amount of work for LLVM-based tools.
Greg and Mattias, C was explicitly designed in such a way that conforming implementations could support memory safety. I can’t remember where I read this but K or R said it.
So I don’t think memory safety is at all a crazy dialect or a heroic fix. But it is hard to implement efficiently…
I always assumed that the concessions the C standard made for odd systems were mainly to get semi-useful implementations on Lisp Machines and other tagged architectures like Burroughs B5000 and AS/400, and DSPs.
That, and to allow for C interpreters, though strictly conforming (and safe) ones seem to be surprisingly difficult to write. The difficult features seem to be unions, untyped heap memory, and the fact that arbitrary data may be copied bytewise by punning it with arrays of char.
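For example, a strictly conforming (and safe) implementation has to accept code like this (illustrative only), where heap memory never has a declared type and object representations are shuffled around bytewise through char:

```c
#include <stdlib.h>
#include <string.h>

struct pair { short a; long b; };

void shuffle_bytes(void) {
    /* Untyped heap memory: malloc returns storage with no declared type. */
    void *raw = malloc(sizeof(struct pair));
    if (!raw) return;

    struct pair p = { 1, 2 };

    /* Copying an object bytewise through unsigned char is always allowed,
       padding bytes and all. */
    unsigned char *src = (unsigned char *)&p;
    unsigned char *dst = raw;
    for (size_t i = 0; i < sizeof p; i++)
        dst[i] = src[i];

    /* The bytes in raw now hold a valid struct pair again. */
    struct pair q;
    memcpy(&q, raw, sizeof q);

    free(raw);
}
```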
Mattias I totally agree that they did not succeed in creating a language that easily supports memory safety.
But it’s interesting that they did make some efforts in that direction, well in advance of any of the implementations.
Regarding your larger point about where to invest effort, we are stuck with a large amount of C for some number of years, so I think it’s fine to work on ways to make it work better. Of course we should also work on languages with less foot-shooting potential.
Jeroen, I’m completely in favour of C/C++’s undefined behaviour. I said I would like a mode where undefined behaviour is detected, which would be a test/debug or safe/checked build where optimisations are discarded if they aren’t compatible. Then a programmer could release an optimised build after satisfying herself that nothing undefined happens, or a web site more concerned with safety against hostile input than performance could run a safely-compiled public server.
I’m interested in undefined behaviour that is intrinsically hard to detect, even with no optimisation. What examples are there of that (not involving invalid pointers into the heap)?
Magnus, one instance of undefined behaviour is the use of uninitialised variables. Consider an automatic array that is read and written by a complex algorithm. A compiler cannot in general decide whether a read from a specific index has been preceded by a write to the same index.
That is one reason why the usual warning about uninitialised variables typically only works for scalars, at least if it is to be sound.
Another case of “difficult” undefined behaviour concerns multiple side effects between sequence points: “x = a[i]++ + a[j]++” is invalid only when i == j.
Neither of these problems has anything to do with pointers into the heap.
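Concretely (illustrative; in both cases whether the behaviour is defined depends entirely on runtime values the compiler cannot see):

```c
/* Uninitialised read: which elements of u get written depends on the
 * runtime contents of idx (assume all idx values are in 0..15), so no
 * compiler can soundly and precisely warn about the read of u[m]. */
int gather(const int *idx, int n, int m) {
    int u[16];
    for (int i = 0; i < n; i++)
        u[idx[i]] = i;
    return u[m];        /* defined only if some idx[i] == m */
}

/* Unsequenced modifications: undefined exactly when i == j. */
int bump_two(int *a, int i, int j) {
    return a[i]++ + a[j]++;
}
```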
I agree with your intuition that a 60% slowdown is often a fine tradeoff for memory safety. However, for any software component where the developers feel that way, wouldn’t they simply use Java or C#? That way, they get a language designed for memory safety, instead of one where memory safety is tacked on afterwards.
Lex, there’s a lot of C code where C used to be a good idea because machines weren’t as fast and didn’t have as much RAM, but that doesn’t need to be in C any longer. If we had resources to rewrite it all in C# or Java that would be great, but we don’t.
There’s also a lot of C code out there that will engage in “undefined” C behavior that is nevertheless meaningful at the machine level. Integer overflows are the most common example, but use after free is also fairly common. The behavior may be technically undefined, but applications will often use it if it (fairly) reliably works on a user’s platform.
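A classic instance (illustrative) is an overflow check written in terms of the wraparound the hardware actually performs:

```c
#include <limits.h>

/* Relies on two's-complement wraparound. Signed overflow is undefined in C,
 * so an optimising compiler may assume x + y never overflows and delete the
 * comparison, even though the code "works" when compiled naively. */
int add_overflows(int x, int y) {
    return y >= 0 ? (x + y < x) : (x + y > x);
}

/* A well-defined way to ask the same question. */
int add_overflows_safe(int x, int y) {
    if (y >= 0)
        return x > INT_MAX - y;
    return x < INT_MIN - y;
}
```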
My knowledge may be a bit dated, but from a security perspective I think ASLR on 64-bit plus no-execute (NX) is enough of a barrier to thwart most exploits. Some of the sandboxing techniques could provide an additional layer of security, and I would guess that the Adobe security folks are aware of at least some of the additional tools you mention. It’d be interesting to know what their adoption barriers are.
I remember reading in some of the DynamoRIO materials that Adobe products often use dynamic code generation. Static instrumentation just won’t help there.