If you were given the opportunity to spend USD 100 million over five years to maximally improve the security of open source software, what would you do? Let’s just assume that the money comes with adequate administrative staff to manage awards and contracts so you can focus on technical issues. A few ideas:
- Bug bounties, skewed towards remotely exploitable bugs and towards ubiquitous infrastructure such as OpenSSL and the Linux kernel. To get rid of USD 100 M in five years, we’ll probably need to make the bounties very large by current standards, or else give out a lot of them.
- Contracts for compatible rewrites of crufty-but-important software in safe languages.
- Contracts for aggressive cleanup and refactoring of things like OpenSSL.
- Contracts for convincing demonstrations of the security of existing codes, in cases where it seems clear that rewrites are undesirable or impractical. These demonstrations might include formal verification, high-coverage test suites, and thorough code inspections.
- Research grants and seed funding towards technologies such as unikernels, static analyzers, fuzzers, homomorphic encryption, Qubes/Bromium-kinda things, etc. (just listing some of my favorites here).
- Contracts and grants for high-performance, open-source hardware platforms.
This post is motivated by the fact that there seems to be some under-investment in security-critical open source components like Bash and OpenSSL. I’ll admit that it is possible (and worrying) that USD 100 M isn’t enough to make much of a dent in our current and upcoming problems.
24 responses to “Buying Into Open Source Security”
One thing I see missing here is a discussion of attack surface identification and threat modeling. As a preliminary step, I’d look at the stacks of widely-installed applications on the client and server ends, and all of the libraries that they pull in. Figure out what bits can or can’t be exposed to hostile data, and how broadly exposed they are. Then allocate funding to components in some proportion to the scale of vulnerability an exploit would create and how much attention that component’s security already gets.
Two things, both in the AI/machine learning sector:
1) Static/dynamic analysis at scale. Build tools that allow effective and (human-)efficient deep inspection of code, to the point where FOSS projects can require every bug found to be turned into an enforceable rule and then enforce that rule on all code bases (a minimal illustration follows this comment).
2) Develop tools able to autonomously generate existence proofs for every known better-than-brute-force attack on a cryptosystem. Then run all the tools on all the systems in use. Generalize. I.e., quit trusting the NSA.
I’ll grant that #2 would likely take a bit more than $100M (after all, it’s trying to play catch-up with the last 20+ years of the NSA’s >$1B budget; but then again, there’s no operational deployment needed).
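To make (1) concrete, here is a minimal C sketch of the kind of bug class such a rule could encode; the unchecked read() result is a hypothetical stand-in for whatever bug was actually found and turned into a rule:

```c
#include <stdio.h>
#include <unistd.h>

/* Pattern the rule would flag: the return value of read() is ignored, so a
 * short or failed read leaves buf partially uninitialized before it is
 * treated as a NUL-terminated string. */
void violates_rule(int fd) {
    char buf[128];
    read(fd, buf, sizeof(buf) - 1);     /* result discarded: rule violation */
    buf[sizeof(buf) - 1] = '\0';
    printf("got: %s\n", buf);
}

/* Pattern the rule accepts: the result is checked and used to terminate
 * the buffer in the right place. */
void satisfies_rule(int fd) {
    char buf[128];
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    if (n < 0) {
        perror("read");
        return;
    }
    buf[n] = '\0';
    printf("got: %s\n", buf);
}
```

Once a checker enforces a rule like this everywhere, every future unchecked read() gets caught at submission time instead of after it ships.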
Hm, what safe language would you suggest? Ada, Rust, Haskell, ATS, Coq, …?
You may be interested in the fact that the European Union allocated 1M€, as part of its 2015 budget, for a “pilot project” named “Governance and quality of software code – Auditing of free and open-source software”. Three tasks are explicitly listed in the budget description: studying existing auditing practices, listing free software used by European institutions, and doing an exemplar code audit, preferably of code whose flaws could disrupt services or reveal private personal data.
PS: if you look in the budget ( http://eur-lex.europa.eu/budget/data/DB2/2015/en/SEC03.pdf ), it’s Item 26 03 77 02. The text is rather interesting: it is not aimed solely at software used by the EU but at free software in general, and it explicitly refers to the Debian constitution and practices.
Fascinating question. I’d be in favor of a staged approach. In the first phase, allocate some money (20%, say) to bug bounties, somehow weighting the bounties based on the severity of the issues. You’d want to allocate enough money such that you could categorize the security bugs discovered via the bounties, and then allocate the rest of the funding toward static analyzers or rewrites that eliminate or dramatically mitigate classes of vulnerabilities. Maybe a contest approach could be used in the later phase(s) as well, with additional prizes for the most unhackable systems or most effective static analyses.
The E.U. initiative mentioned seems fairly pragmatic. I’m generally not a fan of bureaucracy, but I think improving standards alone would go a long way toward reducing some of the major obvious risks. I’ve personally known financial institutions to run software as root on hosts, exposing a ‘telnet-like’ management port; the software even executed securities transactions. Having everyone adopt MIL-SPEC probably isn’t reasonable, though. Microkernels/unikernels, capability-based security, OpenSSL/LibreSSL, etc…
Phil, I totally agree, and I suspect that prior to this fall, most such analyses would have left out bash.
Sometime shortly after shellshock came out, my smart TV installed an update and rebooted. I’ll just leave that out there…
qznc: good question! I don’t have a specific favorite safe language right now, but I do feel that the “compatible rewrite” strategy is one that should be tried out in a big way.
gasche– thanks for the reference and link! It’ll be interesting to see where that money goes.
Manu, yeah, sounds like lots of fun! Only I’m short $100M − ε.
Perhaps some money to lobby for reform in the way code is tested/taught in schools. You’re the only professor I’ve had who has drilled good code hygiene: checking return values, thinking about where data is coming from, etc. When students are removing checks because “it makes test X fail and I lose points,” we’re doing something _really_ wrong. Students need to be significantly more paranoid about the code they write, but only a few professors are actually talking about that. If we’re going to get serious about security it has to start in CS1000.
Scotty: I have taught you well, young padawan.
Taking off from Scotty’s comment: Fund university level security challenges.
E.g. build a sandbox that students can write and test code in, and then give them specs and limited sets of test cases (i.e. 100% valid but non-covering inputs). Once they pass that and think they are ready, run it through a demonic stress/fuzz/glass-box test suite and give them a score based on how it does (along with very limited info about the failure mode).
Just the work to develop a tool to derive crashing inputs for arbitrary code could be interesting and useful in its own right.
bcs, an advantage of what you suggest is we can do it for cheap.
This is a little off topic, but since someone else mentioned education…
I’m going to be teaching an intro to systems class soon (using the CMU CS:APP template). I’m interested in spending a couple days on a quick introduction to formal verification. (Motivation: “See how easy it is to make C programs that break? Wouldn’t it be great if we could prove that particular programs won’t?”) Does anyone know of good classroom materials for that project? I’m starting to work on some exercises that use Frama-C/ACSL/Jessie, but I’d be happy to hear if there are more appropriate tools.
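For a sense of what I have in mind, here is the kind of tiny ACSL-annotated function a first exercise might use (a sketch of my own, not taken from CS:APP or any existing course materials, and not yet tuned against a particular Frama-C release); the contract and loop annotations are exactly the things I’d want students to write and then watch the prover discharge:

```c
#include <stddef.h>

/*@ requires n > 0;
  @ requires \valid_read(a + (0 .. n-1));
  @ assigns \nothing;
  @ ensures \forall integer i; 0 <= i < n ==> \result >= a[i];
  @ ensures \exists integer i; 0 <= i < n && \result == a[i];
  @*/
int max_elem(const int *a, size_t n) {
    int m = a[0];
    size_t i;
    /*@ loop invariant 1 <= i <= n;
      @ loop invariant \forall integer j; 0 <= j < i ==> m >= a[j];
      @ loop invariant \exists integer j; 0 <= j < i && m == a[j];
      @ loop assigns i, m;
      @ loop variant n - i;
      @*/
    for (i = 1; i < n; i++) {
        if (a[i] > m)
            m = a[i];
    }
    return m;
}
```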
Hi Ben, I’ve taught a course using CS:APP many times and it’s great material. Frama-C can work well but I think you’ll need to carefully manage the assignment and the expectations, and also give them access to a pre-installed version.
Or, alternatively, some of the tools here are really cool and using web-based tools has obvious advantages for teaching.
http://rise4fun.com/
Any static taint analysis or dynamic tracing/auditing of the application frameworks involved would have pointed to the underlying shell as one of the moving parts of the system. Whether researchers would have written it off as irrelevant is a different question, but this time around I think people will understand that you can’t write off a component as “not exposed” just because you can’t see the path to it.
That’s a very interesting question, and one I have pondered myself for a long time.
My first public outburst about it was my FOSDEM keynote last year, where I asked how much money the NSA would have to spend to get the opposite result.
My conclusion is “very little”. An investment on the order of 10 million USD would very efficiently make sure that software stays insecure in ways that aid the NSA’s total-surveillance state. (Google “Operation Orchestra” for the full talk.)
My second outburst was an article in ACM Queue about the economics of good source code: http://queue.acm.org/detail.cfm?id=2636165 where I asked the same sort of question as you do.
My current thinking (it’s far from a conclusion) is that the best strategy is to find the right talent and buy them time to pay attention, because that really is the crux of software quality: time, talent, and attention.
For $100 million you could hire 100-150 top-quality developers for five years at competitive rates; they could certainly do wonders.
However, a lot of the problems are not about source code, but about architecture, and to really have an impact, the 100-150 developers should attack at that level, and at that scale of effort, they would have the fiat to do so.
(See for instance: http://queue.acm.org/detail.cfm?id=1944489)
To all of the technical suggestions above, I only have one comment: Tools won’t do it, unless people pay attention to them.
Look at Coverity’s average defect rates for FOSS and it becomes painfully obvious that people are not paying attention.
Poul-Henning
Thanks for the links and comments, Poul-Henning. I agree that hiring 100 top developers for 5 years would be a good start.
An excellent question, one that anyone wanting to make an impact ought to consider.
I can’t help but think about the question for proprietary software, including military software, even though the main question is about open-source software. I expect that the majority of security problems that will matter going forward are going to be in proprietary software. Open-source software tends to be better written and better factored, and it tends to get all manner of auditing. If you want to hack a web server, it’s not Tomcat where you should focus.
For proprietary software, better platforms would help. The switch from C++ to Java really helps with security because buffer overflows cause a trap. Going forward, it would help to have languages with better SQL integration and better XML/HTML processing. It would help if arithmetic overflows caused a trap. It would help to have taintedness as part of the standard type checker.
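To make the overflow point concrete: even in C, mainstream compilers already expose the needed hooks. A minimal sketch, assuming GCC or Clang (which provide __builtin_add_overflow; their -fsanitize=signed-integer-overflow option would also catch the plain ‘+’ case at run time):

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Add two ints, aborting instead of silently wrapping around.
 * __builtin_add_overflow is a GCC/Clang builtin that reports whether
 * the mathematically correct result fits in the destination. */
static int checked_add(int a, int b) {
    int result;
    if (__builtin_add_overflow(a, b, &result)) {
        fprintf(stderr, "integer overflow: %d + %d\n", a, b);
        abort();
    }
    return result;
}

int main(void) {
    printf("%d\n", checked_add(1, 2));        /* fine: prints 3 */
    printf("%d\n", checked_add(INT_MAX, 1));  /* traps instead of wrapping */
    return 0;
}
```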
It would also help proprietary software to have better techniques for inspection, both manual and (if you’ll forgive my bias) automated.
Both of these areas seem amenable to public funding. As well, both of these areas would *also* help open-source software.
Lex, I think the main answer for proprietary software is to ensure that the incentives are set up correctly. For example, if I’m under contract to provide a system for you, there should be substantial built-in penalties if I deliver you something insecure. Of course it’s going to be hard to make this work for vulnerabilities discovered years later, but if your red team finds something in the delivered system, that should be a big problem for me.
Incentives are harder to manage for open-source software, which is why I think some public investment is probably warranted.
I would also consider paying for free training and advocacy. Training can make sure that security is in the mindset of developers, and advocacy can help secure more funding.
In the real world, it seems that there are plenty of places where, e.g., running a fully automated fuzzer like AFL could help developers find bugs (a minimal harness is sketched below), but developers don’t know about it and/or don’t care. As engineers, we all tend to try to fix an issue (in this case, security) through technical means: static analyzers, bug-free software, new programming languages, etc. 4 out of 5 of your solutions were purely technical. I think real-world security is much more complicated and needs a more diverse approach.
Just my 2 cents 🙂
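To illustrate how low the barrier is with AFL: a harness is usually just a few lines that read fuzzer-generated bytes from stdin and hand them to the code under test. A minimal sketch of my own follows; parse_record is a deliberately buggy, hypothetical stand-in for whatever library entry point is actually being audited:

```c
#include <stdio.h>
#include <string.h>

/* Toy stand-in for the code under test: a fragile "parser" with a stack
 * buffer overflow whenever the length byte exceeds sizeof(payload).
 * In real use this would be a call into the library being audited. */
static void parse_record(const unsigned char *data, size_t len) {
    if (len < 1)
        return;
    unsigned char payload[16];
    size_t claimed = data[0];            /* attacker-controlled length field */
    if (claimed > len - 1)
        claimed = len - 1;
    memcpy(payload, data + 1, claimed);  /* overflows when data[0] > 16 */
}

int main(void) {
    /* AFL-style harness: read fuzzer-generated input from stdin and feed it
     * to the code under test; afl-fuzz mutates a seed corpus and keeps any
     * input that makes the program crash or hang. */
    unsigned char buf[1 << 16];
    size_t len = fread(buf, 1, sizeof(buf), stdin);
    parse_record(buf, len);
    return 0;
}
```

Build it with afl-gcc (or afl-clang-fast) and run afl-fuzz -i seeds -o findings -- ./harness; crashing inputs show up in the findings directory as reproducible test cases.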
Following Poul-Henning’s comment about *using* tools:
Build the infrastructure needed to make it trivially easy to gate submissions to the “master repository” of open source projects on clean(er) results from the tools.
That tooling should be an O(n+m) problem, not an O(n*m) problem: integration effort should scale with the number of projects plus the number of tools, not with every project wiring up every tool separately. To really get full use of it, it needs to be something that can be turned on before the code is remotely clean; this would, for example, allow GNU to adopt a blanket policy of “every submission must be at least as good as the last”.