Trust Boundaries in Software Systems

One of the big things that has changed in computer science education over the last 20 years is that it is now mandatory to prepare students for writing software that lives in a hostile environment. This content can’t be limited to a computer security course; it has to be spread throughout the curriculum. My experience, based on talking to people, looking through textbooks, and looking at lecture material on the web, is that overall we’re not yet doing a great job at this.

When teaching this subject, I’ve started using trust boundaries as an organizing principle. A trust boundary exists any time we (the system designers or system owners) trust code, data, or human actors on one side of an interface more than we trust the other side of the interface. Students need to be able to recognize, understand, fortify, and stress-test the trust boundaries in any system they have a stake in.

Trust boundaries aren’t hard to find: We just need to ask questions like “What are the consequences if this code/data became horribly malicious? Is that likely? Can we defend against it? Do we want to defend against it?” It is easy to conclude, for example, that a demonic garbage collector or OS kernel might not be something that we wish to defend against, but that we had better fortify our systems against toxic PNG files that we load from random web sites.

Some basic observations about trust boundaries:

  1. They’re everywhere, even inside code written by a single person. Anytime I put an assertion into my code, it’s a tacit acknowledgment that I don’t have complete trust that the property being asserted actually holds.
  2. The seriousness of trust boundaries varies greatly, from mild mistrust within a software library all the way to major safety issues where a power plant connects to the internet.
  3. They change over time: a lot of our security woes stem from trust boundaries becoming more serious than they had been in the past. Email was not designed for security. The NSA wasn’t ready for Snowden. Embedded control systems weren’t intended to be networked. Libraries for decoding images, movies, and other compressed file formats that were developed in the 90s were not ready for the kinds of creative exploits that they faced later on.
  4. If you fail to recognize and properly fortify an important trust boundary, it is very likely that someone else will recognize it and then exploit it.
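
To make the first observation and the PNG example concrete, here is a minimal sketch of the two kinds of checks that show up at a trust boundary. Everything in it is made up for illustration: the record format (a 4-byte big-endian length followed by a payload), the function name `parse_record`, and the limit `MAX_PAYLOAD` are assumptions, not taken from any real image library. The assert() guards the internal contract with callers I mostly trust; the explicit `if` checks guard against bytes I do not trust at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PAYLOAD 65536u  /* hypothetical application limit */

/* Parses one record from an untrusted buffer.
 * Returns 0 and sets *payload / *payload_len on success, -1 if malformed. */
static int parse_record(const uint8_t *buf, size_t buf_len,
                        const uint8_t **payload, size_t *payload_len)
{
    /* Internal contract: our own callers must pass valid pointers. */
    assert(buf != NULL && payload != NULL && payload_len != NULL);

    /* External input: never asserted, always validated. */
    if (buf_len < 4) {
        return -1;                      /* too short to hold a length field */
    }
    uint32_t declared = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                        ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    if (declared > MAX_PAYLOAD) {
        return -1;                      /* absurdly large claim */
    }
    if (declared > buf_len - 4) {
        return -1;                      /* length field lies about the data */
    }

    *payload = buf + 4;
    *payload_len = declared;
    return 0;
}
```

A caller would treat a return value of -1 as an ordinary, expected outcome (hostile or corrupt input), not as a bug in the program.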

To deal with trust boundaries, we have all the usual techniques and organizing principles: input sanitization, defense in depth, sandboxing, secure authentication, least privilege, etc. The issue that I’m trying to respond to with this post is that, in my experience, it doesn’t really work to hand students these tools without some sort of framework they can use to help figure out where and when to deploy the different defenses. I’d be interested to hear how other CS instructors are dealing with these issues.

15 responses to “Trust Boundaries in Software Systems”

  1. > Anytime I put an assertion into my code, it’s a tacit acknowledgment that I don’t have complete trust that the property being asserted actually holds.

    That is not what assert() means to me. When I write assert(), it means that I, the writer, do absolutely believe that the condition is always true and that you, the reader, ought to believe it as well. If I have doubts about the truth of the condition, I’ll write something like:

    if( NEVER(condition) ){ … remedial action… }

    The NEVER() macro is so defined as to be a pass-through for release builds, but calls abort() for debug builds if the condition is true.
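
A macro with the behavior described above might look roughly like the following. This is only a sketch under assumptions, not necessarily the commenter's actual definition: the build flag `DEBUG_BUILD` and the helper `safe_length` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* DEBUG_BUILD is a hypothetical flag: define it for debug builds only. */
#ifdef DEBUG_BUILD
/* Debug: a true condition trips assert(0), which calls abort(). */
# define NEVER(X) ((X) ? (assert(0), 1) : 0)
#else
/* Release: pass the condition through so the remedial branch still runs. */
# define NEVER(X) (X)
#endif

/* Returns the length of s, or 0 if s is unexpectedly NULL. */
static size_t safe_length(const char *s)
{
    if (NEVER(s == NULL)) {
        return 0;   /* remedial action, reachable only in release builds */
    }
    return strlen(s);
}
```

In a release build, `safe_length(NULL)` quietly returns 0; in a debug build the same call aborts so that the supposedly impossible condition gets noticed during testing.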

    An assert() is an executable comment. The fact that it is executable gives it more weight because it is self-verifying. When I see a comment like “/* param is never NULL */” I am much less likely to believe that comment than when I see “assert( param!=0 )”. My level of trust in the assert() depends on how well-tested the software is, but is almost always greater than my level of trust in the comment.

    Used in this way, assert() is a very powerful documentation mechanism that helps to keep complex software maintainable over the long term.

  2. That is not how I view the use of assert(). Basically, if the condition is not true, stop before irreparable damage occurs or the system cannot recover in any meaningful way.

    The idea of an ‘executable comment’ doesn’t make sense to me. Any code, not used for the direct purpose of the system, is just another failure point.

  3. Hi Richard, if the assertion is guaranteed to be true, I guess I don’t understand why you would bother making it executable, instead of just leaving a comment? In many cases an English comment would be a lot more readable.

    I agree more with Pat: an assertion is not only documentation but also part of a defense-in-depth strategy against things going wrong, whether it is a logic error, some sort of unrepeatable state corruption, a compiler bug, or whatever.

    Of course I agree that an assertion is never used for detecting things that can legitimately go wrong in sane executions of the program.

  4. assert is not defense in depth because it is compiled out of release builds, which will encounter malicious input. It is a way to write pre- and post-conditions or even contracts that are used to debug software to catch/pinpoint/isolate errors as soon as possible instead of letting bad data affect state somewhere down the line and then not knowing how we got the incorrect/invalid state in the first place.

  5. > Any code, not used for the direct purpose of the system, is just another failure point.

    Agreed and for this reason I use the (non-default) behavior of disabling assert() for release builds. (Mostly. In applications where assert()s are less frequent, where the application is less well-tested, and where the assert()s do not impose a performance penalty I will sometimes be lazy and leave them in releases.) You obviously cannot achieve 100% MC/DC if you have assert() enabled in your code.

    An assert() is a statement of an invariant. The more invariants you know about a block of code or subroutine, the better you are able to reason about that block or subroutine. In a large and complex system, it is impractical to keep the state of the entire system in mind at all times. You are much more productive if you break the system down into manageable pieces – pieces small enough to construct informal proofs of correctness. Assert() helps with this by constraining the internal interfaces.

    Assert() is also useful for constraining internal interfaces such that future changes do not cause subtle and difficult-to-detect errors and/or vulnerabilities.

    You can also state invariants in comments, and for clarity you probably should. But I typically do not trust comments that are not backed up by assert()s as the comments can be and often are incorrect. If I see an assert() then I know both the programmer’s intent and also that the intent was fulfilled (assuming the code is well-tested).

    I was starting to write out some examples to illustrate my use of assert(), which I previously thought to be the commonly-held view. But it seems like the comment section for EIA is not the right forum for such details. I think I need to write up a separate post on my own site. I have a note to do that, and will add a follow-up here if and when I achieve that goal.

    That some of the best practitioners in the field do not view assert() as I do is rather alarming to me. I have previously given no mind to golang since, while initially reviewing the documentation, I came across the statement that they explicitly disallow assert() in that language. “What kind of crazy, misguided nonsense is this…” I thought, and read no further, dismissing golang as unsuitable for serious development work. But if you look at assert() as a safety-rope and not a statement of eternal truth, then I can kind of see the golang developers’ point of view.

    So I have the action to try to write up my view of assert() (which I unabashedly claim to be the “correct” view :-)) complete with lots of examples, and provide a link as a follow-up.

    Surely we are all in agreement that assert() should never be used to validate an external input.

  6. > Anytime I put an assertion into my code, it’s a tacit acknowledgment that I don’t have complete trust that the property being asserted actually holds.

    I wouldn’t phrase it this way, because it looks like an invitation to write fewer assert()s in an attempt to fake the certainty of a condition.

    In my view, a condition is what’s required for the next piece of code to work properly. If it is expected to be always true, that’s exactly the right moment to write an assert(). Otherwise, this is the domain of error trapping and management.

    The assert is one of the best “self-documented pieces of code” I can think of. It can, and should, reduce the need for some code comments.
    A real code comment can come on top of that, to give context, on why this condition is supposed to be true, and why it is required, when that’s useful.
    But in many cases, it’s not even necessary.
    assert(ptr!=NULL); is often clear enough.

    assert() becomes vital as software size and age grow, with new generations of programmers contributing their share to the code base.
    An assert might be considered a “superfluous statement of truth” for a single programmer working on their own small code base. After all, it might be completely obvious that this ptr is necessarily !=NULL.

    But in a different space and time, another programmer (who might be the same one, just a few years older) will come and modify _another_ part of the code, breaking this condition inadvertently. Without an assert, this can degenerate into a monstrous debug scenario, as the condition might trigger non-trivial side effects, sometimes invisible long after being triggered.
    The assert() will catch this change of condition much sooner, and help solve the situation much quicker, helping velocity.

    Which means I am on Richard’s side on assert() usage. I would usually just keep that to myself and not add to the noise, but somehow I felt the urge to state it. I guess I believe it’s important.

  7. I agree with the defense-in-depth view of asserts. For this reason they are not compiled out in my release builds and I have a test to ensure that this stays true. Compiling them out would mean test/debug and release builds having different semantics, which I don’t find acceptable. You can imagine what I think about the semantics of Python’s assert!
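
One way to get checks that survive release builds, together with a test that they stay enabled, is sketched below. This is an assumption-laden illustration rather than the commenter's actual setup: `CHECK`, `check_handler`, `check_count`, and `checks_are_live` are hypothetical names. Unlike assert() from <assert.h>, the `CHECK` macro here does not look at NDEBUG at all.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef void (*check_handler_fn)(const char *expr, const char *file, int line);

/* Default behavior: report the failed check and abort. */
static void check_abort(const char *expr, const char *file, int line)
{
    fprintf(stderr, "%s:%d: CHECK(%s) failed\n", file, line, expr);
    abort();
}

static check_handler_fn check_handler = check_abort;

/* Always compiled in; deliberately independent of NDEBUG. */
#define CHECK(cond) \
    ((cond) ? (void)0 : check_handler(#cond, __FILE__, __LINE__))

/* Test hook: counts failures instead of aborting. */
static int check_failures = 0;
static void check_count(const char *expr, const char *file, int line)
{
    (void)expr; (void)file; (void)line;
    check_failures++;
}

/* Returns 1 if a deliberately false CHECK actually fires in this build. */
static int checks_are_live(void)
{
    assert(check_handler != NULL);      /* internal sanity check only */
    check_handler_fn saved = check_handler;
    int before = check_failures;

    check_handler = check_count;
    CHECK(0);                           /* deliberately false */
    check_handler = saved;

    return check_failures == before + 1;
}
```

Running `checks_are_live()` as part of the release test suite would flag a build configuration that silently dropped the checks.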

  8. From an industry perspective on “how other CS instructors are dealing with these issues” (of understanding when and how to deploy particular defensive measures), internal training courses I co-developed try to situate detailed coding advice in a broader security engineering context. It includes threat modeling as a general practice, with a specific set of advice for our lines of business. We also frame security efforts as part of normal engineering work, so subject to the same kinds of tradeoffs practitioners make all the time, and approachable using the same families of tools (automated testing, code review, automated deployment and rollback, usability testing, defect tracking, etc.) that are already established for regular work. Of course there are things which are more specialized, and the adversarial context is important for the way we think about security problems, but this is setting the tone.

    In structuring the material in this way, we hope to make it more generally valuable than a more rigid list of recommendations. The additional engineering context is meant to make the advice more applicable to novel situations, and to prompt people to think about security engineering as something where they have a voice – not just receiving directives.

    Specifically to defensive measures, the material includes prompts for people to think more like an attacker. This starts with the threat modeling, where people are asked to go through the systems they work with at an architectural level, figure out avenues of attack, assess the consequences of attacks, and consider various countermeasures. These sessions often bring out findings relating to trust boundaries in the way you describe. At a more advanced or specialized level we have course offerings which involve hands-on penetration testing, which is a really useful way to foster an understanding of the kinds of attacks which are possible, as well as exploring non-obvious parts of the attack surface.

    Therefore, at the code level, some of the discussion above about assertions ends up being conditioned (haha) on how the code in question is being developed and run, and what it does. There are certainly specific practices and gotchas around assert() in particular which are good to know. But people should also be able to negotiate through the context of what the assert is trying to prevent; how likely it is; how bad it would be if it happened; what combination of language features, unit tests, integration tests, smoke tests, etc., are appropriate to use; what we expect to happen in the abort case (is something logged? is there an alert? who gets the alert and what are they meant to do? what does the normal user see while this is happening?); and so on.

  9. Richard, I look forward to your writeup! I do indeed hope that assertions represent eternal truths, but I also have seen that hope dashed too many times by factors beyond my (previous) understanding.

    Yann, I suspect we largely agree (as I think Richard and I mostly do). The distinctions here are subtle ones.

    Alex, thank you for returning us to the topic that I had hoped to discuss :). The internal courses you describe sound really interesting, I would love to learn more. The teaching scenario you describe, where we strongly encourage people to think both like an attacker and a defender, is a really nice way of putting it. That is what I also try to do.

  10. I’m teaching a Security (which, this being me, means some Anderson readings + lotsa static and dynamic analysis for “good” and “evil”) class for the first time. I think an important distinction to get across is whether mistrust is due to:

    – 1. error only
    – 2. an adversary

    In the first case, probability may help you out; in the second, it does not matter, an adversary drives P(trust violated) to 1. Mistaking the relatively rare case 1 for case 2 is a concept to drive home.
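
The difference between the two cases can be sketched with a toy integrity check. The 8-bit additive checksum below catches most accidental corruption, so against error alone the odds are on our side. An adversary who knows the scheme can change any byte and compensate elsewhere, so the check passes every time. The function names `checksum8` and `forge` are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of all bytes modulo 256. */
static uint8_t checksum8(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);
    }
    return sum;
}

/* Adversary: set data[i] to new_value and adjust data[j] so that the
 * checksum of the whole buffer is unchanged. */
static void forge(uint8_t *data, size_t i, size_t j, uint8_t new_value)
{
    assert(i != j);                         /* internal precondition */
    uint8_t delta = (uint8_t)(new_value - data[i]);
    data[i] = new_value;
    data[j] = (uint8_t)(data[j] - delta);   /* cancel the change mod 256 */
}
```

The same reasoning is why a checksum designed for noisy channels is not a substitute for a cryptographic authentication code once an adversary is in the picture.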

  11. I learned about “assert” as a beginning programmer from Kernighan and Plauger’s books “The Elements of Programming Style” and “Software Tools” (not sure which, or both) back in the day. Obviously nobody is infallible including them, but they advocated asserts as “defense in depth” and leaving them enabled in release builds, using the comparison that disabling them in release builds was like wearing a parachute on the ground but taking it off once your plane was in the air. They said their asserts tripped many times during development and kept them from going too far wrong.

    That said, I don’t understand how C has survived into the present day, when we could have so much more sanity and static safety. I haven’t tried Rust yet but I’ve played with Ada a little, and it seems preferable to C in a lot of ways.

  12. > If you fail to recognize and properly fortify an
    > important trust boundary, it is very likely that
    > someone else will recognize it and then exploit it.

    I think I disagree with “very,” and my uncertainty is part of the problem. In my extremely limited pre-Snowden experience using static analysis for security concerns, a glaring gap was the one between a bug and an externally-exploitable vulnerability. We didn’t have a good way to rank the bugs, and Snowden’s leaks suggest that we needed to worry about the many not-very-likely ones as well as the few very-likely exploitables. (I’m taking it for granted that “Fix All The Bugs” is a slogan rather than a plan.)

  13. Perhaps one of the most important trust boundaries is between code and data.

    We used to think that commingling code and data was good. Early computers (I’m thinking of the IBM 701) had no index instructions and so array operations had to be accomplished using self-altering code, and everybody thought that was super-cool because it reduced the number of vacuum tubes. In the 70s and 80s everybody was raving about how great Lisp was, because it made no distinction between code and data. JavaScript has eval() because less than 20 years ago everybody thought that was a great idea, though now we know better and hence disable eval() using CSP.

    I spend a lot of time doing SQL. An SQL statement is code – it is a miniature program that gets “compiled” by the query planner and then run to generate the answer. But many SQL statements are constructed at run-time using application data. This commingling of code and data often results in SQL injection attacks, which are still one of the leading exploits on modern systems.
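
As a small sketch of how data turns into code, consider a query assembled with string formatting. The table `users`, the column `name`, and the function `build_query_unsafe` are hypothetical and exist only for this illustration. A name containing a quote character escapes from the string literal and becomes part of the SQL program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* UNSAFE on purpose: pastes application data into SQL text.
 * Returns 0 on success, -1 if the query did not fit in out. */
static int build_query_unsafe(char *out, size_t out_size, const char *name)
{
    assert(out != NULL && name != NULL);    /* internal precondition */
    int n = snprintf(out, out_size,
                     "SELECT * FROM users WHERE name = '%s'", name);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With the input `x' OR '1'='1`, the resulting text is `SELECT * FROM users WHERE name = 'x' OR '1'='1'`, which matches every row. Prepared statements with bound parameters (for example SQLite's `sqlite3_prepare_v2()` followed by `sqlite3_bind_text()`) avoid this by compiling the statement first and supplying the data separately, so the data never gets parsed as SQL.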

    The mixing of code and data is an exceedingly powerful idea. It forms the essential core of Gödel’s incompleteness theorem, to name but one prominent example. It is a seductive and elegant idea that can easily lure in the unwary. So perhaps the trust boundary between code and data should be given special emphasis when talking about security?

  14. Efficiency and safety could both be improved by having a means of telling the compiler that execution must be trapped and aborted if a condition doesn’t hold when this point is reached, but execution may at the compiler’s convenience be aborted at almost any time the compiler can determine that it will inevitably reach some kind of an abort-and-trap. Granting such freedom to a compiler could greatly reduce the run-time cost associated with such assertions by not only allowing them to be hoisted out of loops, but also by allowing a compiler that has generated code to trap if an assertion is violated to then exploit the fact that the assertion holds within succeeding code.

    For example, given:

    for (int i=0; i<size; i++)
    __EAGER_ASSERT(a >= b && (a-b) < 1000000);
    return (a-b)*123 + c;

    a compiler could at its leisure either perform all of the computations using 64-bit values or trap if the parameters are out of range and then perform the multiply using 32-bit values.

    Precise trapping for error conditions is expensive, but allowing traps to be imprecise can make them much cheaper. A language which automatically traps dangerous conditions imprecisely may be able to generate more efficient code than one which requires that all dangerous conditions be checked manually.
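
Standard C has no eager-assert construct, so the following is only a hand-written sketch of the two strategies a compiler could choose between for the arithmetic in the example. An explicit range check followed by abort() stands in for the hypothetical trap, and the function names are invented for illustration. Once the check has passed, the difference is known to lie in [0, 999999], so the product is at most 122,999,877 and fits in 32 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Strategy 1: no trap, do everything in 64-bit arithmetic. */
static int64_t scaled_diff_wide(int32_t a, int32_t b, int32_t c)
{
    return ((int64_t)a - (int64_t)b) * 123 + (int64_t)c;
}

/* Strategy 2: trap on out-of-range parameters, then multiply in 32 bits. */
static int64_t scaled_diff_checked(int32_t a, int32_t b, int32_t c)
{
    /* The range test itself is done in 64 bits so it cannot overflow. */
    int64_t wide_diff = (int64_t)a - (int64_t)b;
    if (!(a >= b && wide_diff < 1000000)) {
        abort();                        /* stand-in for the trap */
    }

    int32_t diff = (int32_t)wide_diff;  /* now known to be in [0, 999999] */
    int32_t product = diff * 123;       /* at most 122,999,877: no overflow */
    assert(product >= 0);               /* internal invariant */
    return (int64_t)product + (int64_t)c;
}
```

For parameters that pass the range check, the two functions return the same value; they differ only in what happens to out-of-range parameters.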