After 10 short years as a university-level CS instructor, I’ve finally figured out the course I was born to teach. It’s called “Writing Solid Code” and covers the following topics:
- Testing—There are lots of books on software testing but few that emphasize the thing I need students to learn, which is simply how to break code with minimum effort. There’s no book at all that covers fuzzing in a useful way (though Alex plans to write it). Lessons Learned in Software Testing is one of the better books I know of.
- Debugging—I covered my view of debugging here a while ago and also plan to borrow material from nice books by Zeller, Agans, Butcher, and Metzger.
- Assertions—I try to not be too dogmatic about stuff but CODE HAS TO HAVE ASSERTIONS. That is all there is to it. I’m in the middle of a long post about this and hope to publish it sometime in November.
- Medium-weight formal methods—Probably using Frama-C to statically verify that assertions hold and that language rules (array bounds etc.) are not violated on any path through the code. A sketch of what this looks like follows this list.
- Code reviews—I’ve been reading Wiegers but realistically I think the main thing is just to do a lot of code reviews.
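To give a flavor of the assertions and formal-methods items, here is roughly what I have in mind; max_of is a made-up example and the exact ACSL incantations may need tweaking:

```c
#include <assert.h>

/*@ requires n > 0;
    requires \valid_read(a + (0 .. n-1));
    ensures \forall integer k; 0 <= k < n ==> \result >= a[k];
*/
int max_of(const int *a, int n) {
    int m = a[0];
    int i;
    /*@ loop invariant 1 <= i <= n;
        loop invariant \forall integer k; 0 <= k < i ==> m >= a[k];
        loop assigns i, m;
        loop variant n - i;
    */
    for (i = 1; i < n; i++)
        if (a[i] > m)
            m = a[i];
    assert(m >= a[0]);   /* cheap runtime check; the tools can try to prove it too */
    return m;
}
```

Something like `frama-c -wp max.c` then tries to show, for every path, that the contract holds, that every array access is in bounds, and that the assertion can never fire.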
Ideally this course would not be needed. Rather, the material would be implicit in the contents of lots of other courses and students would naturally acquire strong skills in defensive programming, testing, debugging, and code reviews. We do not appear to be doing a very good job of encouraging these skills to develop. Rather, we're giving the students a four-year sequence of little fire-and-forget programming assignments with zero focus on code quality, where the code is thrown away after the due date. I believe this is doing them a disservice.
In my embedded software class this fall I've been trying a different approach where the students are working in groups and testing their own, and each other's, code in multiple rounds, and we're doing some lightweight code reviews in class. This has sucked up a lot of time but I'm hoping that it ends up being worthwhile as a way to force students to keep revisiting and revising their code.
I’d appreciate pointers, suggestions, etc.
26 responses to “Writing Solid Code”
The time available in a semester does not typically let a student see the importance or impact of these things. But these are exactly the topics I would pick if I looked at where my coursework fell short in preparing me for what I do now.
I think the trick to getting meaningful exposure on these topics would be to start from somebody else’s code. Most college projects have to focus on the basic functionality, but the real-world fun comes after that.
Maybe have a buggy (but still functional) implementation or two for students to build on. It will be a frustrating and painful experience for them, but more like what they will see in the wild.
Hilariously, one of the classic books on this is… Writing Solid Code! But, it shows its age, and I'd recommend the Code Complete book over it. The big difference between those books (and others targeted at practicing programmers) and your list is more of a focus on idioms and patterns for bulletproofing code. Lots of little stuff (micro-patterns, like error checking styles) that seem annoying in the small, but really matter in the team-sized codebases at MSFT, which were usually a couple million lines of code per ten-or-so devs.
I also see a lack of tools familiarity in the academic space. For example, memory breakpoints are supported even in GDB at this point, and make large classes of memory corruption problems trivial to track down (once you turn off stack / heap randomization) but most folks don’t know how to use them.
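For example, the classic "action at a distance" bug below is miserable to find by single-stepping but falls immediately to a watchpoint (a toy sketch; the struct just forces the two buffers to be adjacent):

```c
/* To find out who clobbers bufs.b, no stepping required:
     $ gcc -g corrupt.c -o corrupt && gdb ./corrupt
     (gdb) watch bufs.b[0]    # hardware watchpoint on b's first byte
     (gdb) run                # gdb halts at the exact write that corrupts it
*/
#include <stdio.h>
#include <string.h>

static struct {
    char a[8];
    char b[8];
} bufs = { "", "canary" };

int main(void) {
    strcpy(bufs.a, "much longer than eight bytes");  /* bug: overruns a into b */
    printf("%s\n", bufs.b);   /* mysteriously not "canary" anymore */
    return 0;
}
```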
Hi Lars, I wasn’t trying to be hilarious, I like the Maguire book too!
Definitely agree about tool unfamiliarity. About half of my embedded software class this fall had never used Valgrind, and consequently the first version of the code I got from them was nearly uniformly riddled with memory safety errors, use without initialization, etc.
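For concreteness, the following toy program has both kinds of bug, and Valgrind flags each one on the first run (exact message wording varies by version):

```c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *p = malloc(sizeof *p);
    if (*p > 0)               /* bug: *p was never initialized */
        printf("positive\n");
    free(p);
    printf("%d\n", *p);       /* bug: use after free */
    return 0;
}
```

Compile with -g so the reports carry line numbers, then just run `valgrind ./a.out`; it reports a conditional jump depending on an uninitialised value for the first bug and an invalid read for the second.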
Regarding the idioms and patterns, I agree but don’t really feel qualified to teach that stuff. Also, I think this stuff is probably more specific to particular domains and companies than the stuff I hope to teach.
You should add being a good open source citizen to the list!
Crazy thought: can instrumenting a program (e.g., for line/path coverage) while it is being exercised during automatic test-case reduction reveal where the bug is, by way of some kind of supervised learning system?
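This is close to what the spectrum-based fault localization literature (e.g., Tarantula) already does with plain coverage: rank lines by how disproportionately they show up in failing runs. A toy version of the scoring, with made-up coverage data:

```c
/* Toy spectrum-based scoring: lines executed mostly by failing tests
   are the most suspicious. All data here is made up for illustration. */
#include <stdio.h>

#define TESTS 4
#define LINES 5

int main(void) {
    int cov[TESTS][LINES] = {       /* cov[t][l] = 1 if test t ran line l */
        {1, 1, 0, 1, 1},
        {1, 0, 1, 1, 1},
        {1, 1, 0, 0, 1},
        {1, 0, 1, 1, 1},
    };
    int failed[TESTS] = { 0, 1, 0, 1 };      /* tests 1 and 3 failed */

    int total_pass = 0, total_fail = 0;
    for (int t = 0; t < TESTS; t++)
        failed[t] ? total_fail++ : total_pass++;

    for (int l = 0; l < LINES; l++) {
        int p = 0, f = 0;
        for (int t = 0; t < TESTS; t++)
            if (cov[t][l]) failed[t] ? f++ : p++;
        double fr = (double)f / total_fail;  /* fraction of failing runs */
        double pr = (double)p / total_pass;  /* fraction of passing runs */
        double susp = (fr + pr > 0) ? fr / (fr + pr) : 0.0;
        printf("line %d: suspiciousness %.2f\n", l, susp);
    }
    return 0;
}
```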
When it comes to code reviews I also believe that doing them a lot is extremely useful. One thing I particularly like about combining git with Gerrit (http://code.google.com/p/gerrit/) is that it makes it possible to insist on meaningful reviews without holding a team up too much. I strongly suspect that a good DVCS encourages a focus on code quality by treating a project as a whole rather than as a set of files, which makes it easier to refactor bad code across overly coupled modules.
I agree with Nathan Cooprider's comment. I don't know that it would be the focus of the class, but refactoring "working" but buggy code is all too common an experience outside the classroom.
Hi Justin and Nathan, something that I’ve always wanted to do, but haven’t quite had the gumption or whatever, is to force teams to exchange code with each other mid-way through the semester. I think it would be a great experience for people, though I’m not sure how to work the grading, since the variance in quality of code people receive will be quite high.
Hi bcs, I’m kind of working on something like that right now! I want to do a blog post about it but the paper will soon be under anonymous submission and while a blog post would technically be allowed under the rules I feel like this might be too much thumbing my nose at the review process.
Edward, what constitutes being a good open source citizen? I can probably guess but am curious what your answer is.
I like the exchange idea. Maybe you could make a portion of their grade based on how well they performed the assignment themselves, and another portion how well subsequent teams could maintain their implementation. Round-robin style, at the very least the pain of a bad team would be evenly distributed, and presumably you could identify bad performers objectively. Either way, having to effectively deal with shitty legacy code (either your own or others’) is 90% of the gig.
And I forget exactly who to attribute this to, but I increasingly see writing solid code as a social problem. How do you effectively collaborate with other people who don't live inside your head, not the least of which is future you, who doesn't remember what the hell you were doing in this module last year.
I also really don't like the whole formal methods angle. Juice-to-squeeze wise, it's not cost-effective for most projects in the wild. The vast majority of the problems I run into in terms of 'solid code' are organizational on a really, really basic level. Or they're because I didn't read the manual well enough.
Just basic, stupid, manual testing is the biggest solver of pre-release problems we’ve found. Unit testing is ok as a ‘did I zip my fly’ metric, but you’re not gonna find anything but the most basic errors that way. Systematic, thorough testing, when done correctly, is tedious but effective. The only thing that’s better for finding bugs is collecting live crash reports. Live crash reports are disgustingly effective, if you can get away with letting a few bugs manifest in the field.
If you do the "exchange code" thing, also change the requirements at the same time. Try to make the change invalidate a (formerly) valid assumption that is baked into the existing code. Also, after the first run (when everyone knows you are going to do it), make the teams review each other's code, with the instruction: "review it like you will have to maintain it, because you will."
As for grading, "svn blame" or the equivalent? Ask the students to grade the code they are given, once when they get it and once at the end of the class? You could also set up a continuous black-box testing system and watch how the code progresses over time. I'd make the test cases partly or wholly invisible to the students. If you include the ability to add new tests and back-fill results, you might get interesting results from that.
You can solve the grading problem by being even more radical: just give everyone the same shoddy, undocumented piece of s**t at the start, and mark everyone on how much they improve it. Don’t make them add any new functionality at all: after all, most courses give one a very unbalanced idea of how much time one actually spends writing new code from scratch, so it would be nice to have one which corrected the impression.
Oh! Cool idea Alex. Give them:
– a bunch of test cases
– a pile of #$%#$ library that passes them but will fail on just about anything else
– incomplete, inconsistent, and wrong documentation and specifications.
– several applications that use the library (only some of which they are allowed to alter).
– more bug reports than the whole class can address in the time allotted.
Grade them on how many bugs they fix. Include performance issues, feature requests, usability issues and even a few can-not-reproduce and works-as-intended issues. Just to be evil, include a bug where the code is clearly wrong but fixing it introduces a bug in one of the apps (one that can’t be altered) that is easy to spot by inspection but not covered by any tests.
Setup time: 15 man-years.
The first rule of writing solid code is “don’t write large applications in C”. How about using Rust?
* The compiler catches a huge variety of bugs that in C code would lead to crashes.
* The language lets you isolate unsafe code (code that might crash or leak), making it clear where to concentrate your testing or debugging effort.
* Experience with algebraic data types, higher-order functions, purity, and immutable data will help students write better code even in other languages.
* Students will have an opportunity to contribute to a rapidly growing set of libraries, giving them realistic experience interacting with other students and non-students. They could even contribute to the Rust compiler or Servo browser engine.
* The macro and compiler-extension systems should be powerful enough to implement a QuickCheck library. Then any library can be fuzzed just by (1) implementing the Arbitrary trait for its data types and (2) exporting "properties": test functions that take arguments. (The same idea is sketched below in plain C.)
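In the meantime, the property-based idea can be hand-rolled even in C. A minimal sketch, with an insertion sort standing in for whatever function is actually under test:

```c
/* Minimal QuickCheck-style property test, hand-rolled in C.
   sort_ints is a stand-in for whatever function is really under test. */
#include <stdio.h>
#include <stdlib.h>

static void sort_ints(int *a, int n) {       /* stand-in: insertion sort */
    for (int i = 1; i < n; i++)
        for (int j = i; j > 0 && a[j - 1] > a[j]; j--) {
            int t = a[j]; a[j] = a[j - 1]; a[j - 1] = t;
        }
}

int main(void) {
    srand(12345);                            /* fixed seed: failures reproduce */
    for (int trial = 0; trial < 10000; trial++) {
        int n = rand() % 100 + 1;            /* "Arbitrary": random size... */
        int a[100];
        for (int i = 0; i < n; i++)
            a[i] = rand() % 1000 - 500;      /* ...and random contents */
        sort_ints(a, n);
        for (int i = 1; i < n; i++)          /* the property: output is sorted */
            if (a[i - 1] > a[i]) {
                printf("property violated on trial %d\n", trial);
                return 1;
            }
    }
    printf("10000 random trials passed\n");
    return 0;
}
```

A real harness would also shrink failing inputs the way QuickCheck does, but even this much finds a surprising number of bugs.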
Not sure if you cover it under “debugging” but I would add:
– Don’t guess what’s going on, check it!
– Don’t be afraid to look at the compiler’s output, or the debugger’s disassembly
A little familiarity with assembler can go a long way. Check out “Forensic Debugging” by Elan Ruskin. It’s somewhat Windows/Microsoft oriented, but a lot of it is not platform-specific.
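A concrete example of why looking pays off (a sketch; exact behavior depends on compiler and flags): at -O2, GCC assumes signed overflow never happens and may legally compile this overflow check down to "return 0", something you only discover by reading the generated code.

```c
/* Inspect what the compiler actually emitted:
     gcc -O2 -S -fverbose-asm check.c    # then read check.s
   or, from the debugger:
     (gdb) disassemble will_overflow
*/
int will_overflow(int x) {
    return x + 1 < x;   /* UB when x == INT_MAX, so gcc may fold this to 0 */
}
```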
You might also have a look at, and contact the instructors of, the CS70 course at Harvey Mudd College (my undergrad). The course is titled ‘Data Structures and Software Development’. It comes early enough in the curriculum that software written for most later courses is influenced pretty strongly by it. It includes explicit instruction in writing high-quality defensible code, and also explicit grading on those characteristics (along with the usual ‘does it work’).
Does safety-critical (certified, or whatever it's called) code really have assertions? I was under the impression that dead code in general is disallowed, or at least strongly discouraged, under such circumstances. How else do you get 100% MC/DC coverage, and so on?
I agree with everything you proposed by the way (in particular the code review and formal methods parts), but would like to stress the value and importance of modern static type systems and using them to maximum advantage.
@Mattias: if I were writing a coding standard for safety-critical code, I'd require aggressive assertions and the inclusion of a compiler pass that verifies that they are in fact dead code. Do any current compilers have the annotations to do that?
“Maybe have a buggy (but still functional) implementation or two for students to build on. It will be a frustrating and painful experience for them, but more like what they will see in the wild.”
Yeah, I start students out with a semi-pseudo working implementation, full of bugs, cobbled together Frankenstein-style from old assignment submissions. Then they work in groups that don't build code together but instead test each other's code. The ambitious students tend to scan and test the entire repository by the end of class, of course. It mostly works, but has problems too: grading, the way I do it (based on a project report plus my own tests of the code, etc.), is very labor-intensive and more subjective than I'd like.
In your embedded systems class, are you encouraging the use of lightweight or mediumweight formal methods? Would that be feasible in an embedded systems course?
Oh, yeah, my undergrad OS class did something along the lines of providing potentially-buggy code to work with. We started from a bare-bones version of OS/161. Each assignment involved adding a set of system calls that provided some particular portion of functionality. Our submissions were patches against the existing codebase. After each submission, each programming pair reviewed and tested all of the anonymized submissions, including the professor’s. The one rated most highly in aggregate was integrated as the baseline for the next assignment. If code you introduced was later found to be broken, you were responsible for fixing it through the rest of the term.
So, in addition to the principles of operating systems, we covered reviewing and testing practices, and bug diagnosis and correction with other stuff depending on it.
Will, I agree across the board. And it is the basic, stupid, manual testing that I often see students failing to perform. I don’t think it’s an issue of skill as much as it is one of attitude.
Hi bcs, I think you are designing the course you wish you took in school! I’ve found that in practice, introducing even one of the kinds of complications you list will be almost a showstopper. But I like it.
Jesse, I totally agree about C but on the other hand it has some great tool support. My own inclination would be to split the coding between C and Python. Rust looks very cool but I’ve only glanced at it, not even written one line.
Igorsk, I very much agree with these things and always try to help the students develop the kind of tame paranoia that dominates my attitude as a programmer.
Hi,
I thought one of the most important criteria for writing solid code is adhering to certain standards.
For example, naming variables i or my_var does not help the person reading the code and will ultimately make the code weaker.
I find it odd that you didn’t mention this.
@Fadi #23: Naming, indentation, and so on are sort of like the “five-paragraph essay” we all learned in grade school. College courses shouldn’t need to teach that stuff except in remedial classes; and *grading* based on that sort of thing is as stupid as if a college writing course explicitly based its grading on whether every paragraph began with a topic sentence.
Martinets who insist on naming all loop counters “loop_counter” instead of “i” are part of the problem, not part of the solution.
(But to give you the benefit of the doubt: people who name their variables “jeanluc_picard” instead of “input_intensity” are *also* part of the problem. It’s just that those people make up roughly 0% of the working population, whereas I’d argue that martinets make up 5% or more.)
Couldn’t agree more with the need for code reviews!
I can recommend Rietveld as a code review tool (straight from Guido van Rossum himself). We use it for the ns-3 project to review all major code contributions.
https://codereview.appspot.com/
As a TA, I graded on this — if the code formatting was terrible, then I gave the code zero marks, regardless of whether or not it worked. Maybe this is unusual, but I always read the code they wrote, in addition to running it through a test suite. Reading thousands of lines of generally terrible code is of course a horrible experience, but I know no better way to really understand the mistakes students make. So to minimize my own suffering, I demanded code I could read, so that I could grade it.
A second reason for making this an explicit grading policy is that this *wasn't* stuff that they learned in grade school — not everyone had programmed. Even the ones who had programmed before often had little experience writing code for other people to read. So making this part of the instruction helped them by making some of the implicit assumptions of more expert programmers explicit.
The final reason was that bad code style was a very reliable indicator of bigger problems. If I couldn't read their code, usually they couldn't, either, and this meant they were trying to hold their whole program in their heads while they worked on the assignment. Even on tiny homework problems, even for the brightest students, this doesn't leave much brainpower left over to actually think about and reflect upon the problem.