Writing Solid Code Week 2


First, I wanted to thank everyone for the great discussion on the post about week 1. My single favorite thing about blogging is the discussion and involvement that these posts sometimes generate.

During week 2 we worked on issues relating to the triangle classifier. As several commenters predicted, getting correct results using floating point code was not easy. In fact, as of last Thurs, the only FP implementation that passed all test cases was one supplied by reader David Eisenstat. On Tuesday we went through the code of a randomly selected student; here it is. Josh’s code is — we believe — correct. On Thursday we went through David Eisenstat’s integer-based code, which was useful for several reasons. First, sometimes it’s easier to review code written by someone who isn’t actually present. Second, David’s code uses a couple of little C tricks — stringification and use of “static” in an array parameter — that I wanted to show the class. Third, except for a couple of little warts (I’d have preferred library calls for sorting an array and for converting a string to an int) his code is strong. We also went through a couple of people’s test scripts. Most students in the class used Python for test automation; a minority used Bash, one person used Ruby, and one student embedded his triangle code in a cool iOS app that classifies triangles interactively. My voting-based tester is the only Perl code anyone has committed to the repo. Perl seems totally dead as far as younger developers are concerned, which makes me a tiny bit sad. I asked the students to wrap up their triangle classifiers so we can move on.

The project that I want to assign for week 3 is only about testing. I plan to give each student a random UNIX utility (grep, sed, ed, bc, etc.) based on the premise that it is a product that their company is evaluating for production use. Each student is to perform some testing and then either argue that the tool is good enough for serious use or that it is too buggy to be adopted. It doesn’t matter which conclusion they arrive at as long as the conclusion is supported by evidence. A negative argument could be supported by bugs or deviations from the spec; a positive argument could be supported by a solid testing strategy, coverage data, etc. Of course I pointed the class to the classic Bart Miller papers evaluating the robustness of UNIX tools.

So far this class has been lots of fun and at least some of the students are enjoying it. I’ve gotten kind of obsessed with it and have been spending lots of time dreaming up projects that we probably won’t have time for.


7 responses to “Writing Solid Code Week 2”

  1. I revised my integer triangle code: https://github.com/regehr/solid_code_class/blob/master/triangle/eisenstat_int/triangle-revised.c

    The major changes were
    (1) switch to the standard library routines for integer conversion and sorting
    (2) turn that comment warning about overflow into a static assertion.
    (3) use unsigned long long everywhere (This is debatable; on one hand it’s slightly harder to overflow if someone disables my static assertion, and we get rid of a bunch of potentially error-prone casts, but on the other, unsigned wrapping is now assumed.)

    Can I rant for a minute about strtoull? I had thought that it wraps on overflow, making it useless for validation, but it actually clamps the return value and sets errno to ERANGE. That’s okay, but it also skips leading spaces and silently accepts numbers with minus signs by wrapping them, which strikes me as incongruous with clamping. In my revised code, I ended up writing a separate routine to prevalidate the string.

    The performance of the strtoull routines I’ve measured isn’t great, possibly unavoidably because they do too much.

  2. @David: Your classify_side function doesnt seem to check for d2[0] == d2[2] condition, which should result in triangle being isosceles.

    Also there’s a simplier way to check, if a triangle is “valid” : after sorting distances check, if the longest one is equal to sum of two others. If so – then triangle is invalid (either all points are the same and all values are zero, or points lie on the line).

    After this modification there’s no possibility of overflowing 64 bit unsigned integer, since You never use more that sum of two multiplications of 31 bit unsigned integers.

  3. > @David: Your classify_side function doesnt seem to check for d2[0] == d2[2] condition, which should result in triangle being isosceles.

    Contractually, d2 is sorted.

    > check, if the longest one is equal to sum of two others

    Ah, but d2 is the *squares* of the side lengths. It’s still possible to make this check in integer arithmetic*, but that would be more complicated than just computing a determinant.

    *For a sums of n square roots, we actually don’t know how to do this in polynomial time, which creates the amusing situation of there being many NP-hard geometric optimization problems (e.g., TSP) that we don’t know to be in NP.

  4. > Contractually, d2 is sorted.
    Clever. I’m blind.

    > Ah, but d2 is the *squares* of the side lengths.
    Also i’m obviously retarded.

  5. David, thanks! I share your irritation with library functions like strtoull, they never do quite the right thing. By not-coincidence, we spent some time talking about static asserts in class last Thurs!