Adandoned Mineshaft

Due to my 8-year-old’s obsession with Minecraft, the abandoned mineshafts found in the game are an everyday topic of discussion around the house. He is saving up to buy a pickaxe — no joke. Since we needed a day trip for the long weekend, I thought we’d visit some actual mines in the Silver Island Mountains near the Utah-Nevada border. We followed an old road most of the way up a minor mountain and found the remains of either a processing facility or living quarters, and several mine shafts higher up. One of these apparently goes all the way through the mountain but we didn’t do much more than poke our heads in, since it was difficult to gauge how dangerous it was. This area was mined from the 1870s through the 1930s. Since the pathway connecting the mine shafts with the highest vehicle access was too narrow and steep for vehicles, we were forced to assume that ore was transported downhill using some combination of human and animal power — the level of physical effort implied here seemed to be lost on the 8-year-old, who wants a pickaxe more than ever. We saw a pair of pronghorn antelope — but failed to get decent pictures.

Lamus Peak:

Built-up trail used to move ore downhill through rugged country:

Mine shaft:

Another one, with some nice copper ore right at the entrance:

And a lot of nothing in the distance:

Instruction Synthesis is Fun and Weird

Synthesis is sort of a hot topic in PL research lately. It basically means “implement a specification automatically.” Of course, at some level this isn’t very different from what compilers have been doing for ages, if we consider the source language program to be the specification, but when we’re doing synthesis the connotation is that the specification isn’t directly implementable using standard compiler techniques.

As of recently, we have an LLVM synthesizer for Souper. The basic method is from this excellent PLDI paper. In brief, you give the synthesis algorithm a specification and a library of components and it either gives you a way to wire together the components into a circuit that implements the specification or else it fails. Some other time I might explain the algorithm in detail, but here I wanted to give an example where we’re trying to figure out if a 64-bit word contains 16 repetitions of the same nibble value (this computation figured into many nibble sort submissions).

The specification in C is:

int nibs_same(unsigned long word) {
  unsigned long n00 = (word >> 60) & 0xf;
  unsigned long n01 = (word >> 56) & 0xf;
  unsigned long n02 = (word >> 52) & 0xf;
  unsigned long n03 = (word >> 48) & 0xf;
  unsigned long n04 = (word >> 44) & 0xf;
  unsigned long n05 = (word >> 40) & 0xf;
  unsigned long n06 = (word >> 36) & 0xf;
  unsigned long n07 = (word >> 32) & 0xf;
  unsigned long n08 = (word >> 28) & 0xf;
  unsigned long n09 = (word >> 24) & 0xf;
  unsigned long n10 = (word >> 20) & 0xf;
  unsigned long n11 = (word >> 16) & 0xf;
  unsigned long n12 = (word >> 12) & 0xf;
  unsigned long n13 = (word >>  8) & 0xf;
  unsigned long n14 = (word >>  4) & 0xf;
  unsigned long n15 = (word >>  0) & 0xf;
    (n00 == n01) & (n00 == n02) & (n00 == n03) & (n00 == n04) &
    (n00 == n05) & (n00 == n06) & (n00 == n07) & (n00 == n08) &
    (n00 == n09) & (n00 == n10) & (n00 == n11) & (n00 == n12) &
    (n00 == n13) & (n00 == n14) & (n00 == n15);

I won’t bother to give the LLVM version of this but it’s not too pretty, and obviously there’s not much that standard compiler optimization techniques can do to make it prettier. On the other hand, there’s a much better way to implement this specification:

int nibs_same(uint64_t word) {
  return word == rotate_right(word, 4);

LLVM doesn’t have a rotate instruction so the best we’ll do there is something like this:

define i32 @nibs_same(i64 %word) #0 {
  %1 = shl i64 %word, 60
  %2 = lshr i64 %word, 4
  %3 = or i64 %1, %2
  %4 = icmp eq i64 %3, %word
  %5 = zext i1 %4 to i32
  ret i32 %5

The question is: Will Souper generate this code from scratch? Turns out it does not, but on the other hand it generates this:

define i32 @nibs_same(i64 %word) #0 {
  %1 = lshr i64 %word, 28
  %2 = shl i64 %1, 32
  %3 = or i64 %1, %2
  %4 = icmp eq i64 %word, %3
  %5 = zext %4 to i32
  ret i32 %5

Or in C:

(((x >> 28) << 32) | (x >> 28)) == x

If you take a minute to look at this code, you’ll see that it is not implementing a rotate, but rather something odd and not obviously correct. It took me a while to convince myself that this is correct, but I believe that it is. The correctness argument uses these observations: (1) every nibble is compared to some other nibble and (2) it is irrelevant that the bitwise-or is presented with a pair of overlapping nibbles. Here’s the picture I drew to help:

                            >> 28
                            << 32

Also Alive can help:

  %1x = shl i64 %word, 60
  %2x = lshr i64 %word, 4
  %3x = or i64 %1x, %2x
  %4x = icmp eq i64 %3x, %word
  %5 = zext i1 %4x to i32
  %1 = lshr i64 %word, 28
  %2 = shl i64 %1, 32
  %3 = or i64 %1, %2
  %4 = icmp eq i64 %word, %3
  %5 = zext %4 to i32

Done: 1
Optimization is correct!

The drawback of Souper's code is that since it doesn't include a rotate operation that can be recognized by a backend, the resulting assembly won't be as good as it could have been. Of course if LLVM had a rotate instruction in the first place, Souper would have used it. Additionally, if we posit a superoptimizer-based backend for LLVM, it would be able to find the rotate operation that is well-hidden in Souper's synthesized code.

I find it kind of magical and wonderful that a SAT solver can solve synthesis problems like this. Souper's synthesizer is very alpha and I don't recommend using it yet, but the code is here if you want to take a look. Raimondas, my heroic postdoc, did the implementation work.

I'll finish up by noting that my first blog entry was posted five years ago today. Happy birthday, blog.

Update from Feb 10: A reader sent this solution which I hadn't seen before:

(x & 0x0FFFFFFFFFFFFFFF) == (x >> 4)

I think Souper missed it since we didn't include an "and" in its bag of components for this particular run.

Static Analysis Benchmarks

Many programmers would agree that static analysis is pretty awesome: it can find code defects that are very hard to find using testing and walkthroughs. On the other hand, some scientific validation of the effectiveness of static analysis would be useful. For example, this nice 2004 paper found that when five analyzers were turned loose against some C programs containing buffer overflows, only one of them was able to beat random guessing in a statistically significant fashion.

More recently, in 2014, some researchers at the Toyota InfoTechnology Center published a paper evaluating six static analyzers using a synthetic benchmark suite containing more than 400 pairs of C functions, where each pair contains one function exemplifying a defect that should be found using static analysis and the other function, while looking fairly similar, does not contain the defect. The idea is that a perfect tool will always find the bug and not find the not-bug, if you see what I mean. The results show that different tools have different strengths; for example, CodeSonar dominates in detection of concurrency bugs but PolySpace and QA-C do very well in finding numerical bugs. With a few exceptions, the tools do a good job in avoiding false positives. At this point I feel like I have to refer Coverity’s classic writeup about this kind of thing just because it’s awesome.

Shin’ichi Shiraishi, the lead author of the Toyota paper, contacted me asking if I’d help them release their benchmarks as open source, and I have done this: the repository is here, and here is a brief document describing the benchmarks. (I just want to make it clear that while I own the Github repo, the work here was entirely done by Shin’ichi and his colleagues — I added the simple autotooling, fixed a few portability problems in the benchmarks such as using intptr_t for integers that will be cast to and from pointer values, and did some other lightweight cleanup.) Anyway, I am very happy that Shin’ichi and his organization were able to release these benchmarks and we hope that they will be useful to the community.

Update from 2/25/15: Whoops — it looks like Coverity/Synopsys has a DeWitt clause in their EULA and based on this, they sent me a cease and desist notice. In accordance with their request, I have removed Shin’ichi’s group’s paper from the Github repo and also removed the link to that paper from this piece.