Teaching Python Informally to Kids

For the last few months I’ve been running a “coding club” for my son’s sixth-grade class. Once a week, the interested students (about 2/3 of the class) stick around for an hour after school and I help them learn to program. The structure is basically like the lab part of a programming class, without the lecture. I had originally planned to do some lecturing, but it soon became clear that at the end of a full day of school (these kids get on the bus around 7:00am, ugh) there was very little attention span remaining. So instead I give them a sheet of problems to work on each week and spend the hour walking around helping whoever needs help.

One of my main goals is to avoid making anyone hate programming. There are three parts to this. First, I’m not forcing them to do anything, but rather providing suggestions and guidance. Second, there’s no assessment going on. I’ve never been too comfortable with the grading part of teaching, actually; it can really get in the way of learning. Third, I’m encouraging them to work together. I’ve long noticed (starting when I was a CS student) that most learning occurs between students in the lab, not in the lecture hall.

For a curriculum, we started out with turtle graphics, which I’ve always found to be wonderful. Python’s turtle library is clunkier than Logo and abysmally slow, but the advantages (visual debugging, many kids enjoy graphics) outweigh the problems. Fractals (Sierpinski triangle, dragon curve) were a good way to introduce recursion. We spent three or four weeks doing turtle stuff before moving on to simple crypto, which didn’t work as well. On the other hand, the students did really well with mathy code; here’s the handout from one of the math weeks.
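
To give a flavor of the turtle exercises, here’s a minimal Sierpinski triangle in Python (my own sketch, not code from the actual handouts):

import turtle

def sierpinski(t, length, depth):
    # At each corner of the triangle, recursively draw a half-size
    # triangle before moving along the side; at depth 0 this just
    # draws a plain triangle.
    for _ in range(3):
        if depth > 0:
            sierpinski(t, length / 2, depth - 1)
        t.forward(length)
        t.left(120)

t = turtle.Turtle()
t.speed(0)  # the turtle library is slow; use the fastest drawing speed
sierpinski(t, 200, 4)
turtle.done()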

Some observations:

  • One of my favorite moments so far has been when a couple of students implemented rot13 and used it to send rude messages to each other (a sketch of such a function appears after this list).
  • It’s pretty important to take a snack break.
  • Although the rockstar / 10x programmer idea is out of favor, it is absolutely clear that there is a huge amount of variation (much more than 10x) in how easily the kids in a group of twenty 10- and 11-year-olds learn programming. Some kids are teaching themselves to open files and parse text while others are stuck on basic syntax issues.
  • Syntax, which computer professionals more or less learn to overlook, is hugely important.
  • I’ve always disliked Python’s significant whitespace and watching kids struggle with it has made me hate it even more.
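
Here’s roughly what such a rot13 function looks like in Python (reconstructed, not the students’ actual code):

def rot13(message):
    # Rotate each letter 13 places through the alphabet, leaving
    # everything else (spaces, digits, punctuation) alone.
    result = ""
    for c in message:
        if "a" <= c <= "z":
            result += chr((ord(c) - ord("a") + 13) % 26 + ord("a"))
        elif "A" <= c <= "Z":
            result += chr((ord(c) - ord("A") + 13) % 26 + ord("A"))
        else:
            result += c
    return result

print(rot13("hello"))         # uryyb
print(rot13(rot13("hello")))  # hello -- applying rot13 twice gets the original back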

Anyhow, this has been a fun teaching exercise and hopefully it is benefiting the kids.

Undefined Behavior != Unsafe Programming

Undefined behavior (UB) in C and C++ is a clear and present danger to developers, especially when they are writing code that will execute near a trust boundary. A less well-known kind of undefined behavior exists in the intermediate representation (IR) for most optimizing, ahead-of-time compilers. For example, LLVM IR has undef and poison in addition to true explodes-in-your-face C-style UB. When people become aware of this, a typical reaction is: “Ugh, why? LLVM IR is just as bad as C!” This piece explains why that is not the correct reaction.

Undefined behavior is the result of a design decision: the refusal to systematically trap program errors at one particular level of a system. The responsibility for avoiding these errors is delegated to a higher level of abstraction. For example, it is obvious that a safe programming language can be compiled to machine code, and it is also obvious that the unsafety of machine code in no way compromises the high-level guarantees made by the language implementation. Swift and Rust are compiled to LLVM IR; some of their safety guarantees are enforced by dynamic checks in the emitted code, while other guarantees are made through type checking and have no representation at the LLVM level. Either way, UB at the LLVM level is not a problem for, and cannot be detected by, code in the safe subsets of Swift and Rust. Even C can be used safely if some tool in the development environment ensures that it will not execute UB. The L4.verified project does exactly this.

The essence of undefined behavior is the freedom to avoid a forced coupling between error checks and unsafe operations. The checks, once decoupled, can be optimized, for example by being hoisted out of loops or eliminated outright. The remaining unsafe operations can be — in a well-designed IR — mapped onto basic processor operations with little or no overhead. As a concrete example, consider this Swift code:

func add(a: Int, b: Int) -> Int {
  return (a & 0xffff) + (b & 0xffff)
}

Although a Swift implementation must trap on integer overflow, the compiler observes that overflow is impossible and emits this LLVM IR:

define i64 @add(i64 %a, i64 %b) {
  %0 = and i64 %a, 65535
  %1 = and i64 %b, 65535
  %2 = add nuw nsw i64 %0, %1
  ret i64 %2
}

Not only has the checked addition operation been lowered to an unchecked one, but the add instruction has also been marked with LLVM’s nsw and nuw attributes, indicating that both signed and unsigned overflow are undefined. In isolation these attributes provide no benefit, but they may enable additional optimizations after this function is inlined. When the Swift benchmark suite is compiled to LLVM, about one in eight addition instructions has an attribute indicating that integer overflow is undefined.

In this particular example the nsw and nuw attributes are redundant, since an optimization pass could re-derive the fact that the add cannot overflow. In general, however, these attributes and others like them add real value by avoiding the need for potentially expensive static analyses to rediscover known program facts. Also, some facts cannot be rediscovered later, even in principle, because information is lost at some compilation steps: for example, a guarantee that Swift makes through type checking has no representation at the LLVM level, so no amount of analysis over the IR can recover it.

In summary, undefined behavior in programmer-visible abstractions represents an aggressive and dangerous tradeoff: it sacrifices program correctness in favor of performance and compiler simplicity. On the other hand, UB at lower levels of the system, such as machine code or a compiler IR, is an internal design choice that needn’t have any effect on the programmer-facing parts of the system. This kind of UB simply requires us to accept that safety checks can be usefully factored out of their corresponding unsafe operations to afford efficient execution.

Nine Mile Canyon

One of my boys and I spent Sunday exploring Nine Mile Canyon, in the remote Book Cliffs a few hours’ drive from Salt Lake City. This canyon is known for its dense collection of rock art and ruins, much of which can be seen from the paved road that follows the canyon bottom. It was a lot of driving for a day trip, but it wasn’t too boring since I had checked out an audiobook of Fahrenheit 451 from the library.

A great pictograph panel; the figures are several feet high.

Typical scenery in the upper canyon.

A dodgy ice bridge. This one held us but I got wet to the knees on a different stream crossing.

Fremont people in a large part of Utah drew sheep this way:

Detecting Strict Aliasing Violations in the Wild

Type-based alias analysis, where pointers to different types are assumed to point to distinct objects, gives compilers a simple and effective way to disambiguate memory references in order to generate better code. Unfortunately, C and C++ make it easy for programmers to violate the assumptions upon which type-based alias analysis is built. “Strict aliasing” refers to a collection of rules in the C and C++ standards that restrict the ways in which you are allowed to modify and look at memory objects, in order to make type-based alias analysis work in these weakly-typed languages. The problem is that the strict aliasing rules contain tricky and confusing corner cases, and that they rule out many idioms that have historically worked, such as using a pointer type cast to view a float as an unsigned in order to inspect its bits. Such tricks are undefined behavior. See the first part of this post for more of an introduction to these issues.

The purpose of this piece is to call your attention to a new paper, Detecting Strict Aliasing Violations in the Wild, by Pascal Cuoq and his colleagues. C and C++ programmers should read it. Sections 1 and 2 introduce strict aliasing; they’re quick and easy.

Section 3 shows what compilers think about strict aliasing problems by looking at how a number of C functions get translated to x86-64 assembly. This material requires perseverance but it is worth taking the time to understand the examples in detail, because compilers apply the same thinking to real programs that they apply to tiny litmus tests.

Section 4 is about a new tool, built as part of Trust-in-Soft’s static analyzer, that can diagnose violations of the strict aliasing rules in C code. As the paper says, “it works best when applied on definite inputs,” meaning that the tool should be used as an extended checker for tis-interpreter. Pascal tells me that a release containing the strict aliasing checker is planned, but the time frame is not definite. In any case, readers interested in strict aliasing, but not specifically in tools for dealing with it, can skip this section.

Section 5 applies the strict aliasing checker to open source software. This is good reading because it describes problems that are very common in the wild today. Finding a bug in zlib was a nice touch: zlib is small and has already been looked at closely. Some programs mitigate these bugs by asking the compiler not to exploit the strict aliasing rules: LLVM, GCC, and Intel CC all take a -fno-strict-aliasing flag, and MSVC doesn’t implement type-based alias analysis at all. Many other programs contain time bombs: latent UB bugs that don’t happen to be exploited now, but that might be exploited later when the compiler becomes a bit brighter.

Also see libcrunch and its accompanying paper, as well as a paper about SafeType.

A Quick Look at the SoftIron OverDrive 1000

ARM processors are ubiquitous in mobile, but you won’t find a lot of developers using ARM-based boxes for their day-to-day work. Recently I needed some ARM machines for compiler work and got a few Raspberry Pi 3 boards, which seemed reasonable since these have four cores at 1.2 GHz. However, while these boards are far faster than the original Raspberry Pi, they’re still very painful for development work: the SD card interface seems to be a major bottleneck, and I would imagine that the 512 KB last-level cache doesn’t hold enough of the working set of a developer workload to be all that effective.

The obviously desirable ARM chip for a developer box is a ThunderX2, with up to 54 cores at 3 GHz, but these don’t seem to be available at all until later in 2017 and who knows when someone will get around to packaging these up in an affordable, developer-friendly box. The original ThunderX is available for servers but I didn’t find an inexpensive boxed-up version (there’s the Gigabyte R270-T61 that’s hard to even get a price on). Various fastish Qualcomm chips are also available on development boards but not as complete systems, as far as I could tell. I’ve done my time bringing up software environments on random development boards and am no longer interested in that.

Anyway, I settled on the SoftIron OverDrive 1000, which has pretty good specs: 8 GB of DDR4 RAM, 1 TB disk, and an AMD A1120 processor, an implementation of ARM’s Cortex-A57 design. The cores run at 1.7 GHz and each pair of cores shares a 1 MB L2 cache, with 8 MB of L3 cache shared across all four cores. The box is headless but has a USB serial console, gigabit Ethernet, SATA, and USB interfaces available. At $600 the price is right. Here’s the documentation.

The OverDrive 1000 ships with 64-bit openSUSE installed. Though I hadn’t used its zypper package manager before, it is very easy and pleasant to interact with. My one gripe with the default configuration so far is that it ships with under a GB of swap space, and when doing big compiles (especially links, and even more especially links of debug builds) it’s easy to get zapped by the OOM killer. The disk is formatted with btrfs, which does not support swapfiles. SoftIron’s support was responsive but didn’t have a recipe for resizing the root partition without removing the hard drive and plugging it into a different machine, which I haven’t bothered to do yet.

Now let’s look at how fast this thing is. I’m not trying to be scientific here, just giving a general idea of what to expect from this box. Since I often sit around waiting for compilers, the benchmark is going to be LLVM 4.0 rc1 compiled using itself. The test input (program being compiled) is Crypto++ 5.6.5, which I chose since it compiles fairly quickly and doesn’t seem to have a lot of external dependencies. I compiled it with the -DCRYPTOPP_DISABLE_ASM flag to disable use of assembly language that might add compile-time differences across the platforms.

The processors I’m testing are just some that happened to be convenient. One machine is based on a Core i7-2600, a quad-core from 2011 running at 3.4 GHz with hyperthreading turned off. Another is based on an i7-6950X, a 10-core from 2016 running at 3.0 GHz with hyperthreading turned on. Finally, there’s a mid-2015 2.2 GHz Macbook Pro retina model with a Core i7-4770HQ, a quad-core, also with hyperthreading turned on.

The i7-2600 and the i7-6950X run hot: they are in the 100 W range. The i7-4770HQ is rated at 47 W but this includes the GPU. I’ll speculate that perhaps the power used by the CPU part is not that different from the 25 W used by the AMD A1120 (please leave a comment if you know more about this) (update: see this comment).

First, build time using one core (i.e. make -j1):

OverDrive 1000  390 s
Macbook Pro     177 s
i7-2600         113 s
i7-6950X        139 s

So the ARM chip is about 3.5x slower than the fastest of the Intel chips.

Second, compile times using 4 cores:

OverDrive 1000  137 s
Macbook Pro      57 s
i7-2600          36 s
i7-6950X         41 s

The OverDrive 1000 gets about a 2.8x speedup from four cores. Of course some of the non-linearity is due to sequential processing in the software build; when compiling other projects, I’ve seen more like a 3.5x speedup from using four cores.

Finally, compile times using all cores:

OverDrive 1000  137 s
Macbook Pro      49 s
i7-2600          36 s
i7-6950X         19 s

So here’s the worst case slowdown of the OverDrive 1000: it’s 7.2x slower than the big Intel chip, but it’s a pretty unfair comparison since that chip costs $1,600.

Overall, the OverDrive 1000 is an inexpensive and capable machine, but it isn’t going to compete performance-wise with Intel boxes at the same price point (for example, here are the PCs you can buy at NewEgg for between $500 and $600). If you buy an OverDrive 1000, buy it because you want an ARMv8 machine that shows up ready to use and that isn’t an embedded systems toy like the Raspberry Pi family.

Introduction to Precision Farming

[My father, David Regehr, encouraged me to write this piece, provided some of its content, edited it, and agreed to let me use data from his farm.]
[For readers outside the USA: Alas, we do not farm in metric here. In case you’re not familiar with the notation, 10″ is ten inches (25.4 cm) and 10′ is ten feet (3.05 m). An acre is 0.4 hectares.]

Agriculture and technology have been intimately connected for the last 10,000 years. Right now, information technology is changing how we grow food; this piece takes a quick look at how that works.

Measurement

If soil conditions aren’t right, crops will grow poorly. For example, alfalfa grows best in soils with a pH between 6.5 and 7.5. Soils that are too acidic can be “fixed” by applying ground limestone (CaCO3) at rates determined by formulae based on chemical analysis. The process typically begins with taking soil samples (to an appropriate depth) in a zig-zag pattern across each field, mixing the samples in a bucket, and then sending a sub-sample to a laboratory where it’s analyzed for pH, cation exchange capacity, major nutrients such as phosphorus, potassium, and sulfur, and micronutrients such as zinc. For more details, see this document about interpreting soil test results.

Applying a uniform rate of ag (agricultural) lime to an entire field is suboptimal when there is variation in soil pH within the field. Ag lime applied where it is not needed is not only a waste of money; it can raise soil pH to a point that is detrimental to crop growth. To characterize a field more accurately, it needs to be sampled at a finer granularity. For example, GPS grid lines can be superimposed on a field to locate points, each representing an area of, say, 2.5 acres. Around each such point, ten or more soil samples would be taken along a 30′ radius, then mixed, sub-sampled, and GPS-tagged. From the resulting analysis, a model can interpolate the lime requirement, and the adequacy of other nutrients essential for plant growth, across all areas of the field.
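
As a rough illustration of the interpolation step, here’s a minimal inverse-distance-weighted model in Python (the models actually used by soil labs are more sophisticated, and the grid coordinates and pH values here are invented):

# Hypothetical GPS-tagged grid samples: (x, y) position in feet from a
# field corner, and the measured soil pH at that point.
samples = [(100, 100, 5.9), (430, 120, 6.4), (250, 400, 6.8), (480, 450, 7.1)]

def interpolated_ph(x, y, power=2):
    # Inverse-distance weighting: nearby samples count for more than
    # distant ones.
    num = den = 0.0
    for sx, sy, ph in samples:
        d2 = (x - sx) ** 2 + (y - sy) ** 2
        if d2 == 0:
            return ph  # we're exactly on a sample point
        weight = 1.0 / d2 ** (power / 2)
        num += weight * ph
        den += weight
    return num / den

print(round(interpolated_ph(300, 300), 2))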

Let’s look at an example. This image shows the farm near Riley KS that my parents bought during the 1980s. I spent many afternoons and weekends working there until I moved out of the area in 1995. It’s a quarter-section; in other words, a half-mile on a side, or 160 acres. 135.5 of the acres are farmland and the remaining 24.5 are used by a creek, waterways (planted in grass to prevent erosion), buildings, and the driveway.

This image shows the points at which the fields were sampled for soil analysis in November 2015:

Each point represents a 1.25 acre area; this is pretty fine-grained sampling, corresponding to relatively small fields with terraces and other internal variation. A big, relatively homogeneous field on the high plains might only need to be sampled every 5 or 10 acres.

Here are the soil types:

This image shows how much sulfur the soil contains:

In the past it wasn’t necessary to fertilize with sulfur, because fallout from coal-burning power plants supplied enough of it. This is no longer the case.

Another quantity that can be measured is crop yield: how much grain (or beans or whatever) is harvested at every point in a field? A combine harvester with a yield monitor and a GPS can determine this. “Point rows,” where a harvested swath comes to a point because the field is not completely rectangular, need to be handled specially: they reduce the grain flow not because yield is low but because the full width of the combine head is not being used. Yield data can be aggregated across years to look for real trends and to assess changes in how low-yield areas are treated.
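
To see why point rows matter, here’s a toy version of the yield calculation in Python (my own sketch; the units and numbers are invented): the grain flow has to be divided by the area actually being cut, not the area the full header would cover.

# 43,560 square feet per acre; hypothetical 30-foot combine header.
SQ_FT_PER_ACRE = 43560.0
FULL_HEADER_FT = 30.0

def yield_bu_per_acre(grain_flow_bu_per_s, speed_ft_per_s, cut_width_ft):
    # Yield is grain flow divided by the area swept per second.
    acres_per_s = speed_ft_per_s * cut_width_ft / SQ_FT_PER_ACRE
    return grain_flow_bu_per_s / acres_per_s

# On a point row only half the header is cutting, so grain flow drops by
# half; dividing by the full header width would understate the yield.
print(yield_bu_per_acre(0.5, 5.0, FULL_HEADER_FT))       # full swath: 145.2
print(yield_bu_per_acre(0.25, 5.0, FULL_HEADER_FT / 2))  # point row, corrected: 145.2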

Aerial measurement with drones or aircraft can be used to look for irregularities in a field: color and reflectivity at various wavelengths can indicate problems such as weeds (including, sometimes, identification of the offending species), insect infestations, disease outbreaks, and wet or dry spots. The alternative, walking each field to look for problems, is time consuming and risks missing things.

Some of the procedures in this section (maintaining a drone, intensive grid-sampling, interpreting soil test and yield results) are time-consuming and complicated, or require expensive equipment that would be poorly utilized if owned by an individual farmer. Such jobs can be outsourced to crop consultants, who may be hired on a per-acre basis during the growing season to monitor individual fields for pests, nutrient problems, irrigation scheduling, and so on. During the off-season, consultants may do grid sampling, attend subject-matter updates to maintain certification, and assist growers with data interpretation and planning. Many crop consultants have years of experience and see many fields every day; the services of this sort of person can reduce risks. Here’s the professional society for crop consultants and some companies that provide these services (1, 2).

Application

“Variable-rate application” means using the results of intensive soil grid sampling to apply seed, fertilizer, herbicide, insecticide, etc. in such a way that each location in the field receives the appropriate amount of whatever is being applied. For example, fewer seeds can be planted in parts of a field that have weaker capacity to store water in the soil, reducing the likelihood of drought stress.

Variable-rate can apply to an entire implement (planter or whatever) but it can also be applied at a finer granularity: for example, turning individual spray heads on and off to prevent harmful overspray or turning individual planter rows on and off to prevent gaps or double-planting on point rows and other irregularities. Imagine trying to achieve this effect using a 12-row planter without computer support:

(Image is from this slide deck.)

Here’s the soil pH for my Dad’s farm and also the recommended amount of ag lime to apply for growing alfalfa:

For cropland on this farm, 443,000 pounds (221 US tons / 201 metric tons) of ag lime are needed to bring the soils to the target pH of 6.5, the minimum pH for good alfalfa or soybean production. Purchase, hauling, and variable-rate application of ag lime in this area would run $20-25/ton, so the cost is roughly $5,000. However, because the land is farmed with no-till practices (i.e., no deep tillage to incorporate the lime), no more than about 1 ton/acre of ag lime is applied per year, so application costs will double or triple, spread over several years, for the parts of the farm that need more than that. Soil conditions will change in fairly predictable ways, and it should be at least five years before these fields need to be sampled again.

Of course there are limits on how precisely a product can be applied to a field. Ag lime would typically be applied using a truck that spreads a 40′ swath of lime. Even if the spreader is calibrated well, there will be some error due to the width of the swath and also some error stemming from the fact that the spreader can’t instantaneously change its application rate. There might also be error due to latency in the delivery system but this could be compensated for by having the software look a few seconds ahead.
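
A toy version of that look-ahead in Python (my own sketch; the latency, speed, and prescription map are all invented):

LATENCY_S = 3.0  # hypothetical delay between commanding a rate and lime hitting the ground

def prescribed_rate(position_ft):
    # Hypothetical prescription map: pounds of lime per acre by position
    # along the truck's path.
    return 2000 if position_ft < 500 else 4000

def commanded_rate(position_ft, speed_ft_per_s):
    # Command the rate for where the spreader will be when the lime
    # actually lands, not where the truck is now.
    return prescribed_rate(position_ft + speed_ft_per_s * LATENCY_S)

print(commanded_rate(480, 10))  # 4000: already switching for the zone ahead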

Here’s an analogous recommendation, this time for phosphorus in order to meet a target of 60 bushels per acre of winter wheat:

Phosphorus fertilizer application is an annual cost, which can vary greatly depending on type and price of formulation used. Most cropland farmers in this part of the world would figure on $25-35/acre for purchase and variable-rate application.

And finally, here’s the zinc recommendation for growing soybeans:

As you can see, much less zinc than lime is required: less than a ton of total product across the entire farm.

Automation

Driver-assist systems for cars are primarily about safety, and driverless cars need to pay careful attention to the rules of the road while not killing anyone. Automated driving solutions for tractors and harvesters seem to have evolved entirely independently and have a different focus: following field boundaries and swaths accurately.

An early automated row-following technology didn’t do any steering, but rather provided the farmer with a light bar that indicated deviation from an intended path. This was followed by autosteer mechanisms that at first just turned the steering wheel using a servo and that, in modern machines, issue steering commands via the power (hydraulic) steering system. The basic systems only handle driving across a field, leaving the driver to turn around at the end of each row. To use such a system you might make a perimeter pass and then a second pass around a field; this provides room to turn around and also teaches the autosteer unit about the area to be worked. Then, you might choose one edge of the field to establish the first of many parallel lines that autosteer will follow to “work” the interior of the field. Static obstacles such as trees or rocks can be marked so the GPS unit signals the driver as they’re approached. Dynamic obstacles such as animals or people are not accounted for by current autosteer systems; it’s still up to the driver to watch out for these. Autoturn is an additional feature that automates turning the tractor around at the end of the row.

Autosteer and autoturn aren’t about allowing farmers to watch movies and nap while working a field. Rather, by offloading the tiring, attention-consuming task of following a row to within a couple of inches, they free the farmer to monitor the field work: Is the planter performing as expected? Has it run out of seed? Autosteer also enables new farming techniques that would otherwise be infeasible. For example, one of my cousins has corn fields in central Kansas with 30″ row spacing that are sub-surface irrigated using lines of drip tape buried about 12″ deep and spaced 60″ apart. Sub-surface irrigation is far more efficient than overhead sprinkler irrigation, as it greatly reduces water loss to evaporation. As you can imagine, repairing broken drip tape is a difficult, muddy affair. So how does my cousin knife anhydrous ammonia into the soil to provide nitrogen for the corn? Very carefully, and using RTK guidance (next paragraph) to stay within 1-2 cm of the intended path, to avoid cutting the drip lines.

GPS readings can drift as atmospheric conditions change. So, for example, after taking a lunch break you might find your autosteer-guided tractor a foot or two off of the line it was following an hour earlier. My Dad says this is commonplace, and there can be larger variance over larger time scales. Additionally, it is expected that a GPS will drop out or give erratic readings when signals reflect and when satellites are occluded by hills or trees. So how do we get centimeter-level accuracy in a GPS-based system? First, it is augmented with an inertial measurement unit: an integrated compass, accelerometer, and gyroscope. I imagine there’s some interesting Kalman filtering or similar going on to fuse the IMU readings with the GPS, but I don’t know too much about this aspect. Second, information about the location of the GPS antenna on the tractor is needed, especially the height at which it is mounted, which comes into play when the tractor tilts, for example due to driving over a terrace. Third, real-time kinematic uses a fixed base station to get very precise localization along a single degree of freedom. Often, this base station is located at the local Coop and farmers pay for a subscription. This web page mentions pricing: “Sloan Implement charges $1000 for a 1 year subscription to their RTK network per radio. If you have multiple radios on the farm, then it is $2500 for all of the radios on a single farm.”
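
For the curious, the scalar version of the Kalman update is easy to write down; here’s a toy sketch in Python (not how any real autosteer unit is implemented, and the numbers are invented):

def kalman_update(estimate, variance, measurement, meas_variance):
    # Blend the current estimate with a new measurement, weighting each
    # by how uncertain it is.
    gain = variance / (variance + meas_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1 - gain) * variance
    return new_estimate, new_variance

# Fuse a confident dead-reckoned position (from the IMU) with a noisy
# GPS reading; the estimate barely moves toward the GPS.
position, var = 10.00, 0.01   # meters along the row
position, var = kalman_update(position, var, 10.60, 0.25)
print(position, var)          # about 10.02, 0.0096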

A farm’s income depends entirely on a successful harvest. Harvesting is often done during a rainy time of year, so fields can be too wet to harvest; if a storm knocks the crops down in the meantime, yields will be greatly reduced. Thus, as soon as conditions are right, it is imperative to get the harvest done as quickly as possible. In practice this means maximizing the utilization of the combine harvester, which isn’t being utilized while it is parked next to a grain wagon to unload. It is becoming possible to have a tractor with a grain cart autonomously pull up alongside a working combine, allowing the combine to unload on the go, without requiring a second driver.

Conclusions

The population of the world is increasing while the amount of farmland is decreasing. Precision agriculture is one of the things making it possible to keep feeding the human race at an acceptable cost. I felt that this piece needed to be written up because awareness of this material seemed low among the computer science and computer engineering professionals I talk to.

Zion NP and Environs in Winter

As we enter faculty and grad recruiting season, I’d like to present a bit of Utah propaganda. No heroics are required to see this stuff: just a few hours driving from Salt Lake City (on pavement) and some mild day hiking. I’ll provide detailed instructions for visiting any of these locations upon request.