Introduction to Precision Farming

[My father, David Regehr, encouraged me to write this piece, provided some of its content, edited it, and agreed to let me use data from his farm.]
[For readers outside the USA: Alas, we do not farm in metric here. In case you’re not familiar with the notation, 10″ is ten inches (25.4 cm) and 10′ is ten feet (3.05 m). An acre is 0.4 hectares.]

Agriculture and technology have been intimately connected for the last 10,000 years. Right now, information technology is changing how we grow food; this piece takes a quick look at how that works.

Measurement

If soil conditions aren’t right, crops will grow poorly. For example, alfalfa grows best in soils with a pH between 6.5 and 7.5. Soils that are too acidic can be “fixed” by applying ground limestone (CaCO3) at rates determined by formulae based on chemical analysis. The process typically begins with taking soil samples (to an appropriate depth) in a zig-zag pattern across each field, mixing the samples in a bucket, and then sending a sub-sample to a laboratory where it’s analyzed for pH, cation exchange capacity, major nutrients such as phosphorus, potassium, and sulfur, and micro nutrients such as zinc. For more details, see this document about interpreting soil test results.

Applying a uniform rate of ag (agricultural) lime to an entire field is suboptimal where there is variation in soil pH within the field. Ag lime applied where it is not needed is not only a waste of money, it can raise soil pH to a point that is detrimental to crop growth. To characterize a field more accurately, it needs to be sampled at a finer granularity. For example, GPS grid lines can be super-imposed on a field to locate points each representing an area of, say, 2.5 acres. Around each such point, ten or more soil samples would be taken along a 30’ radius, mixed, sub-sampled, and GPS-tagged. From the resulting analysis, the lime requirement, and adequacy of other nutrients essential for plant growth, of all areas in the field can be interpolated using a model.

Let’s look at an example. This image shows the farm near Riley KS that my parents bought during the 1980s. I spent many afternoons and weekends working there until I moved out of the area in 1995. It’s a quarter-section; in other words, a half-mile on a side, or 160 acres. 135.5 of the acres are farmland and the remaining 24.5 are used by a creek, waterways (planted in grass to prevent erosion), buildings, and the driveway.

This image shows the points at which the fields were sampled for soil analysis in November 2015:

Each point represents a 1.25 acre area; this is pretty fine-grained sampling, corresponding to relatively small fields with terraces and other internal variation. A big, relatively homogeneous field on the high plains might only want to be sampled every 5 or 10 acres.

Here are the soil types:

This image shows how much sulfur the soil contains:

In the past it wasn’t necessary to fertilize with sulfur due to fallout from coal-burning power plants. This is no longer the case.

Another quantity that can be measured is crop yield: how much grain (or beans or whatever) is harvested at every point in a field? A combine harvester with a yield monitor and a GPS can determine this. “Point rows,” where a harvested swath comes to a point because the field is not completely rectangular, need to be specially taken into account: they cause the grain flow to be reduced not because yield is low but rather because the full width of the combine head is not being used. Yield data can be aggregated across years to look for real trends and to assess changes in how low-yield areas are treated.

Aerial measurement with drones or aircraft can be used to look for irregularities in a field: color and reflectivity at various wavelengths can indicate problems such as weeds (including, sometimes, identification of the offending species), insect infestations, disease outbreaks, and wet or dry spots. The alternative, walking each field to look for problems, is time consuming and risks missing things.

Some of the procedures in this section (maintaining a drone, intensive grid-sampling, interpreting soil test and yield results) are time-consuming and complicated, or require expensive equipment that would be poorly utilized if owned by an individual farmer. Such jobs can be outsourced to crop consultants who may be hired on a per-acre basis during the growing season to monitor individual fields for pests and nutrient problems, irrigation scheduling, etc. During the off-season, consultants may do grid sampling, attend subject-matter updates to maintain certification, and assist growers with data interpretation and planning, etc. Many crop consultants have years of experience, and see many fields every day; the services of this sort of person can reduce risks. Here’s the professional society for crop consultants and some companies that provide these services (1, 2).

Application

“Variable-rate application” means using the results of intensive soil grid sampling to apply seed, fertilizer, herbicide, insecticide, etc. in such a way that each location in the field receives the appropriate amount of whatever is being applied. For example, fewer seeds can be planted in parts of a field that have weaker capacity to store water in the soil, reducing the likelihood of drought stress.

Variable-rate can apply to an entire implement (planter or whatever) but it can also be applied at a finer granularity: for example, turning individual spray heads on and off to prevent harmful overspray or turning individual planter rows on and off to prevent gaps or double-planting on point rows and other irregularities. Imagine trying to achieve this effect using a 12-row planter without computer support:

(Image is from this slide deck.)

Here’s the soil pH for my Dad’s farm and also the recommended amount of ag lime to apply for growing alfalfa:

For cropland on this farm, 443,000 pounds (221 US tons / 201 metric tons) of ag lime are needed to bring the soils to the target pH of 6.5, the minimum pH for good alfalfa or soybean production. Purchase, hauling, and variable-rate application of ag lime in this area would be $20-25/ton, so the cost is roughly $5,000. However, because the land is farmed with no-till practices (i.e. no deep tillage to incorporate the lime), no more than about 1 ton/acre of ag lime is applied per year, so there will be a doubling or tripling of application costs, spread over several years, to some parts of the farm. Soil conditions will change in fairly predictable ways and it should be at least five years before these fields need to be sampled again.

Of course there are limits on how precisely a product can be applied to a field. Ag lime would typically be applied using a truck that spreads a 40′ swath of lime. Even if the spreader is calibrated well, there will be some error due to the width of the swath and also some error stemming from the fact that the spreader can’t instantaneously change its application rate. There might also be error due to latency in the delivery system but this could be compensated for by having the software look a few seconds ahead.

Here’s an analogous recommendation, this time for phosphorus in order to meet a target of 60 bushels per acre of winter wheat:

Phosphorus fertilizer application is an annual cost, which can vary greatly depending on type and price of formulation used. Most cropland farmers in this part of the world would figure on $25-35/acre for purchase and variable-rate application.

And finally, here’s the zinc recommendation for growing soybeans:

As you can see, much less zinc than lime is required: less than a ton of total product across the entire farm.

Automation

Driver-assist systems for cars are primarily about safety, and driverless cars need to pay careful attention to the rules of the road while not killing anyone. Automated driving solutions for tractors and harvesters seem to have evolved entirely independently and have a different focus: following field boundaries and swaths accurately.

An early automated row-following technology didn’t do any steering, but rather provided the farmer with a light bar that indicated deviation from an intended path. This was followed by autosteer mechanisms that at first just turned the steering wheel using a servo, and in modern machines issues steering commands via the power (hydraulic) steering system. The basic systems only handle driving across a field, leaving the driver to turn around at the end of each row. To use such a system you might make a perimeter pass and then a second pass around a field; this provides room to turn around and also teaches the autosteer unit about the area to be worked. Then, you might choose one edge of the field to establish the first of many parallel lines that autosteer will follow to “work” the interior of the field. Static obstacles such as trees or rocks can be marked so the GPS unit signals the driver as they’re approached. Dynamic obstacles such as animals or people are not accounted for by current autosteer systems; it’s still up to the driver to watch out for these. Autoturn is an additional feature that automates turning the tractor around at the end of the row.

Autosteer and autoturn aren’t about allowing farmers to watch movies and nap while working a field. Rather, by offloading the tiring, attention-consuming task of following a row to within a couple of inches, the farmer can monitor the field work: Is the planter performing as expected? Has it run out of seed? Autosteer also enables new farming techniques that would otherwise be unfeasible. For example, one of my cousins has corn fields in central Kansas with 30″ row spacing, that are sub-surface irrigated using lines of drip tape that are buried about 12″ deep, spaced 60″ apart. Sub-surface irrigation is far more efficient than overhead sprinkler irrigation, as it greatly reduces water loss to evaporation. As you can imagine, repairing broken drip tape is a difficult, muddy affair. So how does my cousin knife anhydrous ammonia into the soil to provide nitrogen for the corn? Very carefully, and using RTK-guidance (next paragraph) to stay within 1-2 cm of the intended path, to avoid cutting the drip lines.

GPS readings can drift as atmospheric conditions change. So, for example, after taking a lunch break you might find your autosteer-guided tractor a foot or two off of the line it was following an hour earlier. My Dad says this is commonplace, and there can be larger variance over larger time scales. Additionally, it is expected that a GPS will drop out or give erratic readings when signals reflect and when satellites are occluded by hills or trees. So how do we get centimeter-level accuracy in a GPS-based system? First, it is augmented with an inertial measurement unit: an integrated compass, accelerometer, and gyroscope. I imagine there’s some interesting Kalman filtering or similar going on to fuse the IMU readings with the GPS, but I don’t know too much about this aspect. Second, information about the location of the GPS antenna on the tractor is needed, especially the height at which it is mounted, which comes into play when the tractor tilts, for example due to driving over a terrace. Third, real-time kinematic uses a fixed base station to get very precise localization along a single degree of freedom. Often, this base station is located at the local Coop and farmers pay for a subscription. This web page mentions pricing: “Sloan Implement charges $1000 for a 1 year subscription to their RTK network per radio. If you have multiple radios on the farm, then it is $2500 for all of the radios on a single farm.”

A farm’s income depends entirely on a successful harvest. Often, harvesting is done during a rainy time of year, so fields can be too wet to harvest and in the meantime if a storm knocks the crops down, yields will be greatly reduced. Thus, as soon as conditions are right, it is imperative to get the harvest done as quickly as possible. In practice this means maximizing the utilization of the combine harvester, which isn’t being utilized when it is parked next to a grain wagon to unload. It is becoming possible to have a tractor with a grain cart autonomously pull up alongside a working combine, allowing it to unload on-the-go, without requiring a second driver.

Conclusions

The population of the world is increasing while the amount of farmland is decreasing. Precision agriculture is one of the things making it possible to keep feeding the human race at an acceptable cost. I felt that this piece needed to be written up because awareness of this material seemed low among computer science and computer engineering professionals I talk to.

A Few Pictures

Paris was very quiet on Saturday and people on the street looked tired, having (like us) stayed up most of the night watching the news, listening to sirens, and worrying about things. Today was sunny and warm and things seemed more normal; plenty of folks out jogging, sitting in parks, other usual weekend activities.

Classic Bug Reports

A bug report is sometimes entertaining either because of the personalities involved or because of the bug itself. Here are a collection of links into public bug trackers; I learned about most of these in a recent Twitter thread.

I hope you enjoy these as much as I do. Thanks to everyone who contributed links.

Updates from comments and Reddit:

Booster Test

Ever since learning that the space shuttle booster motors were manufactured and tested at ATK in Promontory Utah — not too far from where I live — I wanted to see one of the tests. I didn’t manage to do that before the shuttle program was shut down, but today I got to see something better: a test of an SLS booster, which is about 25% more powerful than an STS booster and more than twice as powerful as one of the big F-1 engines from the Saturn V.

Here’s a close-up video. On the other hand, this one shows what the test was like from the viewing area, in particular the 8 seconds it took the noise to reach us. The sound was very impressive, with enough low-frequency power to make my clothing vibrate noticeably, but it was not anywhere close to painfully loud. The flame was, however, painfully bright to look at. The nozzle was being vectored around during the test (I hadn’t realized that the solid rockets participate in guidance) but that wasn’t easy to see from a distance.

NASA socials give some inside access to people like me (and you, if you live in the USA and want to sign up next time) who have no official connection to the space program. Yesterday we got to tour the plant where the boosters are made. It was great to learn about techniques for mixing, casting, and curing huge amounts of propellant without getting air bubbles or other imperfections into the mix and without endangering workers. The buildings in this part of ATK have escape slides from all levels and are surrounded by big earthworks to deflect potential explosions upwards. It was also really cool to see the hardware for hooking boosters to the main rocket, for vectoring nozzles, and things like that. Alas, we weren’t allowed to take pictures on the tour.

ATK’s rocket garden at sunrise:

And the main event:

Inversions in Computing

Some computer things change very slowly; for example, my newish desktop at home has a PS/2 port. Other things change rapidly: my 2010 iPad is kind of a stone-age relic now. This kind of differential progress creates some funny inversions. A couple of historical examples:

  • Apparently at one point in the 80s or 90s (this isn’t a firsthand story– I’d appreciate recollections or citations) the processor available in an Apple printer was so fast that people would offload numerical computations to their printers.
  • I spent the summer of 1997 working for Myricom. Using the then-current Pentium Pro machines, you could move data between two computers faster than you could do a local memcpy(). I’m pretty sure there was something wrong with the chipset for these processors, causing especially poor memcpy() performance, but I’ve lost the details.

What are the modern examples? A few come to mind:

Anyhow, I enjoy computing inversions since they challenge our assumptions.

What I Accomplished in Grad School

I often talk to students who are thinking about grad school. The advice I generally give is a dressed-up version of “Just do whatever the hell will make you happy.” But if we all had solid ideas about what would make us happy then, well, we’d probably be a lot more happy. Here’s a list of things that I actually accomplished in grad school. Most of these things did make me happy or at least were satisfying. Of course, I cannot know the extent to which these things would make other people happy, and I also cannot know whether I would have been happier with the things that I’d have accomplished if I hadn’t gone to grad school. Since I got a PhD 13 years ago and started the program 18.5 years ago (crap!) I have at least a modest amount of perspective at this point.

First, some work-related things.

  • I became pretty good at doing and evaluating research.

  • I started to become good at writing. When I arrived at grad school I was not a good writer. When I left, I was not good either, but at least I was on the way. Since 2001, every time I write something, I have been thankful that it’s not a PhD thesis.

  • I wrote a few pretty decent papers. None of them set the world afire, but none of them has been a source of embarrassment either.

  • I did some internships in industry and, along the way, learned a bit about how the real world works, if such a thing can be said to exist.

But really, the things in grad school that weren’t about work were better:

  • I read a lot of books, often several per week. I’m afraid that I’m going to have to get the kids out of the house and also retire if I want to reach that level again.

  • I found someone to spend the rest of my life with. This was the purest luck.

  • I made a number of friends who I am still close to, though we don’t talk nearly often enough. I doubt that I’ll ever have another group of friends as good as these.

  • I became quite good at disc golf.

  • I did a decent amount of programming for fun.

  • I avoided going into debt. In fact, the TA and RA stipends that I received in grad school felt like a lot of money compared to the ~$7000/year that I lived on as an undergrad.

There are a bunch of things that are important that I did not accomplish in grad school:

  • I failed to learn even rudimentary time management.

  • I did not develop good eating, drinking, sleeping, or exercise habits. When I graduated I was under the impression that my body could tolerate almost any sort of abuse.

  • I didn’t learn to choose good research topics, this took several more years.

  • I didn’t figure out what I wanted to do with my life.

I put this out there on the off chance that it might be useful for people who are thinking about grad school.

Automatically Entering the Grand C++ Error Explosion Competition

G++ can be comically verbose; developers sometimes like to wallpaper their cubes with choice error messages from Boost or STL programs. The Grand C++ Error Explosion Competition asks the question: how large can we make the ratio between error output and compiler input?

I’m not much of a C++ person but when the contest was announced I was doing some experiments in using C-Reduce as way to search for C++ programs that have interesting properties. Of course, we usually use C-Reduce to search for small programs, but Alex and I have been using it (and other reducers) to find, for example, programs that cause interesting parts of the compiler to execute. It only took a minute or two to setup C-Reduce so that its goal was to maximize the GCEEC’s fitness function. I started it running on four C++ files; after a few days three of the reductions didn’t show signs of terminating but the fourth one — some random part of the LLVM backend — reduced to this:

struct x0 struct A<x0(x0(x0(x0(x0(x0(x0(x0(x0(x0(_T1,x0 (_T1> <_T1*, x0(_T1*_T2> 
binary_function<_T1*, _T2, x0{ }

Somewhat surprisingly, there aren’t any templates here. When compiled using G++ 4.8.1 (I’m using the one that comes with Ubuntu 13.10 on x86-64) we get 5 MB of output. It wasn’t too hard to (1) clean up this output a bit and (2) recognize that the repeated (x0 substring is important. Thus, my entry to the GCEEC was:

struct x struct z<x(x(x(x(x(x(x(x(x(x(x(x(x(x(x(x(x(x(x(x(x(x(y,x(y><y*,x(y*w>v<y*,w,x{}

Every added (x approximately doubles the size of the error output. It was tricky to choose the right number of these substrings to include since I wanted to bump up against the timeout without pushing past it. But really, at this point the competition became a lot less interesting because we can pick a target ratio of output to input and trivially craft an input that reaches the target (assuming we don’t run into implementation limits). So the contest is basically a bandwidth contest where the question is: How many bytes can we get out of G++ on the specified platform within the 5 minute timeout? At this point the winner depends on how many cores are available, the throughput of Linux pipes, etc., which isn’t too satisfying.

I was a little bummed because I didn’t need to use a trick I had been saving up, which was to give the C++ file a name that is 255 characters long — this is useful because the name of the source file is repeated many times in the error output (and the length of the source file name is not part of the fitness function). However, it was delightful to read the other contest entries which used some nice tricks I wouldn’t have thought of.

Would it be fun to repeat this contest for Clang++ or MSVC++? Also, why is G++ so verbose? My guess is that its error reporting logic should be running (to whatever extent this is possible) much earlier in the compilation process, before templates and other things have been expanded. Also, it would probably be useful to enforce a limit on the length of any given error message printed by the compiler on the basis that nobody is interested in anything past the first 10 KB or whatever.