Fun at the UNIX Terminal Part 1

This post is aimed at kids, like the 6th graders who I was recently teaching about programming in Python. It is more about having fun than about learning, but I hope that if you enjoy playing around at the UNIX terminal, you’ll eventually learn to use this kind of system for real. Keep in mind this immortal scene from Jurassic Park.

To run the commands in this post, you’ll need a UNIX machine: either Linux or Mac OS X will work. You’ll also need the ability to install software. There are two options:

  • Install precompiled binaries using a package manager, I’ll give command lines for Homebrew on OS X and for Apt on Ubuntu Linux. You’ll need administrator access to run Apt or to install Homebrew, but you do not need administrator access to install packages after Homebrew has been installed. Other versions of Linux have their own package managers and they are all pretty easy to use.
  • Build a program from source and install it in your home directory. This does not require administrator access but it’s more work and I’m not going to go into the details, though I hope to do this in a later post.

ROT13 using tr

The tr utility should be installed by default on an OS X or Ubuntu machine. It translates the characters in a string into different characters according to rules that you provide. To learn more about tr (or any other command in this post) type this command (without typing the dollar sign):

$ man tr

This will show you the UNIX system’s built-in documentation for the command.

In this and subsequent examples, I’ll show text that you should type on a line starting with a dollar sign, which is the default UNIX prompt. Text printed by the system will be on lines not starting with a dollar sign.

We’re going to use tr to encrypt some text as ROT13, which simply moves each letter forward in the alphabet by 13 places, wrapping around from Z to A if necessary. Since there are 26 letters, encrypting twice using ROT13 gives back the original text. ROT13 is fun but you would not want to use it for actual secret information since it is trivial to decrypt. It is commonly used to make it hard for people to accidentally read spoilers when discussing things like movie plot twists.

Type this:

$ echo 'Hello this is a test' | tr 'A-Za-z' 'N-ZA-Mn-za-m'
Uryyb guvf vf n grfg

Now to decrypt:

$ echo 'Uryyb guvf vf n grfg' | tr 'A-Za-z' 'N-ZA-Mn-za-m'
Hello this is a test

Just two more things before moving on to the next command.

First, the UNIX pipe operator (the “|” character in the commands above, which looks a little bit like a piece of pipe) is plumbing for UNIX commands: it “pipes” the output of one command to the input of a second command. We’ll be using it quite a bit.

Second, how exactly did we tell tr to implement ROT13? Well, the first argument, ‘A-Za-z’, gives it a set of characters to work with. Here A-Z stands for A through Z and a-z stands for a through z (computers treat the capital and lowercase versions of letters as being separate characters). So we are telling tr that it is going to translate any letter of the alphabet and leave any other character (spaces, punctuation, numbers, etc.) alone. The second argument to tr, ‘N-ZA-Mn-za-m’, specifies a mapping for the characters in the first argument. So the first character in the first argument (A) will be translated to the first character of the second argument (N), and so on. We could just as easily use tr to put some text in all uppercase or all lowercase, you might try this as an exercise.


Tragically, this command isn’t installed by default on a Mac or on an Ubuntu Linux machine. On a Mac you can install it like this:

brew install fortune

If this doesn’t work then you need to get Homebrew setup, try this page.

On Ubuntu try this:

sudo apt-get install fortune

The “sudo” command will ask you to enter your password before running the next command, apt-get, with elevated privileges, in order to install the fortune program in a system directory that you are normally not allowed to modify. This will only work if your machine has been specifically configured to allow you to run programs with elevated privileges.

In any case, if you can’t get fortune installed, don’t worry about it, just proceed to the next command.

Fortune randomly chooses from a large collection of mildly humorous quotes:

$ fortune 
I have never let my schooling interfere with my education.
		-- Mark Twain


This command is installed by default on a Mac; on Ubuntu you’ll need to type “sudo apt install gnustep-gui-runtime”.

Type this:

$ say "you just might be a genius"

Make sure you have sound turned up.

The Linux say command, for whatever reason, requires its input to be a command line argument, so we cannot use a pipe to send fortune’s output to say. So this command will not work on Linux (though it does work on OS X):

$ fortune | say

However, there’s another trick we can use: we can turn the output of one command into the command-line arguments for another command by putting the first command in parentheses and prefixing this with a dollar sign. So this will cause your computer to read a fortune aloud:

$ say $(fortune)

Another way to accomplish the same thing is to put fortune’s output into a file and then ask say to read the file aloud:

$ fortune > my_fortune.txt
$ say -f my_fortune.txt

Here the greater-than symbol “redirects” the output of fortune into a file. Redirection works like piping but the output goes to a file instead of into another program. It is super useful.

If you run “say” on both a Linux box and a Mac you will notice that the Mac’s speech synthesis routines are better.


The extremely important cowsay command uses ASCII art to show you a cow saying something. Use it like this:

$ fortune | cowsay
/ What is mind? No matter. What is \
| matter? Never mind.              |
|                                  |
\ -- Thomas Hewitt Key, 1799-1875  /
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Both Homebrew and Apt have a package called “cowsay” that you can install using the same kind of command line you’ve already been using.

Cowsay has some exciting options, such as “-d” which makes the cow appear to be dead:

$ fortune | cowsay -d
/ Laws are like sausages. It's better not \
| to see them being made.                 |
|                                         |
\ -- Otto von Bismarck                    /
        \   ^__^
         \  (xx)\_______
            (__)\       )\/\
             U  ||----w |
                ||     ||

Use “man cowsay” to learn more.


Don’t install ponysay unless you feel that cowsay is too restrictive. Also, it isn’t available as a precompiled package. You can build it from source code by first installing the “git” package using apt-get or brew and then running the following commands:

$ git clone
$ cd ponysay
$ ./ --freedom=partial --private install

This procedure puts ponysay into an odd location, but whatever. Here (assuming Linux, on a Mac you’ll need to pipe a different command’s output to ponysay) a cute pony tells us the prime factorization of a number:


Figlet (actually called FIGlet but that’s not what you type to run the command) prints text using large letters comprised of regular terminal characters. For example:

$ whoami | figlet
 _ __ ___  __ _  ___| |__  _ __ 
| '__/ _ \/ _` |/ _ \ '_ \| '__|
| | |  __/ (_| |  __/ | | | |   
|_|  \___|\__, |\___|_| |_|_|   

Figlet has lots of options for controlling the appearance of its output. For example, you can change the font:

$ echo 'hello Eddie' | figlet -f script
 _          _   _          ___                    
| |        | | | |        / (_)   |     |  o      
| |     _  | | | |  __    \__   __|   __|      _  
|/ \   |/  |/  |/  /  \_  /    /  |  /  |  |  |/  
|   |_/|__/|__/|__/\__/   \___/\_/|_/\_/|_/|_/|__/

Another command, toilet, is similar to figlet but has even more options. Install both of these programs using the same kinds of commands we’ve already been using to talk to package managers.


The UNIX “cat” program prints a file, or whatever is piped into it, to the terminal. Lolcat is similar but it prints the text in awesome colors:


The bb program doesn’t seem to be available from Homebrew, but on Ubuntu you can install it using “sudo apt-get install bb”. It is a seriously impressive ASCII art demo.


You know how lots of web sites want you to sign up using your name and address, but your parents hopefully have trained you not to reveal your identity online? Well, the rig utility can help, it creates a random identity:

$ rig
Juana Waters
647 Hamlet St
Austin, TX  78710
(512) xxx-xxxx

The zip codes and telephone area codes are even correct. For some reason rig will never generate an identity that lives in Utah.


The bc program is a calculator but unlike almost every other calculator you use, it can handle numbers of unlimited size (or, more precisely, numbers limited only by the amount of RAM in your computer) without losing precision. Try this:

$ echo '2 ^ 100' | bc

Unfortunately bc does not have a built-in factorial function but you can write one easily enough using bc’s built-in programming language. Start bc in interactive mode (this will happen by default if you don’t pipe any text into bc) by just typing “bc”. Then enter this code:

define fact(n) {
  if (n < 2) return 1;
  return n * fact(n - 1);

Now you can compute very large factorials:


While we're at it, you should figure out why the factorial of any large number contains a lot of trailing zeroes.


We've only scratched the surface, I'll share more entertaining UNIX commands in a followup post. Some of these programs I hadn't even heard of until recently, a bunch of people on Twitter gave me awesome suggestions that I've used to write this up. If you want to see a serious (but hilariously outdated) video about what you can do using the UNIX command line, check out this video in which one of the original creators of UNIX shows how to build a spell checker by piping together some simple commands.

Elk and Arch

I wanted to share a few pictures from a long, very cold snowshoe/hike I did in January. The goal was to reach a natural arch that I had previously spotted in upper City Creek Canyon. This was fun to find: I hadn’t realized there was an arch large enough to stand inside within walking distance of my house. I also saw the big herd of elk that live in this area, which doesn’t get a lot of visitation from humans other than hunters.

Do Expressive Programming Languages Always Have Undefined Behavior?

In the Hacker News comments on one of my previous posts about undefined behavior, someone said this:

AFAIK Gödel’s incompleteness theorems imply that _any_ language will have at least some undefined behaviour.

Let’s take a quick look at this statement, keeping in mind that incompleteness and undecidability can be remarkably tricky topics. Some years ago I read and enjoyed Gödel’s Theorem: An Incomplete Guide to Its Use and Abuse (and no doubt need to reread it).

First off, it is clear that there exist programming languages that are free of UB, such as one where the semantics of every program is to print “7”. Whatever it is that UB means (we have not formally defined it), it seems clear that the language that always prints “7” does not have it.

There are also useful languages that are obviously UB-free, such as an expression language that evaluates elementary functions over IEEE floats. These languages are particularly easy to reason about because they are not Turing-complete: all computations terminate and we simply have to ensure that they terminate with a defined result.

In contrast, the HN commenter may have intended to invoke Rice’s Theorem: “Any nontrivial property about the language recognized by a Turing machine is undecidable.” A consequence is that when f() is some arbitrary computation we cannot in general hope to decide whether a program like this invokes undefined behavior:

main() {
  return 1 / 0; // UB!

But this is a red herring. Rice’s Theorem only applies to non-trivial properties: “properties that apply neither to no programs nor all programs in the language.” To sidestep it, we only need to define a programming language where UB-freedom is a trivial property. This is done by ensuring that every operation that a program can perform is a total function: it is defined in all circumstances. Thus, programs in this language will either terminate in some defined state or else fail to terminate. This kind of extremely tight specification is not typically done for realistic programming languages because it is a lot of work, particularly if the language has open-ended interactions with other levels of the system, such as inline assembly. But the problem is only a practical one; there is no problem in principle. It is not too difficult to write an UB-free specification for a (Turing-complete) toy language or subset of a real language.

Now let’s return to the original HN comment: it was about the incompleteness theorems, not about the halting problem. I’m not sure what to do with that, as I don’t see that Gödel’s theorems have any direct bearing on undefined behavior in programming languages.

Mapping Mountains on Exoplanets

Snow in the mountains evolves in predictable ways that are strongly influenced by the terrain. For example, warmish snowstorms leave a characteristic snow line that looks very much like a contour on a topographic map:



On the other hand, when snow melts the effect is very different, and is often dominated by aspect. There pictures taken near my house a few days apart show more melting on south-facing slopes that receive more solar energy:

Effects like these are taken into account by a snowpack model: a computer simulation designed to predict snowpack evolution in order to, for example, make predictions about avalanches or about spring runoff. Runoff is important not only because it can cause flooding but also because areas near the mountains depend on the runoff for drinking and agricultural water. A snowpack model would typically take topographic information and snow depth information as inputs, but people have also done research into driving snowpack models using weather information; this is useful for areas where there are not a lot of snow depth measurement stations, such as the Snotel network in the western USA.

This post asks the question: could we use images of snowpacks from above to create topographic maps of exoplanets? Obviously this requires telescopes with vastly improved angular resolution, these will likely be optical interferometers in space, or perhaps on the moon. Resolution can be increased by making the telescope’s baseline longer and light-gathering can be improved using larger mirrors or by integrating over longer time periods (planets don’t evolve rapidly, I don’t see why early exoplanet images wouldn’t be aggregated from weeks or months of observations). Space telescopes involving multiple satellites are going to be expensive and there have already been some canceled missions of this type. I haven’t checked their math but this article describes the collecting area and baseline required to get images of different resolutions; unfortunately it doesn’t mention what assumptions were made about exposure times to get the images.

Assuming we can image a snowy exoplanet in sufficient detail, how can we determine its topography? We do it by making a snowpack model for the planet, probably involving a lot of educated guesses, and then invert it: instead of parameterizing the model with the topography, we instead search for the topography that is most likely to explain the observations. We could easily prototype this mapping software right now; it can be validated in two ways. First, we can take archived satellite photos of Earth, degrade them using reasonable assumptions about telescope performance and exoplanet distance, and then use them to infer the topography. Of course we have good ground truth for Earth so validation is nearly trivial. The second way to validate this mapping method is using simulated mountainous, snowy exoplanets. This would be useful to check the robustness of the methods for less-Earthlike bodies. A prototype of these ideas seems very much in reach for an ambitious PhD student.

I tried to find previous work on this topic and didn’t turn up much, I would appreciate links either in a comment here or in private email if you know of something. Let’s take a quick look at why most of the ways that we get topographic data won’t work for exoplanets:

  • The original way to make maps, getting out there and surveying the land, obviously doesn’t apply.
  • Photogrammetry requires parallax.
  • Active sensing using radar or a laser is infeasible.

On the other hand, inferring mountains’ shape from changing light and shadows over the course of a day and from day to day (if the planet’s axis is tilted) will work, and no doubt this is how we’ll first know that an exoplanet is mountainous. Snow-based techniques could be used to refine these results.

Teaching Python Informally to Kids

For the last few months I’ve been running a “coding club” for my son’s sixth-grade class. Once a week, the interested students (about 2/3 of the class) stick around for an hour after school and I help them learn to program. The structure is basically like the lab part of a programming class, without the lecture. I originally had planned to do some lecturing but it soon became clear that at the end of a full day of school (these kids get on the bus around 7:00am, ugh) there was very little attention span remaining. So instead I decided to give them a sheet of problems to work on each week and I spend the hour walking around helping whoever needs help.

One of my main goals is to avoid making anyone hate programming. There are three parts to this. First, I’m not forcing them to do anything, but rather providing suggestions and guidance. Second, there’s no assessment going on. I’ve never been too comfortable with the grading part of teaching, actually. It can really get in the way of learning. Third, I’m encouraging them to work together. I’ve long noticed (starting when I was a CS student) that most learning occurs between students in the lab. Not as much learning happens in the lecture hall.

For a curriculum, we started out with turtle graphics, which I’ve always found to be wonderful. Python’s turtle library is clunkier than Logo and also is abysmally slow, but the advantages (visual debugging, many kids enjoy graphics) outweigh the problems. Fractals (Sierpinski triangle, dragon curve) were a good way to introduce recursion. We spent three or four weeks doing turtle stuff before moving on to simple crypto, which didn’t work as well. On the other hand, the students did really well with mathy code, here’s the handout from one of the math weeks.

Some observations:

  • One of my favorite moments so far has been when a couple of students implemented rot13 functions and used it to send rude messages to each other.
  • It’s pretty important to take a snack break.
  • Although the rockstar / 10x programmer idea is out of favor, it is absolutely clear that there is a huge amount of variation (much more than 10x) in how easily a group of twenty 10-11 year olds learn programming. Some kids are teaching themselves to open files and parse text while others are stuck on basic syntax issues.
  • Syntax, which computer professionals more or less learn to overlook, is hugely important.
  • I’ve always disliked Python’s significant whitespace and watching kids struggle with it has made me hate it even more.

Anyhow, this has been a fun teaching exercise and hopefully it is benefiting the kids.

Undefined Behavior != Unsafe Programming

Undefined behavior (UB) in C and C++ is a clear and present danger to developers, especially when they are writing code that will execute near a trust boundary. A less well-known kind of undefined behavior exists in the intermediate representation (IR) for most optimizing, ahead-of-time compilers. For example, LLVM IR has undef and poison in addition to true explodes-in-your-face C-style UB. When people become aware of this, a typical reaction is: “Ugh, why? LLVM IR is just as bad as C!” This piece explains why that is not the correct reaction.

Undefined behavior is the result of a design decision: the refusal to systematically trap program errors at one particular level of a system. The responsibility for avoiding these errors is delegated to a higher level of abstraction. For example, it is obvious that a safe programming language can be compiled to machine code, and it is also obvious that the unsafety of machine code in no way compromises the high-level guarantees made by the language implementation. Swift and Rust are compiled to LLVM IR; some of their safety guarantees are enforced by dynamic checks in the emitted code, other guarantees are made through type checking and have no representation at the LLVM level. Either way, UB at the LLVM level is not a problem for, and cannot be detected by, code in the safe subsets of Swift and Rust. Even C can be used safely if some tool in the development environment ensures that it will not execute UB. The L4.verified project does exactly this.

The essence of undefined behavior is the freedom to avoid a forced coupling between error checks and unsafe operations. The checks, once decoupled, can be optimized, for example by being hoisted out of loops or eliminated outright. The remaining unsafe operations can be — in a well-designed IR — mapped onto basic processor operations with little or no overhead. As a concrete example, consider this Swift code:

func add(a : Int, b : Int)->Int {
  return (a & 0xffff) + (b & 0xffff);

Although a Swift implementation must trap on integer overflow, the compiler observes that overflow is impossible and emits this LLVM IR:

define i64 @add(i64 %a, i64 %b) {
  %0 = and i64 %a, 65535
  %1 = and i64 %b, 65535
  %2 = add nuw nsw i64 %0, %1
  ret i64 %2

Not only has the checked addition operation been lowered to an unchecked one, but also the add instruction has been marked with LLVM’s nsw and nuw attributes, indicating that both signed and unsigned overflow are undefined. In isolation these attributes provide no benefit, but they may enable additional optimizations after this function is inlined. When the Swift benchmark suite is compiled to LLVM, about one in eight addition instructions has an attribute indicating that integer overflow is undefined.

In this particular example the nsw and nuw attributes are redundant since an optimization pass could re-derive the fact that the add cannot overflow. However, in general these attributes and others like them add real value by avoiding the need for potentially expensive static analyses to rediscover known program facts. Also, some facts cannot be rediscovered later, even in principle, since information is lost at some compilation steps.

In summary, undefined behavior in programmer-visible abstractions represents an aggressive and dangerous tradeoff: it sacrifices program correctness in favor of performance and compiler simplicity. On the other hand, UB at lower levels of the system, such as machine code or a compiler IR, is an internal design choice that needn’t have any effect on the programmer-facing parts of the system. This kind of UB simply requires us to accept that safety checks can be usefully factored out of their corresponding unsafe operations to afford efficient execution.

Nine Mile Canyon

One of my boys and I spent Sunday exploring Nine Mile Canyon, in the remote Book Cliffs a few hours drive from Salt Lake City. This canyon is known for its dense collection of rock art and ruins, a lot of which can be seen from the paved road that follows the canyon bottom. This was a lot of driving for a day trip but it wasn’t too boring since I had checked out an audiobook of Fahrenheit 451 from the library.

A great pictograph panel, the figures are several feet high.

Typical scenery in the upper canyon.

A dodgy ice bridge. This one held us but I got wet to the knees on a different stream crossing.

Fremont people in a large part of Utah drew sheep this way: