Planning for Disaster

Alan Perlis once said:

I think that it’s extraordinarily important that we in computer science keep fun in computing. When it started out, it was an awful lot of fun. Of course, the paying customers got shafted every now and then, and after a while we began to take their complaints seriously. We began to feel as if we really were responsible for the successful, error-free perfect use of these machines. I don’t think we are.

This is a nice sentiment, perhaps even a defensible one if we interpret it as narrowly talking about academic computer science. On the other hand, probably within 20 years, there’s going to be a major disaster accompanied by the loss of many lives that is caused more or less directly by software defects. I don’t know what exactly will happen, but something related to transportation or SCADA seems likely. At that point we can expect things to hit the fan. I’m not optimistic that, as a group, computer scientists and computing professionals can prevent this disaster from happening: the economic forces driving automation and system integration are too strong. But of course we should try. We also need to think about what we’re going to do if, all of a sudden, a lot of people expect us to start producing computer systems that actually work, and perhaps hold us accountable when we fail to do so.

Obviously I don’t have the answers but here are a few thoughts.

  • We know that it is possible to create safety-critical software that (largely) works. Generally this happens when the organizations creating the software are motivated not only by market forces, but also by significant regulatory pressure. Markets (and the humans that make them up) are not very good at analyzing low-probability risks. A huge amount of critical software is created by organizations that are subject to very little regulatory pressure.
  • It is difficult to tell people that something is a bad idea when they very much want it to be a good idea. We should get used to doing this, following Parnas’s courageous example.
  • It is difficult to tell people that something is going to be slow and expensive to create, when they very much want it to be quick and cheap. We need to get used to saying that as well.
  • We can and should take responsibility for our work. I was encouraged by the field’s generally very positive response to The Moral Character of Cryptographic Work. Computer scientists and computing professionals almost always think that their particular technology makes the world better or is at worst neutral — but that is clearly not always the case. Some of this could be taught.
  • We need to be educating CS students in methods for creating software that works: testing, specification, code review, debugging, and formal methods. You’d think this is obvious but students routinely get a CS degree without having done any sort of serious software testing.

Finally, let’s keep in mind that causes are tricky. A major human disaster usually has a complex network of causes, perhaps simply because any major disaster with a single cause would indicate that the system had been designed very poorly. Atomic Accidents makes it clear that most nuclear accidents have been the result of an anti-serendipitous collection of poor system design, unhappy evolution and maintenance, and failure-enhancing responses by humans.

Acknowledgments: Pascal Cuoq and Kelly Heller gave some great feedback on drafts of this piece.

Is the Browser the New OS?

Yes, this is an old question. I still think it’s interesting. Disclaimer: I haven’t tried out a Chromebook yet.

First, let’s look at the situation as of late 2012. The applications I use generally fall into three categories:

  1. Web-based.
  2. Native, but easily available on Windows, Mac, and Linux. These include a file browser, a shell, Emacs, and LaTeX.
  3. Native and less portable: Photoshop, development tools such as compilers and debuggers, high-performance games.

A quick look at the Chrome store indicates that most or all of the applications in category #2 are already available in browser versions (though I currently use the native versions out of inertia). Category #3 is trickier; it’s not clear that it really makes sense to port, for example, Photoshop into the browser. On the other hand, category #3 is basically power tools and many people (including me, much of the time) can get by using weaker web-based versions. Even embedded software development tools, which might at first seem to be the antithesis of web applications, have web-based versions. In summary, to a first approximation, “the browser is the new OS” has mostly happened already, though we do see an interesting reverse trend in the mobile space (although quite a few of those native apps are thin wrappers around a browser).

Second, how did we get to this point? The first application I ever saw that used a browser as its primary interface was SATAN in 1995. I remember thinking this was an elegant and striking design decision; it permitted the implementors to focus on their application logic instead of wasting time on what would likely have ended up being the kind of crappy X windows GUI that was standard at the time. Not long after 1995 we saw the rise of Java-enabled browsers making everyone (for a short time, at least) get interested in the potential for platform-independent applications delivered over the web. But this potential remained unrealized. Around 2008 it became clear that the following three efforts, taken together, constituted a serious attempt to create a new platform. First, Google Chrome, with a fast JavaScript engine and significantly increased robustness (and perhaps also security) due to the process-per-tab model. Second, Google Gears, which facilitated disconnected operation for web applications. And third, Google Docs, which implements at least 80% of the useful stuff in MS Office. Of course Gears is gone but similar functionality is available elsewhere. More recently, Chrome OS and the Chromebook make it clear what Google wants to see. I’m painting a sort of Google-centric picture here and in fact I use their stuff a lot. However, most of the time I could get by using Firefox, Bing, and other alternative technologies.

Finally, what is the prognosis for browser-is-the-OS, looking forward? What will it take for people to really not care about which OS they’re running? First, we want to minimize the number of applications in category #3, but realistically I think most casual users don’t have that many apps in this category and we power users are willing to compromise. For example, the special-purpose tools I use for research are probably always going to run outside the browser and that is fine—they’ll run on workstation-class machines at the office and in the cloud. Photoshop and high-quality video processing are not likely to be moving into the cloud real soon, but then again these are special-purpose, workstation-grade applications. Most people already do their photo editing online.

The second thing that I require is that any web application that looks remotely like an editor (whether for programs, documents, presentations, web pages, or whatever) has to support rock-solid disconnected operation and have a sensible conflict resolution mechanism. Google Docs’ disconnected operation has been pretty good since last summer, but I worry that lots of applications are going to need special-purpose logic to handle this nicely, and it’s going to be a long slog since many developers won’t consider disconnected operation to be a priority.

Third, we need near-native performance in the browser. Plenty of JavaScript work has been done and it’s getting pretty fast. For more compute-intensive workloads Google Native Client seems like a good option.

In summary, we seem to be headed towards a segregated world where most people don’t need a PC, but rather an appliance that runs a browser. On the other hand, people who use computers more seriously will keep using workstation-style machines that run native apps when considerations such as performance, dealing with large amounts of data, security, reliability, and local control are paramount. I’ll go ahead and predict that in 2022, the number of PCs sold will be 25% of what it was in 2012. By PC I mean something like “non-Android, non-iOS machine primarily intended to be used by a single person to run native applications”—a category that includes all laptops except Chromebooks.

Core Question

[This post is about machines used by people. I realize things are different in the server room.]

We had one core per socket for a long time. When multi-cores came along, dual core seemed pretty awkward: real concurrency was possible, but with speedup bounded above by two, there wasn’t much point doing anything trickier than “make -j2”. Except in low-end machines, two cores seem to have been a passing phase. Now, several years later, it is possible to buy desktop processors with six or eight cores, but they do not seem to be very common or popular; four cores, by contrast, are everywhere, and even some inexpensive tablets are quad core now. I will definitely spend some time working for a 4x speedup, so stalling there may not be such a shame. But are we just pausing at four cores for another year or two, or is this going to be a stable sweet spot? If we are stuck at four, there should be a reason. A few random guesses:

  • Desktop workloads seldom benefit much from more than four cores.
  • Going past four cores puts too much of a squeeze on the number of transistors available for cache memory.
  • Above four cores, DRAM becomes a significant bottleneck.
  • Above four cores, operating systems run into scalability problems.

None of these limitations is fundamental, so perhaps in a few years four cores will be low-end and most workstations will be 16 or 32?
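
To put a rough number on the first guess above (the figures here are illustrative, not measurements): Amdahl’s law says that a workload with parallel fraction p running on n cores gets a best-case speedup of

$$ S(n) = \frac{1}{(1 - p) + p/n} $$

If a desktop workload is, say, 80% parallelizable, then S(4) ≈ 2.5 while S(8) ≈ 3.3, so doubling the core count past four buys only about another third. That alone could make four cores a sweet spot even before cache, DRAM, and OS scalability enter the picture.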

Cyber War

I recently read Richard Clarke’s Cyber War. Although I didn’t learn anything new on the technical side, that isn’t the focus of the book. Clarke’s main agenda is to build awareness of the uniquely vulnerable position that the United States finds itself in, and to propose national policies that might lead to a more favorable position for the USA and a more stable situation for everyone. Although I know next to nothing about Clarke, over the course of the book I learned to admire his blunt opinions and the broad perspective he has developed as a long-time Washington insider. This book is a couple of years old, and therefore misses out on recent developments such as Stuxnet. Even so, I’m not aware of a better high-level introduction to the policy issues. It’s worth reading as a way to understand some of the broader implications of computer (in)security.

Discovering New Instructions

Sometimes I wonder what instruction sets are supposed to look like. That is, what instructions would there be if computers were redesigned by smart people who understood our fabrication capabilities and who knew what we wanted to accomplish using computers, but who didn’t care about backwards compatibility and who had never seen our architectures? We can get little glimpses into that world by looking at network processors, DSPs, GPUs, ISA extensions like SSE4 and NEON, extensible processors like Tensilica’s, and others. But still, these are too rooted in the past.

Although the machine code emitted by our compilers is inextricably tied to our processors, perhaps this code can still be leveraged to discover new instructions. As a thought experiment, let’s start with a collection of executables whose performance we care about. Preferably, some of these will have been compiled from programs in Haskell, OCaml, and other languages that are not well-represented in today’s benchmark suites. We’ll run these programs in a heavily instrumented execution environment that creates a dynamic dataflow graph for the computation; the excellent Redux paper shows how this can be done. Next, we’ll need to clean up the dataflow graphs. First, we rewrite processor-specific operations (condition code dependencies, CISC instructions, etc.) into a simpler, portable form. Next, we optimize away as much dead, redundant, and vacuous code as possible, including, hopefully, all irrelevancies such as stack frame manipulation, dynamic linking, and garbage collection. The result — perhaps — will be something beautiful: the essence of the original computation, stripped of all sequential constraints, processor-specific operations, and bookkeeping. Of course this dataflow graph has some limitations. First, it only encodes the meaning of a single execution of the program. Second, it encodes a lot of incidental facts about the computation such as the bitwidths of all integer operations, the specific hashing methods used, etc. We’ll just have to live with these problems. The Redux paper contains a great example where factorial codes written in C, in Haskell, and in a stack machine are shown to all produce basically the same dataflow graph.
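
To make the thought experiment a bit more concrete, here is a toy, source-level sketch (in Python, with names of my own invention) of how a dynamic dataflow graph can be recorded: a traced value class turns every arithmetic operation into a graph node, so running factorial on traced values yields the dataflow graph of that one execution. Redux does the serious, binary-level version of this; the sketch only conveys the flavor.

```python
# Toy dynamic dataflow graph builder: every operation on a traced value
# becomes a node recording its operator, operands, and concrete result.
# Control flow is resolved concretely, so only the data flow of this one
# execution is captured (the single-execution limitation noted above).

class Node:
    counter = 0

    def __init__(self, op, args, value):
        Node.counter += 1
        self.id = Node.counter
        self.op = op            # e.g. 'const', 'mul', 'sub'
        self.args = args        # predecessor Nodes
        self.value = value      # concrete value observed in this run

    def __mul__(self, other):
        return Node('mul', [self, other], self.value * other.value)

    def __sub__(self, other):
        return Node('sub', [self, other], self.value - other.value)

def const(v):
    return Node('const', [], v)

def factorial(n):
    acc = const(1)
    while n.value > 1:          # loop condition uses the concrete value
        acc = acc * n
        n = n - const(1)
    return acc

def dump(node, seen=None):
    seen = set() if seen is None else seen
    if node.id in seen:
        return
    seen.add(node.id)
    for a in node.args:
        dump(a, seen)
    print(node.id, node.op, [a.id for a in node.args], '=', node.value)

dump(factorial(const(5)))       # prints the dataflow graph of 5! = 120
```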

So now we have a collection of giant dataflow graphs: one for each execution of each program that we’re interested in. Our goal is to design an instruction set that can compute these dataflow graphs. Trivially, this can be done by partitioning the graphs into very small units of computation corresponding to a RISC instruction set. But that ISA is boring and won’t show any performance wins. To do better we’ll use a search-based strategy to find subgraphs that:

  • occur a large number of times across the collection of dataflow graphs — these are operations that are executed frequently by our workloads
  • contain a lot of parallelism — making them good candidates for hardware acceleration
  • contain a lot of internal symmetry — supporting SIMD-style execution
  • have a small number of dynamic inputs and outputs
  • rely on a small number of constants
  • do not contain dependency chains that are too long — we don’t want to create instructions that are too slow

I think this can be done; none of these properties seems particularly difficult to test for. The major problem necessitating cleverness will be the huge size of the dataflow graphs. We’ll end up with a list of candidate instructions ranked by some fitness function, such as performance or code density. We can build as many of these into our new ISA as we have hardware budget for.
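
As a sketch of the ranking step, here is one possible fitness function over candidate subgraphs. It assumes the mining step has already produced candidates annotated with the properties listed above; the fields, weights, and thresholds are all invented for illustration rather than taken from any real system.

```python
# Hedged sketch: score candidate subgraphs as instruction proposals.
from dataclasses import dataclass

@dataclass
class Candidate:
    occurrences: int     # how often the subgraph appears across all graphs
    nodes: int           # number of primitive operations in the subgraph
    critical_path: int   # longest dependency chain, in primitive operations
    inputs: int          # dynamic inputs
    outputs: int         # dynamic outputs
    constants: int       # embedded constants

def fitness(c, max_latency=4, max_ports=3):
    # Reject candidates that would be too slow or impossible to encode.
    if c.critical_path > max_latency or c.inputs + c.outputs > max_ports:
        return 0.0
    parallelism = c.nodes / c.critical_path   # operations per dependency level
    # Reward work saved (frequency times size) and internal parallelism;
    # lightly penalize constants, which bloat the instruction encoding.
    return c.occurrences * c.nodes * parallelism / (1 + c.constants)

candidates = [
    Candidate(occurrences=90000, nodes=6, critical_path=2,
              inputs=2, outputs=1, constants=0),
    Candidate(occurrences=400000, nodes=3, critical_path=3,
              inputs=2, outputs=1, constants=1),
]
for c in sorted(candidates, key=fitness, reverse=True):
    print(round(fitness(c)), c)
```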

Would this method discover saturating arithmetic instructions when applied to signal processing codes? Would it find clever ways to optimize bounds checks and exception handling in high-level programming languages? It’s possible (though I’d be super disappointed) that the new machines are just incidental variations on existing RISC and CISC designs. If this happened, I would suspect that we had failed to abstract away a sufficient number of processor artifacts. Or, perhaps it was a mistake to compile our computations to an existing machine architecture before building the dataflow graphs. Rather, perhaps we should start with a source-level program and its operational semantics, unrolling it into a dataflow graph without any compilation to machine code. This avoids ties to our existing processors, but also risks coming up with graphs that are very hard to map back onto actual hardware. Of course, many languages don’t even have a real semantics, but researchers are diligently working on that sort of thing. An even more difficult option would be to build up the functional representation of a source program (or executable) without running it, but this has the disadvantage of losing the “profile data” that is built into a dynamic dataflow graph — we’d need to add that in separately.

An aspect of this exercise that I find interesting is that it gives insight into what our processors really do. Many years ago (I don’t have a reference handy, unfortunately) a study showed that computers spend most of their time in sorting algorithms. That cannot be true anymore — but what does the average mobile phone do? What about the average supercomputer chip? The average machine in Google or Amazon’s cloud? Of course we know the answers at a high level, but are there surprises lurking inside the computations? I would expect so — it’s tough to take a close look at a complicated system and not learn something new. Are there new instructions out there, waiting to be discovered, that can help these workloads execute more efficiently? I have to think so, at least for workloads that are not well-represented in Intel’s, AMD’s, and ARM’s benchmark suites.

Can Simplicity Scale?

Software has gotten really big, with many systems — even, apparently, cars — running into the hundreds of millions of lines of code. The drawbacks of code bases this large are numerous: they are hard to understand, hard to modify, hard to test, and virtually guaranteed to contain huge numbers of bugs. My understanding is that up to this point, we have survived principally by burying mountains of code under abstraction layers, hiding much of the complexity from the next layer in the software stack. This is fine but it only gets us so far: large codes tend to export quirky, unclean abstractions and of course bugs cause even more abstraction leaks. We should be exploring alternative approaches that might be able to radically reduce the size of our code bases.

This annual report describes an ongoing effort to implement Frank, a full software stack for an interactive desktop machine, using less than 20,000 lines of code. This is a great project and I love the fact that they’ve made an annual report — typically an internal-only document required by a funding agency — publicly available. Another nice aspect of this project is that creating an innovative GUI is explicitly not a goal; they simply want to mimic existing systems using 1000 times fewer lines of code.

The technical approach, in a nutshell, is to design a number of domain-specific programming languages, each suited to concisely expressing some part of the system. For example, consider this text from the report which describes the core of Frank’s windowing system, which is written in the Nile language:

The several dozen standard compositing rules, shading, stroking, gradients, sampling, shadowing, etc.—457 lines in total—were written in Nile and debugged to make a working graphics system, strong enough to be used for Frank, and to do all the graphics required for personal computing (and hence this report).

The Nile code can be found here. What is really being talked about — though the report doesn’t use the term — is executable specifications. Specifications describe how a system should operate; they are written in English, mathematics, pseudocode, etc. and in general they can’t be executed efficiently (or at all). On the other hand, if the domain and the specification language are constrained, it should be possible to create executable specifications. Even in cases where the simple version of the specification cannot be executed efficiently, this approach supports the separation of optimizations from the core specification, making it easier to reason about their correctness.
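
To give the flavor of an executable specification (this is plain Python in the spirit of the idea, not actual Nile code), the standard Porter-Duff “over” compositing rule can be written essentially as the math that defines it and then run directly:

```python
# Porter-Duff "over" on premultiplied RGBA pixels with components in [0, 1].
# The body is just the defining equation: result = src + dst * (1 - src_alpha).

def over(src, dst):
    sa = src[3]
    return tuple(s + d * (1.0 - sa) for s, d in zip(src, dst))

red_half  = (0.5, 0.0, 0.0, 0.5)   # 50%-opaque red, premultiplied
blue_full = (0.0, 0.0, 1.0, 1.0)   # opaque blue
print(over(red_half, blue_full))   # (0.5, 0.0, 0.5, 1.0)
```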

Although I love this research project, at some point we have to ask ourselves if the basic goal — a 1000x reduction in lines of code, or a full system in 20 KLOC — will survive contact with the enemy. To play devil’s advocate for a moment, I suspect that a minimal OS + GUI + applications could be written in 20,000 lines of code even in C++ if the design was careful, the feature set limited, and the implementers talented. To continue with that line of thought: How much of Frank’s elegance will survive performance tuning, porting to many platforms, feature creep, internationalization and related hacks, mediocre developers, security hardening, and all of the other things that happen to software as it is pushed into production and then used by real people to accomplish complex tasks?

My guess is that a significant fraction of Frank’s gains go away in the face of real-world engineering concerns. However, even if we are left with a 10x or 100x reduction in code size, instead of the original 1000x, the exercise is worthwhile. The thing that worries me most about pushing a system like Frank into the real world is that the hidden costs are larger than we might hope. Let’s make an analogy with microkernels. Speaking abstractly, there is much to love about, for example, a 10 KLOC kernel that has been proved correct. However, people whose judgement I trust have said that (1) debugging code running on a microkernel can be extremely hard and (2) creating high performance code requires very talented developers. The first problem stems mainly from concurrency, the second from the fact that in a microkernel-based system, an arbitrary piece of data is no longer a single pointer indirection away. It seems that a Frank-like system is likely to have similar problems. Creating a clean, coherent, and efficient system using a big pile of little languages, some of them highly mathematical, probably requires serious talent. In contrast, if I’m a medium-grade developer and you give me a big, crappy mountain of C++, I can just start hacking and probably I’ll eventually get the desired effect — even if it is fragile and impossible to maintain. This is a classic worse-is-better situation — which has nothing to do with “worse” or “better”, but rather is about the conflict between two value systems: one which prizes cleanliness and clarity, the other which values pragmatic code that works efficiently in practice and can be easily evolved to solve new problems.

In summary, I desperately hope that we won’t be seeing, in a few years, 50 billion lines of code on the desktop and in the automobile. This future seems all too plausible and we should be actively working to avoid it. Approaches such as Frank are promising. The Racket project has a significant amount of overlap with Frank (though its top-level goals are different, or at least differently stated); see for example the recent CACM article by my colleague Matthew Flatt.

Towards Tinkers

The heroes of Vernor Vinge’s The Peace War are members of a scattered society of tinkers who — without any real industrial base — manage to develop and produce very high-tech devices including fast, small computers. I’m trying to figure out how realistic this is.

The software side seems entirely feasible. Today’s open source community has shown that small groups of talented, motivated people can create enormously sophisticated artifacts.

On the other hand, hardware is a problem. Vinge’s tinkers cannot exist in today’s world where fast computers are made only in billion-dollar fabs. Perhaps it will become possible to grow sophisticated chips using hacked bacteria, or maybe new fabrication technologies such as 3D printing will evolve to the point where they can produce realistic large-scale integrated circuits.

An end point for 3D printing technology is The Diamond Age’s matter compiler, which can produce almost anything based on a plan and a “feed” of energy and atoms. But of course in that book, the feed is a centrally controlled resource. The “seed” technology from the end of The Diamond Age is tinker-friendly, but very far-fetched for now.

The recent near-doubling of hard disk prices due to flooding in Thailand shows how fragile our supply chains are. It’s nice to think that alternate versions of some high-tech products could be produced by small, autonomous groups of people.

Online University

Yesterday someone in my department’s main office got a request from a student to receive credit for taking the now-infamous free online AI course from Stanford. It is routine for a university to award transfer credit for a course taken at a different school, but this case is trickier since a student taking the AI course isn’t enrolled at Stanford and doesn’t get credit there. This post — which will be disorganized because my thinking on this subject is not yet organized — looks at what the Stanford course, the Khan academy, MIT’s Open Courseware initiative, and related efforts might mean for the future of college education.

Will there be a single, best course filling every niche, and everyone just takes that course? The analogy I’d like to make is between online courses and textbooks. Most subject areas are not dominated by a single textbook, but there is usually a small collection of textbooks that, together, are used as a basis for probably 80% of the courses. Personally I’d much rather learn from a textbook than from an online course — listening to someone talk is exceptionally inefficient. Why didn’t mass-market textbooks wipe out universities sometime during the 20th century? Because, of course, taking a class adds value beyond what can be found in the book. This value takes many forms:

  • A course comes as part of a broader “college experience” that many people want.
  • A course is part of an eventual degree that serves as a kind of certification.
  • Instructors are available to provide additional help.
  • Putting classmates in close proximity creates a sense of community and at least in some cases promotes learning.
  • A course is often part of a curriculum that has been designed in an integrated way.
  • A course serves as a forcing function, making it more difficult to put off learning the material.

I think we have to accept the conclusion that universities as we understand them today will be destroyed more or less to the extent that these sources of value can be provided by online education.

Let’s look at a couple of extremes. First, a course like Calculus I — a big lecture course at most universities. The experience of trying to learn integration while sitting in a big, crowded lecture is so bad that watching the lecture online almost seems attractive. It’s not hard to imagine these courses going away over the next 20 years. There seem to be various possibilities for how this will play out. First, a big university could offer an online version of Calc I, but this is very inefficient because only a few hundred or thousand people take it each year. Rather, courses like this will be handled by a few large organizations (companies or forward-thinking universities) and most institutions will simply contract out Calc I by giving some fraction of the tuition to the course provider. Course providers will make money by scaling — first through outsourcing and increasingly through AI-based techniques for assisting and assessing students. My fear is that these online courses will suck very badly, in the same way that so many web applications suck today. However, realistically, the not-online Calc I course I took 20 years ago sucked too: lectures were boring and recitation was at 7:30am with a TA I truly could not understand.

At the other extreme from big service courses, we have things like the “Senior Software Project” course offered by many departments or the Android Projects class that I’m teaching now. These have a large amount of instructor involvement, a large amount of in-person interaction between students, grading cannot easily be automated, etc. I don’t want to say that online versions of these classes are impossible, but certainly they would have a very different character than the current versions. These courses represent the part of a college education that directly descends from the old apprenticeship system and it would — in the long run — be a big problem if this part of the college experience went away. Of course the most serious students would still apprentice themselves through hackerspaces, internships, and such — but people in for example the 50th through 80th percentiles would likely be left poorly prepared for their future careers by an online-only education.

The picture I am painting is that at least in the near term, universities and traditional college degrees survive, but some of their bread and butter gets eaten by highly scalable providers of low-level courses. There will be fierce competition among providers — similar to the current competition between textbook providers, but the stakes will be higher. As we move forward, some fraction of students — in particular, non-traditional students and those who otherwise don’t want the traditional college experience — will move towards online-only degree programs. At first these will provide an inferior education and therefore they will be sought out by students who just cannot make regular classes work, or who are primarily interested in a degree for its own sake. Perhaps, as time passes, telepresence and related technologies will manage to become solid enough that a real education can be attained online.

A Fire Upon The Deep — Retrospective and E-book

Over the last few weeks I re-read A Fire Upon The Deep, surely one of the top five works of computer science fiction. The proximate reason for the re-read was the upcoming release of a sequel, Children of the Sky, which I am impatiently awaiting.

I read the “special edition” which contains about 1500 of the author’s annotations. This was a bit of a mixed bag. First, there are so many notes that it became tiresome to flip (electronically) back and forth between them and the main text. Second, a large fraction of the annotations are irrelevant because they apply to obsolete drafts, they contain formatting information for the publisher, or they are just too terse or arcane to figure out. Since Vinge went through the trouble of coding his annotations so they would be greppable, it would have been great if the special edition had exploited these tags — for example, by giving me the option of ignoring notes about typesetting. Around 10% of the annotations contain interesting material such as background information or deleted text; these show that Vinge worked very hard to make the story consistent and physically plausible, and that he was already thinking about the sequel 20 years ago.

One of the fun conceits in A Fire is that galaxy-scale bandwidth and latency constraints force most communication to look and feel just like Usenet did around 1990. While some of Vinge’s Usenet-centric humor (in particular the netnews headers) has become stale, much of it works if translated into terms of today’s message boards. In particular, the “net of a million lies” aspect is a lot more relevant today than it was a few decades ago.

Vinge is probably best known for coining the term technological singularity based on the idea that as a society’s technical prowess increases, progress becomes ever-faster until so much change occurs in such a short time that meaningful predictions about the end state are impossible. This notion does not play a role in A Fire. I’d argue that Vinge’s vision of the future of computer security would be a more appropriate lasting legacy than his thoughts about the singularity. This thread is present in most of his work, but in A Fire it is played up at a very large scale, with entire civilizations being wiped out or worse due to inappropriate network security measures. I shouldn’t need to add that we’re a lot closer to this situation now than we were when the book was written. This sequence stuck in my head even the first time I read the book:

The new Power had no weapons on the ground, nothing but a comm laser. That could not even melt steel at the frigate’s range. No matter, the laser was aimed, tuned civilly on the retreating warship’s receiver. No acknowledgment. The humans knew what communication would bring. The laser light flickered here and there across the hull, lighting smoothness and inactive sensors, sliding across the ship’s ultradrive spines. Searching, probing. The Power had never bothered to sabotage the external hull, but that was no problem. Even this crude machine had thousands of robot sensors scattered across its surface, reporting status and danger, driving utility programs. Most were shut down now, the ship fleeing nearly blind. They thought by not looking that they could be safe.

One more second and the frigate would attain interstellar safety.

The laser flickered on a failure sensor, a sensor that reported critical changes in one of the ultradrive spines. Its interrupts could not be ignored if the star jump were to succeed. Interrupt honored. Interrupt handler running, looking out, receiving more light from the laser far below…. a backdoor into the ship’s code, installed when the newborn had subverted the humans’ groundside equipment….

…. and the Power was aboard, with milliseconds to spare. Its agents — not even human equivalent on this primitive hardware — raced through the ship’s automation, shutting down, aborting. There would be no jump. Cameras in the ship’s bridge showed widening of eyes, the beginning of a scream. The humans knew, to the extent that horror can live in a fraction of a second.

There would be no jump. Yet the ultradrive was already committed. There would be a jump attempt, without automatic control a doomed one. Less than five milliseconds till the jump discharge, a mechanical cascade that no software could finesse. The newborn’s agents flitted everywhere across the ship’s computers, futilely attempting a shutdown. Nearly a light-second away, under the gray rubble at the High Lab, the Power could only watch. So. The frigate would be destroyed.

Apparently all bets are off when Satan is putting back doors in your critical control code.

Aside from the self-imposed problem of looking at every annotation, reading A Fire Upon the Deep on an iPad was a fairly friendly experience. Resolution and contrast were adequate. I liked how easy it was to listen to music while reading. I probably wouldn’t have been able to read in full sunlight, but then again I dislike reading regular books in full sunlight. The iPad is quite a bit heavier than a paperback and it has an annoying way of deciding to change the page orientation when I didn’t want that to happen. I’d consider reading another e-book but am not in a hurry.

One reason I read SF is that it helps me learn to think about alternate and future scenarios. This requires the author not only to have good ideas, but to be able to follow through with a world built on the consequences of those ideas. In terms of his ability to do these things, Vinge is one of the best modern SF authors. Only 50 days until Children of the Sky comes out…

Does a Simulation Really Need to Be Run?

At some point we’ll be able to run a computer simulation that contains self-aware entities. In this piece I’m not going to worry about little details such as how to tell if a simulated entity is self-aware or whether it’s even possible to run such a simulation. The goal, rather, is to look into some philosophical problems posed by simulations.

A computer simulation takes a model in some initial configuration and evolves the state of the model through repeated programmatic application of rules. Usually we run a simulation in order to better understand the dynamics of some process that is hard to study analytically or experimentally. The straightforward way to implement a simulator is to represent the system state in some explicit fashion, and to explicitly run the rule set on every element of the state at every time step. It seems clear that if our simulation includes a self-aware entity, this sort of execution is sufficient to breathe life into the entity. But what about other implementation options?

First, our simulator might be mechanically optimized by a compiler or similar tool that would combine rules or elide rules in certain situations. For example, if the simulation state contains a large area of empty cells, the optimizer might be able to avoid running the rule set at all in that part of the space. Can the entity being simulated “feel” or otherwise notice the compiler optimizations? No — as long as the compiler is correct, its transformations are semantics preserving: they do not affect the computation being performed. Of course a suitable definition of “do not affect” has to be formulated; typically it involves defining a set of externally visible program behaviors such as interactions with the operating system and I/O devices.

Compiler optimizations, however, are not the end of the story — algorithmic optimizations can be much more aggressive. I’ll use Conway’s Game of Life as the example. First a bit of background: Life is a two-state, rectangular cellular automaton governed by these rules (here I’m quoting from Wikipedia):

  • Any live cell with fewer than two live neighbors dies, as if caused by under-population.
  • Any live cell with two or three live neighbors lives on to the next generation.
  • Any live cell with more than three live neighbors dies, as if by overcrowding.
  • Any dead cell with exactly three live neighbors becomes a live cell, as if by reproduction.

Life has been shown to be Turing complete, so clearly self-aware entities can be encoded in a Life configuration if they can be algorithmically simulated at all. A straightforward implementation of Life maintains two copies of the Life grid, using one bit per cell; at every step the rules are applied to every cell of the old grid, with the results being placed into the new grid. At this point the old and new grids are swapped and the simulation continues.
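
Here is a minimal sketch of that straightforward implementation in Python; to keep the example short it uses a small toroidal (wrap-around) grid rather than an unbounded one.

```python
# Straightforward Life: apply the rules to every cell, writing into a new grid.

def step(grid):
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            live = sum(grid[(r + dr) % rows][(c + dc) % cols]
                       for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                       if (dr, dc) != (0, 0))
            # Survive with 2 or 3 live neighbors; birth with exactly 3.
            new[r][c] = 1 if live == 3 or (grid[r][c] and live == 2) else 0
    return new

# A glider on an 8x8 torus: after 4 steps it has moved one cell diagonally.
g = [[0] * 8 for _ in range(8)]
for r, c in [(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)]:
    g[r][c] = 1
for _ in range(4):
    g = step(g)
print(*("".join(".#"[x] for x in row) for row in g), sep="\n")
```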

Hashlife is a clever optimization that treats the Life grid as a hierarchy of quadtrees. By observing that the maximum speed of signal propagation in a Life configuration is one cell per step, it becomes possible to evolve squares of the Life grid multiple steps into the future using hash codes. Hashlife is amazing to watch: it starts out slow but as the hashtable fills up, it suddenly “explodes” into exponential progress. I recommend Golly. Hashlife is one of my ten all-time favorite algorithms.
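
Full Hashlife is too intricate to reproduce here, but its core trick, memoizing the future of a block keyed by its contents so that repeated patterns across space and time are computed only once, fits in a few lines. The toy below caches only the one-step future of the 2x2 center of a 4x4 block; real Hashlife applies the same idea recursively to quadtree nodes and jumps exponentially many steps at a time.

```python
# Toy memoization in the spirit of Hashlife (not Hashlife itself):
# map a 4x4 block to the one-step future of its 2x2 center, caching results.

cache = {}

def neighbors(block, r, c):
    return sum(block[r + dr][c + dc]
               for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0))

def center_future(block):              # block: 4x4 tuple of row-tuples of 0/1
    if block not in cache:
        rows = []
        for r in (1, 2):
            row = []
            for c in (1, 2):
                live = neighbors(block, r, c)
                row.append(1 if live == 3 or (block[r][c] and live == 2) else 0)
            rows.append(tuple(row))
        cache[block] = tuple(rows)
    return cache[block]

blinker = ((0, 0, 0, 0),
           (1, 1, 1, 0),
           (0, 0, 0, 0),
           (0, 0, 0, 0))
print(center_future(blinker))          # ((1, 0), (1, 0)): the blinker turns vertical
```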

The weird thing about Hashlife is that time progresses at different rates at different parts of the grid. In fact, two cells that are separated by distance n can be up to n time steps apart. Another interesting thing is that chunks of the grid may evolve forward by many steps without the intermediate steps being computed explicitly. Again we ask: can self-aware Life entities “feel” the optimization? It would seem that they cannot: since the rules of their universe are not being changed, their subjective experience cannot change. (The Hashlife example is from Hans Moravec’s Mind Children.)

If Hashlife is a viable technology for simulating self-aware entities, can we extend its logic to simply replace the initial simulation state with the final state, in cases where we can predict the final state? This would be possible for simulated universes that provably wind down to a known steady state, for example due to some analogue of the second law of thermodynamics. The difference between Hashlife and “single simulation step to heat death” is only a matter of degree. Does this fact invalidate Hashlife as a suitable algorithm for simulating self-aware creatures, or does it imply that we don’t actually need to run simulations? Perhaps to make a simulation “real” it is only necessary to set up its initial conditions. (Greg Egan’s Permutation City is about this idea.)

Aside: Why don’t we have processor simulators based on Hashlife-like ideas? Although ISA-level simulators do not have the kind of maximum speed of signal propagation seen in cellular automata, the programs they run do have strong spatial and temporal locality, and I bet it’s exploitable in some cases. I’ve spent long hours waiting for simulations of large collections of simple processors, daydreaming about all of the stupid redundancy that could be eliminated by a memoized algorithm.

Techniques like Hashlife only go so far; to get more speedup we’ll need parallelism. In this scenario, the simulation grid is partitioned across processors, with inter-processor communication being required at the boundaries. An unfortunate thing about a straightforward implementation of this kind of simulation is that processors execute basically in lock step: at least at the fringes of the grid, no processor can be very far ahead of its neighbors. The synchronization required to keep processors in lock step typically causes slowdown. Another ingenious simulation speedup (developed, as it happens, around the same time as Hashlife) is Time Warp, which relaxes the synchronization requirements, permitting a processor to run well ahead of its neighbors. This opens up the possibility that a processor will at some point receive a message that violates causality: it needs to be executed in the past. Clearly this is a problem. The solution is to roll back the simulation state to the time of the message and resume execution from there. If rollbacks are infrequent, overall performance may increase due to improved asynchrony. This is a form of optimistic concurrency and it can be shown to preserve the meaning of a simulation in the sense that the Time Warp implementation must always return the same final answer as the non-optimistic implementation.
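
Here is a toy sketch of the Time Warp mechanism for a single logical process: events are executed optimistically, state snapshots are saved along the way, and a straggler message with a timestamp in the past forces a rollback and re-execution. The event contents and names are invented for illustration, and real Time Warp also needs anti-messages and global virtual time computation, which this sketch omits.

```python
# Toy optimistic simulation with rollback (in the spirit of Time Warp).
import heapq

class Process:
    def __init__(self):
        self.state = 0
        self.now = 0                    # local virtual time
        self.queue = []                 # pending (timestamp, delta) messages
        self.processed = []             # executed messages, kept for re-execution
        self.snapshots = [(0, 0)]       # (time, state) history

    def deliver(self, t, delta):
        if t < self.now:                # straggler: would violate causality
            self.rollback(t)
        heapq.heappush(self.queue, (t, delta))

    def rollback(self, t):
        # Restore the latest snapshot at or before t, and requeue the
        # messages that were executed after that point.
        while self.snapshots[-1][0] > t:
            self.snapshots.pop()
        self.now, self.state = self.snapshots[-1]
        still_done = []
        for et, delta in self.processed:
            if et > self.now:
                heapq.heappush(self.queue, (et, delta))
            else:
                still_done.append((et, delta))
        self.processed = still_done

    def run(self):
        # Optimistically execute everything queued, snapshotting as we go.
        while self.queue:
            t, delta = heapq.heappop(self.queue)
            self.now = t
            self.state += delta
            self.processed.append((t, delta))
            self.snapshots.append((self.now, self.state))

p = Process()
p.deliver(10, 1)
p.deliver(20, 1)
p.run()                                 # optimistically reaches t=20, state=2
p.deliver(15, 5)                        # straggler forces a rollback to t=10
p.run()                                 # re-executes t=15 and t=20
print(p.now, p.state)                   # 20 7
```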

Time Warp places no inherent limit on the amount of speculative execution that may occur — it is possible that years of simulation time would need to be rolled back by the arrival of some long-delayed message. Now we have the possibility that a painful injury done to a simulated entity will happen two or more times due to poorly-timed rollbacks. Even weirder, an entity might be dead for some time before being brought back to life by a rollback. Depending on the content of the message from the past, it might not even die during re-execution. Do we care about this? Is speculative execution amoral? If we suppress time warping due to moral considerations, must we also turn off processor-level speculative execution? What if all of human history is a speculative mistake that will be wiped out when the giant processor in the sky rolls back to some prehistoric time in order to take an Earth-sterilizing gamma ray burst into account? Or perhaps, on the other hand, these speculative executions don’t lead to “real experiences” for the simulated entities. But why not?

The Hashlife and Time Warp ideas, pushed just a little, lead to very odd implications for the subjective experience of simulated beings. In question form:

First, what kind of simulation is required to generate self-awareness? Does a straightforward simulator with an explicit grid and explicit rule applications do it? Does Hashlife? Is self-awareness generated by taking the entire simulation from initial state to heat death in a single step? Second, what is the effect of speculative execution on self-aware entities? Do speculative birth and death, pain and love count as true experiences, or are they somehow invalid?

These questions are difficult — or silly — enough that I have to conclude that there’s something fundamentally fishy about the entire idea of simulating self aware entities. (Update: Someone told me about the Sorites paradox, which captures the problem here nicely.) Let’s look at a few possibilities for what might have gone wrong:

  1. The concept of self-awareness is ill-defined or nonsensical at a technical level.
  2. It is not possible to simulate self-aware entities because they rely on quantum effects that are beyond our abilities to simulate.
  3. It is not possible to simulate self-aware entities because self-awareness comes from souls that are handed out by God.
  4. We lack programming languages and compilers that consider self-awareness to be a legitimate side-effect that must not be optimized away.
  5. With respect to a simulation large enough to contain self-aware entities, effects due to state hashing and speculation are microscopic — much like quantum effects are to us — and their effects are necessarily insignificant at the macroscopic level.
  6. All mathematically describable systems already exist in a physical sense (see Max Tegmark’s The Mathematical Universe — the source of the title of this piece) and therefore the fire, as it were, has already been breathed into all possible world-describing equations. Thus, while simulations give us windows into these other worlds, they have no bearing on the subjective experiences of the entities in those worlds.

The last possibility is perhaps a bit radical but it’s the one that I prefer: first because I don’t buy any of the others, and second because it avoids the problem of figuring out why the equations governing our own universe are special — by declaring that they are not. It also makes simulation arguments moot. One interesting feature of the mathematical universe is that even the very strange universes, such as those corresponding to simulations where we inject burning bushes and whatnot, have a true physical existence. However, the physical laws governing these would seem to be seriously baroque and therefore (assuming that the multiverse discourages complexity in some appropriate fashion) these universes are fundamentally less common than those with more tractable laws.