[Also see why take a compilers course and why take an embedded systems course.]
The other day, while having coffee with a colleague, I mentioned that I’ll be teaching OS in the fall. His area is far from computer systems, and he asked me: what’s the point of this class? What are the students supposed to get out of it? These are fair questions and I think I had reasonable answers. Here I’ll elaborate a bit; let me know if I missed anything major.
The last time I taught OS, in 2002, it was still a required course at Utah and got about 80 students. It is no longer required, having been subsumed by a great class based on the O’Hallaron and Bryant book. Even so, there are already 50 people signed up for the Fall semester offering, which seems pretty reasonable. The course is split into two sections: an advanced undergrad course and an introductory/remedial graduate student class. This is not my favorite arrangement but it’s not a big deal.
Probably I should mention that an OS course that involved a ton of Minix hacking was what got me interested in computer science. Previously, I had been sort of tolerating CS classes but hadn’t found much that was challenging or interesting.
Using a simple case analysis, we can deduce that the point of an OS course is not to teach students how to write their own OS. First, there are students interested in and capable of writing an OS; they do not need to see that material in class, since they’re perfectly capable of picking it up on their own. Second, there are students incapable of or uninterested in implementing a new OS; they do not need that material either.
So what is the point?
Concurrency
Writing concurrent code is not easy, especially using threads with shared memory and locks. However, a large number of current CS students will do this at some point in their careers. There is a growing trend toward addressing concurrency at points in the curriculum outside of an OS course, but even so, this is traditionally the place where students get their first serious introduction to threads, races, deadlocks, and all that. The material is hard (actually the material is easy, but applying it is hard) and it is useful to see it several times before graduating. A solid introduction to concurrent programming is a major benefit of taking an OS course.
Resource Virtualization and Arbitration
At the hardware level resources are typically dedicated. An OS provides versions of these resources that are either virtualized (each user gets the illusion of having a copy of the resource) or arbitrated (one user at a time, but with queuing handled by the OS). The strategies used to give multiple users access to a dedicated physical resource are fundamental and are also used in many user-level programs. By studying these issues explicitly, students learn patterns that can be reused in many other contexts.
Performance Analysis and Contention Resolution
Also known as “why the #*$ is my machine paging?” When resources are shared, contention typically follows. Contention can be resolved in many ways, for example using queuing, fair sharing, or prioritization. In some cases, such as CPU scheduling, no single technique suffices and the best known solutions are bizarre hybrids. Often, the most interesting problem is figuring out what kind of contention is the root cause of some observable problem. I once spent a good part of a summer figuring out all the ways that Windows NT could cause an MP3 to skip. An OS is a perfect context for learning these ideas, which are applicable far beyond computer science.
Interfaces and Hiding Complexity
A well-designed interface is a beautiful thing. It is even more beautiful to fully appreciate what it takes to transform a nasty, low-level interface (a modem or NE2000 card) into a usable and efficient high-level abstraction (a stream socket). While students should have been exposed to these ideas in course material about abstract data types, the examples given there are often pretty trivial, and the full power of abstraction and complexity hiding is not necessarily apparent at that level. I consider the ability to put a collection of useful abstractions like sockets, file systems, and address spaces into a single, convenient package to be one of the top 10 overall contributions of computer science. This is so commonplace that it’s easy to overlook the real awesomeness.
No Magic Here
From user mode, it’s easy to view the OS as a magical force that is both good — giving us smooth multitasking, efficient storage management, etc. — and evil — giving us blue screens, thrashing, security problems, and scheduling anomalies. This view is fine for the general public. On the other hand, if you plan to tell people you’re a computer scientist, you need to have peeked behind this curtain. What will you find there? Too often it seems like sort of a sad collection of mundane linked lists, dodgy resource heuristics, and ill-maintained device drivers. A good OS course should show students that:
- There is great code in the kernel, you just have to know where to look. When you first see it, you will not understand it. But when you understand it, your mind will expand a little bit.
- Mostly, kernel code is perfectly mundane. Anyone can write it; it just takes a bit more care and attention to detail than user-mode code, because the consequences of a bug are greater.
Dealing with Big Software
There is no doubt about it: being dropped into the middle of somebody else’s multi-million-line code base is a nightmare. Documentation is incorrect and scattered, interfaces are clunky and wide, interactions are subtle, and error messages are inscrutable. But welcome to real life, folks: we usually can’t just start over because of problems like these. As a student, if you can begin to develop a systematic approach to learning the relevant parts of a big piece of software that your code has to fit in with, your life will be a lot easier later on. You can hate the Linux kernel, but it’s far better software than much of what you’ll run into during your career.
Computer System Design
Designing any engineered system, including a software system, is an exercise in compromise. How much emphasis is placed on reliability? Performance? Cost? Maintainability? Since operating systems are large, performance-critical programs that tend to last for decades, they are a great place to learn about these kinds of tradeoffs. Students who develop a sharp eye for finding an appropriate design point are incredibly useful in industry. This is more of an art than a science: you need to look at a lot of code, understand the issues, and learn to think about these things for yourself.
I have tried to make it clear that an OS course isn’t just about operating systems, and that its purpose goes well beyond giving ammunition to the UNIX / Windows / MacOS partisans. A well-taught OS course gives you skills and ways to think about computer systems that are broadly applicable even if you never touch a line of kernel code. Although a CS degree at my university does not require an OS course, my opinion is that all real computer scientists have either taken this course or picked up the equivalent skills and knowledge in some other way.
4 responses to “Why Take an Operating Systems Course?”
I really like how you highlight the fact that there is “No Magic Here”. For me, that was the greatest insight of learning how to write an OS: it is just another program.
[…] Embedded in Academia : Why Take an Operating Systems Course? – I completely agree, the title should probably be "Why Take a Hands-On Operating Systems Course" – I had an OS course at University but it was a subset of a larger systems course and very high-level theory. Would have preferred a dedicated, hands-on, OS course. […]
For me, the last part, “computer system design,” is the most valuable lesson from taking OS courses. As you say, it is more art than science. I hope more people would spend their time in an OS course rather than learning Java programming.
I also completely agree with the assertion that motivated people can easily learn how to write an OS.
The first commercial game I wrote for the Atari ST worked well, but a new version of the OS came out and it stopped working, because the OS used more memory and there was not enough remaining (the base machine had 512K of RAM, but the OS used around 100K; if your program ran as a boot program it used slightly less, 60-70K if I remember correctly). I reduced the memory requirements, but I didn’t want it to happen again, so I decided to ditch the operating system.
For the next game, I used a routine that read from the disk directly via the disk controller, which meant I could use 511K of the 512K available (the first 1K held hardware interrupt vectors), and the game ran fine across the multiple versions of the hardware that came out later.
I could also use the same assembly program for the Atari ST and Commodore Amiga (the Amiga version used hardware for scrolling and blitting sprites, and DMA for sound playback), which added another 3 weeks to the project, so I basically had a micro OS that provided just the facilities the game needed.
The cartridge-based consoles had no operating system either, so those games involved writing low-level sound drivers that talked directly to the chips. One advantage of fixed hardware is that you can do far more than would be possible generically. I wrote a game for the Sega Master System, where you could send data to the video controller during vblank at twice the rate possible outside vblank. The game required setting the hardware scroll register for each scanline (to shift the road with parallax, and as you went round a corner), so I had 229 cycles per line; rather than wasting 219 of those cycles, I wrote an extra 4 bytes to the video controller.