Inversions in Computing


Some computer things change very slowly; for example, my newish desktop at home has a PS/2 port. Other things change rapidly: my 2010 iPad is kind of a stone-age relic now. This kind of differential progress creates some funny inversions. A couple of historical examples:

  • Apparently at one point in the 80s or 90s (this isn’t a firsthand story; I’d appreciate recollections or citations) the processor available in an Apple printer was so fast that people would offload numerical computations to their printers.
  • I spent the summer of 1997 working for Myricom. Using the then-current Pentium Pro machines, you could move data between two computers faster than you could do a local memcpy(). I’m pretty sure there was something wrong with the chipset for these processors, causing especially poor memcpy() performance, but I’ve lost the details.

What are the modern examples? A few come to mind:

Anyhow, I enjoy computing inversions since they challenge our assumptions.


24 responses to “Inversions in Computing”

  1. I’m not certain of the technical accuracy of this (and unfortunately can’t find an appropriate reference at the moment), but I’ve heard it said that for a brief period (mid-90s I’d guess?) the fastest x86 processor in the world was an emulator running on an Alpha.

  2. Regarding your printer tale, I used to work at Oxford uni with a professor who had written a Fast Fourier Transform in postscript. It was indeed faster to send the program to the printer than to run it on one of the NeXT workstations (it was an HP printer; NeXT printers used the postscript interpreter in the workstation).

  3. I’m not sure it fits your pattern exactly, but the “specialize an OS down to a single application and run it in a VM” thing seems funny to me.

  4. Sort of along these lines: command queueing in hard drives was invented because hard drives were much slower than their hosts, but now it is useful because flash storage is so much faster.

  5. I’ve considered a few times what it would take to implement the US tax code in PostScript:

    What do you use to do your taxes?
    My printer.

  6. I think the Apple in question was an Apple II. The 6502 processor was quite weak, and there existed solutions such as external processor cards that would give the machine more power. One option was a card with a 68000 processor. I don’t know about printers, but it does not seem implausible that there were printers available for the Apple II that used the 68000, sometime around the mid 80s.

  7. Network latency is another inversion.

    It’s intuitive to expect local operations to be faster than remote operations, but network latency has improved to the point that, with a bog standard network, you can do an RPC that’s faster than a spinning metal disk seek. With really fast networking (e.g., the stuff described here in http://blog.cloudflare.com/a-tour-inside-cloudflares-latest-generation-servers/), you might be able to do an RPC more quickly than an SSD seek.
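    The comparison above can be made concrete with a back-of-the-envelope sketch. The latency figures below are my own rough, order-of-magnitude assumptions (not measurements from the comment), but they illustrate why the inversion holds:

    ```c
    /* Rough latency comparison: RPC vs. storage seeks.
       All figures are order-of-magnitude assumptions, not measurements. */
    #include <stdio.h>

    int main(void) {
        double disk_seek_us = 10000.0; /* ~10 ms: spinning-disk seek + rotation */
        double dc_rpc_us    = 500.0;   /* ~0.5 ms: round trip on an ordinary datacenter network */
        double ssd_read_us  = 100.0;   /* ~100 us: SSD random read */
        double fast_rpc_us  = 20.0;    /* ~tens of us: kernel-bypass / RDMA-class RPC */

        /* An ordinary RPC finishes well before a disk seek would... */
        printf("RPC vs. disk seek:    ~%.0fx faster\n", disk_seek_us / dc_rpc_us);
        /* ...and a fast RPC can beat even an SSD random read. */
        printf("fast RPC vs. SSD read: ~%.0fx faster\n", ssd_read_us / fast_rpc_us);
        return 0;
    }
    ```

    With these assumed numbers the ordinary RPC wins by roughly 20x over the disk seek, and the fast RPC by roughly 5x over the SSD read; the exact ratios matter less than the sign of the comparison.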

  8. Of course inversions will always flip back.

    Not too long ago we started offloading CPU work to general-purpose GPUs. Now people are already starting to offload GPU ops back to the CPU.

    Thin/thick clients: Mainframes => PC => cloud

    A/sync: interrupts => threads => events => co-routines. All on the same stack!

  9. @K.Haddock: The 6502 processor weak? Pfah. The C64 had a 6502 (well, a 6510, but let’s not quibble) and this RISC-like titan was so powerful that its disk drive was outfitted with, um, a second 6502 to handle all the complicated stuff.

    That’s right, a C64 with disk drive was a dual core home computer before it was cool. Although I’ve never heard of people offloading non-disk calculations to the drive — this wouldn’t have been very practical, as the communication overhead would have probably killed most advantages.

  10. John, w.r.t.:

    > Using the then-current Pentium Pro machines, you could move data between two computers faster than you could do a local memcpy(). I’m pretty sure there was something wrong with the chipset for these processors, causing especially poor memcpy() performance, but I’ve lost the details.

    Consider that when you’re doing a memcpy, you have both read and write traffic in RAM, whereas when you’re sending it over the network you have only read traffic on one machine, and only write traffic on another. In other words, when you’re performing a non-local copy, you have twice as much memory bandwidth! And if the memory is slow while the network is extremely efficient and fast, that would explain it.
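    The doubled-traffic argument above is easy to check with a minimal sketch that times local memcpy() throughput. The buffer size, iteration count, and the read+write accounting are my own choices for illustration:

    ```c
    /* Minimal memcpy() bandwidth sketch. A local copy generates both read
       and write traffic in RAM, so each copy touches 2 * len bytes. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    int main(void) {
        const size_t len = 64 * 1024 * 1024;   /* 64 MiB per copy */
        const int iters = 8;
        char *src = malloc(len), *dst = malloc(len);
        if (!src || !dst) return 1;
        memset(src, 0xAB, len);                 /* touch pages before timing */
        memset(dst, 0x00, len);

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < iters; i++)
            memcpy(dst, src, len);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        /* each memcpy reads len bytes and writes len bytes */
        double gib_touched = 2.0 * iters * len / (1024.0 * 1024.0 * 1024.0);
        printf("memory traffic: %.1f GiB in %.3f s (%.2f GiB/s)\n",
               gib_touched, secs, gib_touched / secs);
        free(src);
        free(dst);
        return 0;
    }
    ```

    The point of counting 2 * len per copy is exactly the comment’s observation: a network send needs only the read half on the sender and only the write half on the receiver, so each machine sees half the memory traffic of a local copy.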

  11. The “Wheel of Reincarnation” has been around for a long time in graphics (co)processors [which tend to evolve in the direction of becoming smarter & faster, until they’re full-fledged computers, ready to have their own initially-dumb-but-later-smarter coprocessors]:

    @Article{Myer:1968:DDP,
      author  = "T. H. Myer and I. E. Sutherland",
      title   = "On the Design of Display Processors",
      journal = "Communications of the ACM",
      year    = "1968",
      volume  = "11",
      number  = "6",
      month   = jun,
      pages   = "410--414",
    }