I was worried about speaking here, but I'm even more worried about some of the pronouncements that I have heard over the last few days, ... about the future of the Internet. I am worried about pronouncements of the sort: "In the future, we will do electronic banking at virtual ATMs!," "In the future, my car will have an IP address!," "In the future, I'll be able to get all the old I Love Lucy reruns - over the Internet!" or "In the future, everyone will be a Java programmer!"

This is bunk. I'm worried that our imagination about the way that the 'Net changes our lives, our work and our society is limited to taking current institutions and dialling them forward - the "more, better" school of vision for the future.

I have the same worries that Steven did about discussions of the Internet of Things that looms so large in our future. They focus on the incidental effects, not on the fundamental changes. Barry Ritholtz points me to a post by Jon Evans at TechCrunch entitled The Internet of Someone Else's Things that is an exception. Jon points out that the idea that you own the Smart Things you buy is obsolete:

They say “possession is nine-tenths of the law,” but even if you physically and legally own a Smart Thing, you won’t actually control it. Ownership will become a three-legged stool: who physically owns a thing; who legally owns it; …and who has the ultimate power to command it. Who, in short, has root.

What does this have to do with digital preservation? Follow me below the fold.
On a smaller scale than the Internet of Things (IoT), we already have at least two precursors that demonstrate some of the problems of connecting to the Internet huge numbers of devices over which consumers don't have "root" (administrative control). The first is mobile phones. As Jon says:
Your phone probably has three separate computers in it (processor, baseband processor, and SIM card) and you almost certainly don’t have root on any of them, which is why some people refer to phones as “tracking devices which make phone calls.”

The second is home broadband routers. My friend Jim Gettys points me to a short piece by Vint Cerf entitled Bufferbloat and Other Internet Challenges that takes off from Jim's work on these routers. Vint concludes:
I hope it’s apparent that these disparate topics are linked by the need to find a path toward adapting Internet-based devices to change, and improved safety. Internet users will benefit from the discovery or invention of such a path, and it’s thus worthy of further serious research.

Jim got sucked into working on these problems when, back in 2010, he got fed up with persistent network performance problems on his home's broadband internet service, and did some serious diagnostic work. You can follow the whole story, which continues, on his blog. But the short, vastly over-simplified version is that he discovered that Moore's Law had converted a potential problem with TCP first described in 1985 into a nightmare.
Back then, the idea of a packet switch with effectively infinite buffer storage was purely theoretical. A quarter of a century later, RAM was so cheap that even home broadband routers had packet buffers so large as to be almost infinite. TCP depends on dropping packets to signal that a link is congested. Very large buffers mean packets don't get dropped, so the sender never finds out that some link is congested, so it never slows down. Jim called this phenomenon "bufferbloat", and started a crusade to eliminate it. In less than two years, Kathleen Nichols and Van Jacobson working with Jim and others had a software fix to the TCP/IP stack, called CoDel.
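To make the mechanism concrete, here is a toy sketch of my own; it is not Jim's measurements and not CoDel. It assumes a single sender that adds one packet per tick to its sending rate until it sees a drop and then halves the rate (a crude stand-in for TCP's congestion control), feeding a single bottleneck queue. The function name `simulate` and all the parameters are hypothetical, chosen only for illustration.

```python
def simulate(buffer_size, ticks=200, link_rate=10):
    """Toy model: one rate-adapting sender feeding one bottleneck queue.

    Returns (max_queueing_delay_in_ticks, total_drops).
    This is an illustrative assumption-laden model, not real TCP.
    """
    queue = 0        # packets currently waiting in the buffer
    send_rate = 1    # packets the sender offers per tick
    max_delay = 0.0
    drops = 0
    for _ in range(ticks):
        # Arrivals: whatever does not fit in the buffer is dropped.
        space = buffer_size - queue
        accepted = min(send_rate, space)
        dropped = send_rate - accepted
        queue += accepted
        drops += dropped

        # Departures: the link forwards at most link_rate packets per tick.
        queue -= min(queue, link_rate)

        # A newly arriving packet would wait this many ticks in the queue.
        max_delay = max(max_delay, queue / link_rate)

        # The sender only learns about congestion from drops.
        if dropped:
            send_rate = max(1, send_rate // 2)   # back off
        else:
            send_rate += 1                       # keep probing for bandwidth
    return max_delay, drops


if __name__ == "__main__":
    for size in (20, 100_000):
        delay, drops = simulate(size)
        print(f"buffer={size:>7} packets  max delay={delay:.1f} ticks  drops={drops}")
```

With the small buffer the sender sees drops early and the queueing delay stays bounded by the buffer size. With the "almost infinite" buffer nothing is dropped during the run, so the sender keeps speeding up and the delay grows without any signal telling it to stop.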
CoDel isn't a complete solution; further work has produced even more fixes, but it makes a huge difference. Problem solved, right? All we needed to do was to deploy CoDel everywhere in the Internet that managed a packet buffer, which is every piece of hardware connected to it. This meant convincing every vendor of an internet-connected device that they needed to adopt and deploy CoDel not just in new products they were going to ship, but in all the products that they had ever shipped that were still connected to the Internet.
For major vendors such as Cisco this was hard, but for vendors of consumer devices, including even Cisco's Linksys division, it was simply impossible. There is no way for Linksys to push updates of the software to their installed base. Worse, many networking chips implement on-chip packet buffering; their buffer management algorithms are probably both unknowable and unalterable. So even though there is a pretty good fix for bufferbloat that, if deployed, would be a major improvement to Internet performance, we will have to wait for much of the hardware at the edge of the Internet to be replaced before we can get the benefit.
We know that the Smart Things the IoT is made of are full of software. That's what makes them smart. Software has bugs and performance problems like the ones Jim found. More importantly it has vulnerabilities that allow the bad guys to compromise the systems running it. Botnets assembled from hundreds of thousands of compromised home routers have been around from at least 2009 to the present. Other current examples include the Brazilian banking malware that hijacks home routers' DNS settings, and the Moon worm that is scanning the Internet for vulnerable Linksys routers (who do you think would want to do that?). It isn't just routers that are affected. For example, network storage boxes have been hijacked to mine $620K worth of Dogecoin, and (PDF):
HP Security Research reviewed 10 of the most popular devices in some of the most common IoT niches revealing an alarmingly high average number of vulnerabilities per device. Vulnerabilities ranged from Heartbleed to Denial of Service to weak passwords to cross-site scripting.

Just as with bufferbloat, it's essentially impossible to eliminate the vulnerabilities that enable these bad guys. It hasn't been economic for low-cost consumer product vendors to provide the kind of automatic or user-approved updates that PC and smartphone systems now routinely provide; the costs of the bad guys' attacks are borne by the consumer. It is only fair to mention that there are some exceptions. The Nest smoke detector can be updated remotely; Google did this when it was discovered that the detector might disable itself instead of reporting a fire. Not, as Vint points out, that the remote update systems have proven adequately trustworthy:
Digital signatures and certificates authenticating software’s origin have proven only partly successful owing to the potential for fabricating false but apparently valid certificates by compromising certificate authorities one way or another.

See, for an early example, the Flame malware. Further, as Jon points out:
When you buy a Smart Thing, you get locked into its software ecosystem, which is controlled by its manufacturer, whether you like it or not.

Even valid updates are in the vendor's interest, which may not be yours.
This will be the case for the Smart Things in the IoT too. The IoT will be a swamp of malware. In Charles Stross' 2011 novel Rule 34 many of the deaths Detective Inspector Liz Kavanaugh investigates are caused by malware-infested home appliances; you can't say you weren't warned of the risks of the IoT. Jim has a recent blog post about this problem, with links to pieces he inspired by Bruce Schneier and Dan Geer. All three are must-reads.
This whole problem is another example of a topic I've often blogged about, the short-term thinking that pervades society and makes investing, or even planning, now to reap benefits or avoid disasters in the future so hard. In this case, the disaster is already starting to happen.
Finally, why is this relevant to digital preservation? I've written frequently about the really encouraging progress being made in delivering emulation in browsers and as a cloud service in ways that make running really old software transparent. This solves a major problem in digital preservation that has been evident since Jeff Rothenberg's seminal 1995 article.
Unfortunately, the really old software that will be really easy for everyone to run will have all the same bugs and vulnerabilities it had when it was new. Because old vulnerabilities, especially in consumer products, don't go away with time, attempts to exploit really old vulnerabilities don't go away either. And we can't fix the really old software to make the bugs and vulnerabilities go away, because the whole point of emulation is to run the really old software exactly the way it used to be. So the emulated system will be really, really vulnerable and it will be attacked. How are we going to limit the damage from these vulnerabilities?