Wednesday, October 15, 2014

The Internet of Things

In 1996, my friend Steven McGeady gave a fascinating and rather prophetic keynote address to the Harvard Conference on the Internet and Society. In his introduction, Steven said:
I was worried about speaking here, but I'm even more worried about some of the pronouncements that I have heard over the last few days, ... about the future of the Internet. I am worried about pronouncements of the sort: "In the future, we will do electronic banking at virtual ATMs!," "In the future, my car will have an IP address!," "In the future, I'll be able to get all the old I Love Lucy reruns - over the Internet!" or "In the future, everyone will be a Java programmer!"

This is bunk. I'm worried that our imagination about the way that the 'Net changes our lives, our work and our society is limited to taking current institutions and dialling them forward - the "more, better" school of vision for the future.
I have the same worries that Steven did about discussions of the Internet of Things that looms so large in our future. They focus on the incidental effects, not on the fundamental changes. Barry Ritholtz points me to a post by Jon Evans at TechCrunch entitled The Internet of Someone Else's Things that is an exception. Jon points out that the idea that you own the Smart Things you buy is obsolete:
They say “possession is nine-tenths of the law,” but even if you physically and legally own a Smart Thing, you won’t actually control it. Ownership will become a three-legged stool: who physically owns a thing; who legally owns it; …and who has the ultimate power to command it. Who, in short, has root.
What does this have to do with digital preservation? Follow me below the fold.

On a smaller scale than the Internet of Things (IoT), we already have at least two precursors that demonstrate some of the problems of connecting to the Internet huge numbers of devices over which consumers don't have "root" (administrative control). The first is mobile phones. As Jon says:
Your phone probably has three separate computers in it (processor, baseband processor, and SIM card) and you almost certainly don’t have root on any of them, which is why some people refer to phones as “tracking devices which make phone calls.”
The second is home broadband routers. My friend Jim Gettys points me to a short piece by Vint Cerf entitled Bufferbloat and Other Internet Challenges that takes off from Jim's work on these routers. Vint concludes:
I hope it’s apparent that these disparate topics are linked by the need to find a path toward adapting Internet-based devices to change, and improved safety. Internet users will benefit from the discovery or invention of such a path, and it’s thus worthy of further serious research.
Jim got sucked into working on these problems when, back in 2010, he got fed up with persistent network performance problems on his home's broadband internet service, and did some serious diagnostic work. You can follow the whole story, which continues, on his blog. But the short, vastly over-simplified version is that he discovered that Moore's Law had converted a potential problem with TCP first described in 1985 into a nightmare.

Back then, the idea of a packet switch with effectively infinite buffer storage was purely theoretical. A quarter of a century later, RAM was so cheap that even home broadband routers had packet buffers so large as to be almost infinite. TCP depends on dropping packets to signal that a link is congested. Very large buffers mean packets don't get dropped, so the sender never finds out that some link is congested, so it never slows down. Jim called this phenomenon "bufferbloat", and started a crusade to eliminate it. In less than two years, Kathleen Nichols and Van Jacobson, working with Jim and others, had a software fix to the TCP/IP stack, called CoDel.
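To make the mechanism concrete, here is a minimal toy sketch in Python. It is not real TCP and it is not CoDel; the function name simulate, its parameters, and the numbers are all invented for illustration. It assumes a single sender that speeds up every tick until a packet is dropped and then halves its rate, feeding a tail-drop buffer in front of a link that drains a fixed number of packets per tick.

```python
def simulate(buffer_packets, ticks=200, drain_per_tick=5):
    """Toy model of one sender feeding one bottleneck link.

    All names and numbers are hypothetical. Returns a tuple of
    (largest queue seen, packets dropped, worst queueing delay in ticks).
    """
    queue = 0      # packets waiting in the router's buffer
    rate = 1       # packets the sender injects per tick
    drops = 0
    max_queue = 0
    for _ in range(ticks):
        queue += rate
        if queue > buffer_packets:
            # Buffer overflow: the excess is dropped, which is the only
            # congestion signal this toy sender understands.
            drops += queue - buffer_packets
            queue = buffer_packets
            rate = max(1, rate // 2)   # back off
        else:
            rate += 1                  # no drop seen, so keep speeding up
        max_queue = max(max_queue, queue)
        queue = max(0, queue - drain_per_tick)   # the link drains the buffer
    return max_queue, drops, max_queue / drain_per_tick


small = simulate(buffer_packets=10)
bloated = simulate(buffer_packets=100_000)
print("small buffer   (max queue, drops, worst delay):", small)
print("bloated buffer (max queue, drops, worst delay):", bloated)
```

With the small buffer the sender sees drops early and keeps backing off, so the queue stays short. With the bloated buffer nothing is dropped during the run, the sender never slows down, and every packet waits behind an ever longer queue.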

CoDel isn't a complete solution, and further work has produced even more fixes, but it makes a huge difference. Problem solved, right? All we needed to do was to deploy CoDel everywhere in the Internet that managed a packet buffer, which is every piece of hardware connected to it. This meant convincing every vendor of an internet-connected device that they needed to adopt and deploy CoDel not just in new products they were going to ship, but in all the products that they had ever shipped that were still connected to the Internet.

For major vendors such as Cisco this was hard, but for vendors of consumer devices, including even Cisco's Linksys division, it was simply impossible. There is no way for Linksys to push updates of the software to their installed base. Worse, many networking chips implement on-chip packet buffering; their buffer management algorithms are probably both unknowable and unalterable. So even though there is a pretty good fix for bufferbloat that, if deployed, would be a major improvement to Internet performance, we will have to wait for much of the hardware at the edge of the Internet to be replaced before we can get the benefit.

We know that the Smart Things the IoT is made of are full of software. That's what makes them smart. Software has bugs and performance problems like the ones Jim found. More importantly, it has vulnerabilities that allow the bad guys to compromise the systems running it. Botnets assembled from hundreds of thousands of compromised home routers have been around from at least 2009 to the present. Other current examples include the Brazilian banking malware that hijacks home routers' DNS settings, and the Moon worm that is scanning the Internet for vulnerable Linksys routers (who do you think would want to do that?). It isn't just routers that are affected. For example, network storage boxes have been hijacked to mine $620K worth of Dogecoin, and (PDF):
HP Security Research reviewed 10 of the most popular devices in some of the most common IoT niches revealing an alarmingly high average number of vulnerabilities per device. Vulnerabilities ranged from Heartbleed to Denial of Service to weak passwords to cross-site scripting.
Just as with bufferbloat, it's essentially impossible to eliminate the vulnerabilities that enable these bad guys. It hasn't been economic for low-cost consumer product vendors to provide the kind of automatic or user-approved updates that PC and smartphone systems now routinely provide; the costs of the bad guys' attacks are borne by the consumer. It is only fair to mention that there are some exceptions. The Nest smoke detector can be updated remotely; Google did this when it was discovered that it might disable itself instead of reporting a fire. Not, as Vint points out, that the remote update systems have proven adequately trustworthy:
Digital signatures and certificates authenticating software’s origin have proven only partly successful owing to the potential for fabricating false but apparently valid certificates by compromising certificate authorities one way or another.
See, for an early example, the Flame malware. Further, as Jon points out:
When you buy a Smart Thing, you get locked into its software ecosystem, which is controlled by its manufacturer, whether you like it or not.
Even valid updates are in the vendor's interest, which may not be yours.

This will be the case for the Smart Things in the IoT too. The IoT will be a swamp of malware. In Charles Stross' 2011 novel Rule 34 many of the deaths Detective Inspector Liz Kavanaugh investigates are caused by malware-infested home appliances; you can't say you weren't warned of the risks of the IoT. Jim has a recent blog post about this problem, with links to pieces by Bruce Schneier and Dan Geer that he inspired. All three are must-reads.

This whole problem is another example of a topic I've often blogged about, the short-term thinking that pervades society and makes investing, or even planning, now to reap benefits or avoid disasters in the future so hard. In this case, the disaster is already starting to happen.

Finally, why is this relevant to digital preservation? I've written frequently about the really encouraging progress being made in delivering emulation in browsers and as a cloud service in ways that make running really old software transparent. This solves a major problem in digital preservation that has been evident since Jeff Rothenberg's seminal 1995 article.

Unfortunately, the really old software that will be really easy for everyone to run will have all the same bugs and vulnerabilities it had when it was new. Because old vulnerabilities, especially in consumer products, don't go away with time, attempts to exploit really old vulnerabilities don't go away either. And we can't fix the really old software to make the bugs and vulnerabilities go away, because the whole point of emulation is to run the really old software exactly the way it used to be. So the emulated system will be really, really vulnerable and it will be attacked. How are we going to limit the damage from these vulnerabilities?


David. said...

Concerns about the impact of the IoT are starting to become a meme. Gigaom ran a conference called Structure Connect that covered some of the privacy aspects of the IoT. For example:

"the so-called “third party doctrine,” which eliminates Fourth Amendment privacy protections in the event that a person freely gives private records to someone else — perhaps by sharing the activities they perform at home with a connected appliance controlled by a third-party company"

There was the usual wild optimism from vendors about how much control users would have over their data. The idea that it could be anonymized should have been debunked by the recent NYC taxi data de-anonymization. Even more worrying is a paper pointed to by Cathy O'Neil showing how discriminatory behavior is pretty much guaranteed by "Big Data" analyses. In many cases, that's the whole point.

But the point I'm trying to make above is that all these discussions assume that the things in the IoT are actually doing what their makers intended, not what the bad guys want. You should be so lucky.

David. said...

Above, I said:

"Even valid updates are in the vendor's interest, which may not be yours."

Today, Soulskill at Slashdot points to reports that an official update to the Windows driver for FTDI's USB bridge chips bricks devices using clone chips. As Hackaday says:

"It’s a bold strategy to cut down on silicon counterfeiters on the part of FTDI. A reasonable company would go after the manufacturers of fake chips, not the consumers who are most likely unaware they have a fake chip."

This shows (a) that you may be wrong about who the vendor of your IoT device really is, and (b) trusting the "vendor" of your IoT device to act in your interests rather than theirs may be foolish.

David. said...

However, there's no need to worry:

Data generated by devices in the "internet of things" age should be "regarded and treated as personal data", data protection authorities from across the globe have agreed.

The watchdogs said it is "more likely than not" that such data can be attributed to individuals.

David. said...

FTDI has apparently backed down, possibly because Microsoft disapproved of their aggressive approach to intellectual property protection.

David. said...

The meme has made it into The Economist:

"the persistence of outdated kit leaves vulnerabilities in place. Many embedded devices, as well as older releases of Android and iOS software, simply cannot be updated—or the owners may lack the skill to update them. For those cases, the only solution that may work is simply to cut off access"

David. said...

James Boyle at The Public Domain chimes in on the risks of vendor control of IoT software.

David. said...

Richard Chirgwin at The Register points to a study at the University of Melbourne's Centre for Energy Efficient Telecommunications suggesting that the Internet of Things will be a massive energy sink. Not because of the sensors themselves, which typically need very low power for long battery life, but because of the upstream network. Very low power transmitted signals need more power at the receiver. And the IoT will cause much more upstream traffic than today's internet, burning more power in edge devices to send more traffic.

David. said...

Brian Krebs has the answer to why the Moon worm was scanning home routers. It was to build the Lizard Squad's DDoS service that was demoed by knocking out Sony's and Microsoft's gaming networks over the holidays.

David. said...

An illustration of the low skill level needed for a major disruption of the Internet. The database in which the Lizard Squad kept the details of the more than 14K customers of their DDoS-for-hire service, including their passwords, was not encrypted and has now been revealed.

Advertising the service by bringing down Microsoft's and Sony's gaming networks may have backfired. At least 3 individuals are being questioned by police.