There are four main types of entity motivated to violate your privacy:
- Companies: who can monetize this information directly by selling it and indirectly by exploiting it in their internal business. Tim Wu's The Attention Merchants: The Epic Scramble to Get Inside Our Heads is a valuable overview of this process, as is Maciej Cegłowski's What Happens Next Will Amaze You.
- Governments: both democratic and authoritarian governments at all levels from nations to cities are addicted to violating the privacy of citizens and non-citizens alike, ostensibly in order to "keep us safe", but in practice more to avoid loss of power. Parts of Wu's book cover this too, but at least since Snowden's revelations it has rarely been far from the headlines.
- Criminals: can be even more effective at monetizing your private information than companies.
- Users: you are motivated to give up your privacy for trivial rewards:
More than 70% of people would reveal their computer password in exchange for a bar of chocolate, a survey has found.
Companies
Cliff Lynch has a long paper up at First Monday entitled The rise of reading analytics and the emerging calculus of reader privacy in the digital world:
It discusses what data is being collected, to whom it is available, and how it might be used by various interested parties (including authors). I explore means of tracking what’s being read, who is doing the reading, and how readers discover what they read.
Many months ago Cliff asked me to review a draft, but the final version differs significantly from the draft I reviewed. Cliff divides the paper into four sections:
- Introduction: Who’s reading what, and who knows what you’re reading?
- Collecting data
- Exploiting data
- Some closing thoughts
Cliff agrees in less dramatic language with Maciej Cegłowski's Haunted by Data, and his analogy between stored data and nuclear waste:
Those trying to protect reader privacy gradually realized that the best guarantee of such privacy was to collect as little data as possible, and to retain what had to be collected as briefly as possible. The hard won lesson: if it exists, it will ultimately be subpoenaed or seized, and used against readers in steadily less measured and discriminating ways over time.
Cliff notices, as did Sam Kome, that readers are now tracked at the page level:
One of the byproducts of this transformation is a major restructuring of ideas and assumptions about reader privacy in light of the availability of information about what is being read, who is reading it, and (a genuinely new development) exactly how it is being read, including the end to frustrating reliance upon purchase, borrowing, or downloading as surrogate indicators for actually reading the work in question. ... one might wish for more than sparse anecdote on the ways and extents to which very detailed data on how a given book is (or is not) read, and by whom, actually benefits the various interested parties: authors, publishers, retailers, platform providers, and even readers.
Cliff points out an important shift in the rhetoric about privacy:
Historically, most of the language has been about competing values and how they should be prioritized and balanced, using charged and emotional phrases: “reader privacy,” “intellectual freedom,” “national security,” “surveillance,” “accountability,” “protecting potential victims” ... These conversations are being supplanted by a sterile and anodyne, value-free discussion of “analytics:” reader analytics, learning analytics, etc. These are presented as tools that smart and responsible modern organizations are expected to employ; indeed, not doing analytics is presented as suggesting some kind of management failure or incompetence in many quarters. The operation of analytics systems, ... tends to shift discussions from whether data should be collected to what we can do with it, and further suggests that if we can do something with it, we should.
Privacy is among the reasons readers have for using ad-blockers; the majority of the bytes they eliminate are not showing you ads but implementing trackers. The Future of Ad Blocking: An Analytical Framework and New Techniques by Grant Storey, Dillon Reisman, Jonathan Mayer and Arvind Narayanan reports on several new ad-blocking technologies, including one based on laws against misleading advertising:
ads must be recognizable by humans due to legal requirements imposed on online advertising. Thus we propose perceptual ad blocking which works radically differently from current ad blockers. It deliberately ignores useful information in markup and limits itself to visually salient information, mimicking how a human user would recognize ads. We use lightweight computer vision techniques to implement such a tool and show that it defeats attempts to obfuscate the presence of ads.
They are optimistic that ad-blockers will win out:
Our second key observation is that even though publishers increasingly deploy scripts to detect and disable ad blocking, ad blockers run at a higher privilege level than such scripts, and hence have the upper hand in this arms race. We borrow ideas from rootkits to build a stealthy adblocker that evades detection. Our approach to hiding the presence and purpose of a browser extension is general and might be of independent interest.
I don't agree. The advent of DRM for the Web requires that the DRM implementation run at a higher privilege level than the ad-blocker, and that it prevent less-privileged code from observing the rendered content (lest it be copied). It is naive to think that advertisers will not notice and exploit this capability.
Governments
As usual, Maciej Cegłowski describes the situation aptly:
We're used to talking about the private and public sector in the real economy, but in the surveillance economy this boundary doesn't exist. Much of the day-to-day work of surveillance is done by telecommunications firms, which have a close relationship with government. The techniques and software of surveillance are freely shared between practitioners on both sides. All of the major players in the surveillance economy cooperate with their own country's intelligence agencies, and are spied on (very effectively) by all the others.
Steven Bellovin, Matt Blaze, Susan Landau and Stephanie Pell have a 101-page review of the problems caused by the legacy model of communication underlying surveillance law in the Harvard Journal of Law and Technology entitled It's Too Complicated: How The Internet Upends Katz, Smith and Electronic Surveillance Law. It's clearly important, but I'm only a short way into it; I may have more to say about it later.
And this, of course, assumes that the government abides by the law. Marcy Wheeler disposes of that idea:
All of which is to say that the authority that the government has been pointing to for years to show how great Title VII is is really a dumpster fire of compliance problems.
And still, we know very little about how this authority is used.
and also:
one reason NSA analysts were collecting upstream data is because over three years after DOJ and ODNI had figured out analysts were breaking the rules because they forgot to exclude upstream from their search, they were still doing so. Overseers noted this back in 2013!
Criminals
The boundaries between government entities such as intelligence agencies and law enforcement and criminals have always been somewhat fluid. The difficulty of attributing activity on the Internet to specific actors has made them even more fluid:
Who did it? Attribution is fundamental. Human lives and the security of the state may depend on ascribing agency to an agent. In the context of computer network intrusions, attribution is commonly seen as one of the most intractable technical problems, as either solvable or not solvable, and as dependent mainly on the available forensic evidence. But is it? Is this a productive understanding of attribution? — This article argues that attribution is what states make of it.
The most important things to keep private are your passwords and PINs. They're the primary target for the bad guys, who can use them to drain your bank accounts. Dan Goodin at Ars Technica has an example of how incredibly hard it is to keep them secret. In Meet PINLogger, the drive-by exploit that steals smartphone PINs, he reports on Stealing PINs via mobile sensors: actual risk versus user perception by Maryam Mehrnezhad, Ehsan Toreini, Siamak F. Shahandashti and Feng Hao. Goodin writes:
The demonstrated keylogging attacks are most useful at guessing digits in four-digit PINs, with a 74-percent accuracy the first time it's entered and a 94-percent chance of success on the third try. ... The attacks require only that a user open a malicious webpage and enter the characters before closing it. The attack doesn't require the installation of any malicious apps.
Malvertising, using ad servers to deliver malware, is a standard technique for the bad guys, and this attack can use it:
"That means whenever you are typing private data on a webpage [with] some advert banners ... the advert provider as part of the page can 'listen in' and find out what you type in that page," ... "Or with some browsers as we found, if you open a page A and then another page B without closing page A (which most people do) page A in the background can listen in on what you type in page B."
Users
Because it is effectively impossible for you to know what privacy risks you are running, you are probably the main violator of your privacy on the Internet, for two main reasons:
- You have explicitly and implicitly agreed to Terms of Service that give up your privacy rights in return for access to content. Since the content probably isn't that important to you, your privacy can't be that important either.
- You have not taken the simple precautions necessary to maintain privacy by being anonymous when using the Web. Techniques such as cookie syncing and browser fingerprinting mean that even using Tor isn't enough. Even though Tor obscures your IP address, if you're using the same browser as you did without Tor or when you logged in to a site, the site will know it's you. Fortunately, there is a very simple way to avoid these problems. Tails (The Amnesic Incognito Live System) can be run from a USB flash drive or in a VM. Every time it starts up it is in a clean state. The browser looks the same to a Web site as every other Tails browser. Use it any time privacy is an issue, from watching pr0n to searching for medical information.
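To see why hiding your IP address alone does not help, here is a minimal sketch of browser fingerprinting. It assumes a site simply hashes a handful of browser-visible attributes; the attribute names, values and the `fingerprint` helper are hypothetical, and real trackers combine many more signals. The point is only that the IP address plays no part in the identifier, and that browsers presenting identical attributes cannot be told apart this way.

```python
import hashlib


def fingerprint(attributes: dict) -> str:
    """Hash browser-visible attributes into a short, stable identifier."""
    # Sort the keys so the same attributes always produce the same hash.
    canonical = "\n".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical attributes of a personal, customized browser.
everyday_browser = {
    "user_agent": "ExampleBrowser/1.0 (hypothetical)",
    "screen": "2560x1440x24",
    "timezone": "America/Los_Angeles",
    "fonts": "Arial,Fira Code,Helvetica,Noto Sans",
}

# The same browser seen twice: once logged in directly, once through an
# anonymizing network. Only the IP address differs (documentation ranges).
visit_logged_in = {"ip": "203.0.113.7", "fp": fingerprint(everyday_browser)}
visit_anonymized = {"ip": "198.51.100.23", "fp": fingerprint(everyday_browser)}

# Two different people whose browsers present identical (hypothetical)
# attributes, as a standardized live system aims to arrange.
uniform_browser_alice = {
    "user_agent": "UniformBrowser/1.0 (hypothetical)",
    "screen": "1000x1000x24",
    "timezone": "UTC",
    "fonts": "Arial,Times New Roman",
}
uniform_browser_bob = dict(uniform_browser_alice)

same_person_linked = visit_logged_in["fp"] == visit_anonymized["fp"]
uniform_users_indistinguishable = (
    fingerprint(uniform_browser_alice) == fingerprint(uniform_browser_bob)
)

print("Visits linked despite different IPs:", same_person_linked)
print("Uniform browsers indistinguishable:", uniform_users_indistinguishable)
```

In this toy model the two visits are linked because nothing about the browser changed, while the two uniform browsers collapse into a single fingerprint shared by everyone who presents the same attributes.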
Update: At The Atlantic, Arvind Narayanan and Dillon Reisman's The Thinning Line Between Commercial and Government Surveillance reports:
As part of the Princeton Web Transparency and Accountability Project, we’ve been studying who tracks you online and how they do it. Here’s why we think the fight over browsing histories is vital to civil liberties and to a functioning democracy.
Privacy doesn’t merely benefit individuals; it fundamentally shapes how society functions. It is crucial for marginalized communities and for social movements, such as the fight for marriage equality and other once-stigmatized views. Privacy enables these groups to network, organize, and develop their ideas and platforms before challenging the status quo. But when people know they’re being tracked and surveilled, they change their behavior. This chilling effect hurts our intellectual freedoms and our capacity for social progress.
They stress the effectiveness of the tracking techniques I mentioned above:
Web tracking today is breathtaking in its scope and sophistication. There are hundreds of entities in the business of following you from site to site, and popular websites embed about 50 trackers on average that enable such tracking. We’ve also found that just about every new feature that’s introduced in web browsers gets abused in creative ways to “fingerprint” your computer or mobile device. Even identical looking devices tend to behave in subtly different ways, such as by supporting different sets of fonts. It’s as if each device has its own personality. This means that even if you clear your cookies or log out of a website, your device fingerprint can still give away who you are.
And that, even when these techniques are used by companies, governments (and ISPs) can piggy-back on them:
Worse, the distinction between commercial tracking and government surveillance is thin and getting thinner. The satirical website The Onion once ran a story with this headline: “CIA's ‘Facebook’ Program Dramatically Cut Agency's Costs.” Reality isn’t far off. The Snowden leaks revealed that the NSA piggybacks on advertising cookies, and in a technical paper we showed that this can be devastatingly effective. Hacks and data breaches of commercial systems have also become a major part of the strategies of nation-state actors.
Ironically, The Atlantic's web-site is adding tracking information to their article's URL (note the 524952):
https://www.theatlantic.com/technology/archive/2017/05/the-thinning-line-between-commercial-and-government-surveillance/524952/
and to the attributes of the links in it:
data-omni-click="r'article',r'link',r'6',r'524952'"
At Gizmodo, Kashmir Hill's Uber Doesn’t Want You to See This Document About Its Vast Data Surveillance System is a deep dive into the incredibly detailed information Uber's database maintains about each and every Uber user. It is based on information briefly revealed in a wrongful termination lawsuit, before Uber's lawyers got it sealed.
For two days in October, before Uber convinced the court to seal the material, one of Spangenberg’s filings that was publicly visible online included a spreadsheet listing more than 500 pieces of information that Uber tracks for each of its users. ...
For example, users give Uber access to their location and payment information; Uber then slices and dices that information in myriad ways. The company holds files on the GPS points for the trips you most frequently take; how much you’ve paid for a ride; how you’ve paid for a ride; how much you’ve paid over the past week; when you last canceled a trip; how many times you’ve cancelled in the last five minutes, 10 minutes, 30 minutes, and 300 minutes; how many times you’ve changed your credit card; what email address you signed up with; whether you’ve ever changed your email address.
Both articles are must-reads.
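Returning to the identifier in The Atlantic's URL and link attributes noted in the update above, the following is a small sketch of how one might check that the same number appears in both places. It parses only the two literal strings quoted above; the helper names `trailing_id` and `omni_click_fields` are hypothetical, and the assumption that the attribute is a comma-separated list of `r'...'` fields comes solely from that one example, not from any documentation of the site's analytics.

```python
import re
from urllib.parse import urlparse

ARTICLE_URL = (
    "https://www.theatlantic.com/technology/archive/2017/05/"
    "the-thinning-line-between-commercial-and-government-surveillance/524952/"
)
LINK_ATTRIBUTE = "r'article',r'link',r'6',r'524952'"


def trailing_id(url):
    """Return the last path segment of the URL if it is all digits, else None."""
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    last = segments[-1] if segments else ""
    return last if last.isdigit() else None


def omni_click_fields(value):
    """Extract the quoted fields from an attribute value shaped like the example."""
    return re.findall(r"r'([^']*)'", value)


article_id = trailing_id(ARTICLE_URL)
fields = omni_click_fields(LINK_ATTRIBUTE)
id_is_shared = article_id in fields

print("ID in URL:", article_id)
print("Fields in link attribute:", fields)
print("Same ID in both:", id_is_shared)
```

A shared identifier like this lets each click on a link be tied back to the article it came from.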