Thursday, May 18, 2017

"Privacy is dead, get over it" [updated]

I believe it was in 1999 that Scott McNealy famously said "privacy is dead, get over it". It is a whole lot deader now than it was then. A month ago in Researcher Privacy I discussed Sam Kome's CNI talk about the surveillance abilities of institutional network technology such as central wireless and access proxies. There's so much more to report on privacy that below the fold there can't be more than some suggested recent readings, as an update to my six-month-old post Open Access and Surveillance. [See a major update at the end]

There are four main types of entity motivated to violate your privacy:
  • Companies: who can monetize this information directly by selling it and indirectly by exploiting it in their internal business. Tim Wu's The Attention Merchants: The Epic Scramble to Get Inside Our Heads is a valuable overview of this process, as is Maciej Cegłowski's What Happens Next Will Amaze You.
  • Governments: both democratic and authoritarian governments at all levels from nations to cities are addicted to violating the privacy of citizens and non-citizens alike, ostensibly in order to "keep us safe", but in practice more to avoid loss of power. Parts of Wu's book cover this too, but at least since Snowden's revelations it has rarely been far from the headlines.
  • Criminals: can be even more effective at monetizing your private information than companies.
  • Users: you are motivated to give up your privacy for trivial rewards:
    More than 70% of people would reveal their computer password in exchange for a bar of chocolate, a survey has found.


Cliff Lynch has a long paper up at First Monday entitled The rise of reading analytics and the emerging calculus of reader privacy in the digital world:
It discusses what data is being collected, to whom it is available, and how it might be used by various interested parties (including authors). I explore means of tracking what’s being read, who is doing the reading, and how readers discover what they read.
Many months ago Cliff asked me to review a draft, but the final version differs significantly from the draft I reviewed. Cliff divides the paper into four sections:
  1. Introduction: Who’s reading what, and who knows what you’re reading?
  2. Collecting data
  3. Exploiting data
  4. Some closing thoughts
You should read the whole thing, but here are a few tastes.

Cliff agrees in less dramatic language with Maciej Cegłowski's Haunted by Data, and his analogy between stored data and nuclear waste:
Those trying to protect reader privacy gradually realized that the best guarantee of such privacy was to collect as little data as possible, and to retain what had to be collected as briefly as possible. The hard won lesson: if it exists, it will ultimately be subpoenaed or seized, and used against readers in steadily less measured and discriminating ways over time.
Cliff notices, as did Sam Kome, that readers are now tracked at the page level:
One of the byproducts of this transformation is a major restructuring of ideas and assumptions about reader privacy in light of the availability of information about what is being read, who is reading it, and (a genuinely new development) exactly how it is being read, including the end to frustrating reliance upon purchase, borrowing, or downloading as surrogate indicators for actually reading the work in question. ... one might wish for more than sparse anecdote on the ways and extents to which very detailed data on how a given book is (or is not) read, and by whom, actually benefits the various interested parties: authors, publishers, retailers, platform providers, and even readers.
Cliff points out an important shift in the rhetoric about privacy:
Historically, most of the language has been about competing values and how they should be prioritized and balanced, using charged and emotional phrases: “reader privacy,” “intellectual freedom,” “national security,” “surveillance,” “accountability,” “protecting potential victims” ... These conversations are being supplanted by a sterile and anodyne, value-free discussion of “analytics:” reader analytics, learning analytics, etc. These are presented as tools that smart and responsible modern organizations are expected to employ; indeed, not doing analytics is presented as suggesting some kind of management failure or incompetence in many quarters. The operation of analytics systems, ... tends to shift discussions from whether data should be collected to what we can do with it, and further suggests that if we can do something with it, we should.
Privacy is among the reasons readers have for using ad-blockers; the majority of the bytes they eliminate are not showing you ads but implementing trackers. The Future of Ad Blocking: An Analytical Framework and New Techniques by Grant Storey, Dillon Reisman, Jonathan Mayer and Arvind Narayanan reports on several new ad-blocking technologies, including one based on laws against misleading advertising:
ads must be recognizable by humans due to legal requirements imposed on online advertising. Thus we propose perceptual ad blocking which works radically differently from current ad blockers. It deliberately ignores useful information in markup and limits itself to visually salient information, mimicking how a human user would recognize ads. We use lightweight computer vision techniques to implement such a tool and show that it defeats attempts to obfuscate the presence of ads.
They are optimistic that ad-blockers will win out:
Our second key observation is that even though publishers increasingly deploy scripts to detect and disable ad blocking, ad blockers run at a higher privilege level than such scripts, and hence have the upper hand in this arms race. We borrow ideas from rootkits to build a stealthy adblocker that evades detection. Our approach to hiding the presence and purpose of a browser extension is general and might be of independent interest.
I don't agree. The advent of DRM for the Web requires that the DRM implementation run at a higher privilege level than the ad-blocker, and that it prevent less-privileged code from observing the rendered content (lest it be copied). It is naive to think that advertisers will not notice and exploit this capability.


As usual, Maciej Cegłowski describes the situation aptly:
We're used to talking about the private and public sector in the real economy, but in the surveillance economy this boundary doesn't exist. Much of the day-to-day work of surveillance is done by telecommunications firms, which have a close relationship with government. The techniques and software of surveillance are freely shared between practitioners on both sides. All of the major players in the surveillance economy cooperate with their own country's intelligence agencies, and are spied on (very effectively) by all the others.
Steven Bellovin, Matt Blaze, Susan Landau and Stephanie Pell have a 101-page review of the problems caused by the legacy model of communication underlying surveillance law in the Harvard Journal of Law and Technology entitled It's Too Complicated: How The Internet Upends Katz, Smith and Electronic Surveillance Law. It's clearly important, but I'm only a short way into it; I may have more to say about it later.

And this, of course, assumes that the government abides by the law. Marcy Wheeler disposes of that idea:
All of which is to say that the authority that the government has been pointing to for years to show how great Title VII is is really a dumpster fire of compliance problems.

And still, we know very little about how this authority is used.
and also:
one reason NSA analysts were collecting upstream data is because over three years after DOJ and ODNI had figured out analysts were breaking the rules because they forgot to exclude upstream from their search, they were still doing so. Overseers noted this back in 2013!


The boundaries between government entities such as intelligence agencies and law enforcement and criminals have always been somewhat fluid. The difficulty of attributing activity on the Internet (also here) to specific actors has made them even more fluid:
Who did it? Attribution is fundamental. Human lives and the security of the state may depend on ascribing agency to an agent. In the context of computer network intrusions, attribution is commonly seen as one of the most intractable technical problems, as either solvable or not solvable, and as dependent mainly on the available forensic evidence. But is it? Is this a productive understanding of attribution? — This article argues that attribution is what states make of it.
The most important things to keep private are your passwords and PINs. They're the primary target for the bad guys, who can use them to drain your bank accounts. Dan Goodin at Ars Technica has an example of how incredibly hard it is to keep them secret. In Meet PINLogger, the drive-by exploit that steals smartphone PINs, he reports on Stealing PINs via mobile sensors: actual risk versus user perception by Maryam Mehrnezhad, Ehsan Toreini, Siamak F. Shahandashti and Feng Hao. Goodin writes:
The demonstrated keylogging attacks are most useful at guessing digits in four-digit PINs, with a 74-percent accuracy the first time it's entered and a 94-percent chance of success on the third try. ... The attacks require only that a user open a malicious webpage and enter the characters before closing it. The attack doesn't require the installation of any malicious apps.
Malvertising, using ad servers to deliver malware, is a standard technique for the bad guys, and this attack can use it:
Malicious webpages—or depending on the browser, legitimate sites serving malicious ads or malicious content through HTML-based iframe tags—can mount the attack by using standard JavaScript code that accesses motion and orientation sensors built into virtually all iOS and Android devices. To demonstrate how the attack would work, researchers from Newcastle University in the UK wrote attack code dubbed PINLogger.js. Without any warning or outward sign of what was happening, the JavaScript was able to accurately infer characters being entered into the devices.

"That means whenever you are typing private data on a webpage [with] some advert banners ... the advert provider as part of the page can 'listen in' and find out what you type in that page," ... "Or with some browsers as we found, if you open a page A and then another page B without closing page A (which most people do) page A in the background can listen in on what you type in page B."
The authors are pessimistic about blocking attacks using sensor data:
Access to mobile sensor data via JavaScript is limited to only a few sensors at the moment. This will probably expand in the future, specially with the rapid development of sensor-enabled devices in the Internet of things (IoT). ... Many of the suggested academic solutions either have not been applied by the industry as a practical solution, or have failed. Given the results in our user studies, designing a practical solution for this problem does not seem to be straightforward. ... After all, it seems that an extensive study is required towards designing a permission framework which is usable and secure at the same time. Such research is a very important usable security and privacy topic to be explored further in the future.
The point is not to focus on this particular channel, but to observe that it is essentially impossible to enumerate and block all the channels by which private information can leak from any computer connected to the Internet.


Because it is effectively impossible for you to know what privacy risks you are running, you are probably the main violator of your privacy on the Internet, for two main reasons:
  • You have explicitly and implicitly agreed to Terms of Service (and here) that give up your privacy rights in return for access to content. Since the content probably isn't that important to you, your privacy can't be that important either.
  • You have not taken the simple precautions necessary to maintain privacy by being anonymous when using the Web. Techniques such as cookie syncing and browser fingerprinting mean that even using Tor isn't enough. Even though Tor obscures your IP address, if you're using the same browser as you did without Tor or when you logged in to a site, the site will know it's you. Fortunately, there is a very simple way to avoid these problems. Tails (The Amnesic Incognito Live System) can be run from a USB flash drive or in a VM. Every time it starts up it is in a clean state. The browser looks the same to a Web site as every other Tails browser. Use it any time privacy is an issue, from watching pr0n to searching for medical information.
It is very sad that the responsibility for maintaining privacy rests on the shoulders of the individual, with essentially no support from the law, but everyone else finds your lack of privacy so useful and profitable that this situation isn't going to change. After all, The Panopticon Is Good For You.
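
To make the fingerprinting point concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any real tracker is implemented: the `fingerprint` function, the attribute names and all the values are hypothetical. It shows why hiding your IP address doesn't help if the browser stays the same, and why a browser that looks like every other copy of itself does.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash browser-visible attributes into a short, stable identifier.

    Hypothetical helper for illustration; real trackers combine many
    more signals than the four used below.
    """
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical attribute values, invented for this example.
everyday_browser = {
    "user_agent": "ExampleBrowser/1.0 (customized)",
    "screen": "2560x1440",
    "timezone": "America/Los_Angeles",
    "fonts": ["Comic Sans MS", "Fira Code", "Helvetica"],
}
uniform_browser = {
    "user_agent": "UniformBrowser/1.0",
    "screen": "1000x1000",
    "timezone": "UTC",
    "fonts": ["Arial", "Times New Roman"],
}

# The IP address is not an input, so routing the same browser through
# an anonymizing network leaves the identifier unchanged.
direct_visit = fingerprint(everyday_browser)
visit_via_proxy = fingerprint(dict(everyday_browser))
print(direct_visit == visit_via_proxy)

# Two different people running identically configured browsers get the
# same identifier, so neither can be singled out by it.
alice = fingerprint(uniform_browser)
bob = fingerprint(dict(uniform_browser))
print(alice == bob)
print(alice == direct_visit)
```

The first two comparisons print `True` and the last prints `False`: the customized browser is recognizable wherever it connects from, while the two uniform browsers are indistinguishable from each other.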


At The Atlantic, Arvind Narayanan and Dillon Reisman's The Thinning Line Between Commercial and Government Surveillance reports:
As part of the Princeton Web Transparency and Accountability Project, we’ve been studying who tracks you online and how they do it. Here’s why we think the fight over browsing histories is vital to civil liberties and to a functioning democracy.

Privacy doesn’t merely benefit individuals; it fundamentally shapes how society functions. It is crucial for marginalized communities and for social movements, such as the fight for marriage equality and other once-stigmatized views. Privacy enables these groups to network, organize, and develop their ideas and platforms before challenging the status quo. But when people know they’re being tracked and surveilled, they change their behavior. This chilling effect hurts our intellectual freedoms and our capacity for social progress.
They stress the effectiveness of the tracking techniques I mentioned above:
Web tracking today is breathtaking in its scope and sophistication. There are hundreds of entities in the business of following you from site to site, and popular websites embed about 50 trackers on average that enable such tracking. We’ve also found that just about every new feature that’s introduced in web browsers gets abused in creative ways to “fingerprint” your computer or mobile device. Even identical looking devices tend to behave in subtly different ways, such as by supporting different sets of fonts. It’s as if each device has its own personality. This means that even if you clear your cookies or log out of a website, your device fingerprint can still give away who you are.
And they point out that, even when the trackers are deployed by companies, governments (and ISPs) can piggy-back on them:
Worse, the distinction between commercial tracking and government surveillance is thin and getting thinner. The satirical website The Onion once ran a story with this headline: “CIA's ‘Facebook’ Program Dramatically Cut Agency's Costs.” Reality isn’t far off. The Snowden leaks revealed that the NSA piggybacks on advertising cookies, and in a technical paper we showed that this can be devastatingly effective. Hacks and data breaches of commercial systems have also become a major part of the strategies of nation-state actors.
Ironically, The Atlantic's web-site is adding tracking information to their article's URL (note the 524592) and to the attributes of the links in it.
At Gizmodo, Kashmir Hill's Uber Doesn’t Want You to See This Document About Its Vast Data Surveillance System is a deep dive into the incredibly detailed information Uber's database maintains about each and every Uber user. It is based on information briefly revealed in a wrongful termination lawsuit, before Uber's lawyers got it sealed.
For two days in October, before Uber convinced the court to seal the material, one of Spangenberg’s filings that was publicly visible online included a spreadsheet listing more than 500 pieces of information that Uber tracks for each of its users. ...

For example, users give Uber access to their location and payment information; Uber then slices and dices that information in myriad ways. The company holds files on the GPS points for the trips you most frequently take; how much you’ve paid for a ride; how you’ve paid for a ride; how much you’ve paid over the past week; when you last canceled a trip; how many times you’ve cancelled in the last five minutes, 10 minutes, 30 minutes, and 300 minutes; how many times you’ve changed your credit card; what email address you signed up with; whether you’ve ever changed your email address.
Both articles are must-reads.
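
It takes very little to turn a raw event log into fields like the cancellation counts in that list. The following is a minimal sketch in Python, assuming nothing more than a list of cancellation timestamps per user; it is not Uber's system, and `cancellation_counts` and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta

# The look-back windows named in the quoted list, in minutes.
WINDOWS_MINUTES = (5, 10, 30, 300)


def cancellation_counts(cancel_times, now):
    """Count cancellations inside each look-back window ending at `now`."""
    counts = {}
    for minutes in WINDOWS_MINUTES:
        cutoff = now - timedelta(minutes=minutes)
        counts[minutes] = sum(1 for t in cancel_times if cutoff < t <= now)
    return counts


# Hypothetical user who cancelled 2, 8, 20, 120 and 400 minutes ago.
now = datetime(2017, 5, 18, 12, 0)
cancel_times = [now - timedelta(minutes=m) for m in (2, 8, 20, 120, 400)]

counts = cancellation_counts(cancel_times, now)
print(counts)  # {5: 1, 10: 2, 30: 3, 300: 4}
```

Once the timestamps are retained, each additional "piece of information" is a few more lines like these, which is one reason such lists grow to hundreds of entries.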


David. said...

Continuing the theme of how difficult it is to protect your privacy, today's contribution is Stealing Windows credentials using Google Chrome by Bosko Stankovic:

"This article describes an attack which can lead to Windows credentials theft, affecting the default configuration of the most popular browser in the world today, Google Chrome, as well as all Windows versions supporting it."

David. said...

I'm a big fan of Maciej Cegłowski's barn-burning speeches. The one he gave at re:publica 2017 on May 10 is absolutely a must-watch for anyone concerned about privacy.

David. said...

Maciej Cegłowski's text is here. Some key quotes:

"Facebook is the dominant social network in Europe, with 349 million monthly active users. Google has something like 94% of market share for search in Germany. The servers of Europe are littered with the bodies of dead and dying social media sites. The few holdouts that still exist, like Xing, are being crushed by their American rivals.

In their online life, Europeans have become completely dependent on companies headquartered in the United States.

And so Trump is in charge in America, and America has all your data. This leaves you in a very exposed position. US residents enjoy some measure of legal protection against the American government. Even if you think our intelligence agencies are evil, they’re a lawful evil. They have to follow laws and procedures, and the people in those agencies take them seriously.

But there are no such protections for non-Americans outside the United States. The NSA would have to go to court to spy on me; they can spy on you anytime they feel like it. ... And now those corporations have to deal with Trump. How hard do you think they’ll work to defend European interests?"


"Part of this concentration is due to network effects, but a lot of it is driven by the problem of security. If you want to work online with any measure of convenience and safety, you must choose a feudal lord who is big enough to protect you."


"Each of the big five companies, with the important exception of Apple, has made aggressive user surveillance central to its business model. This is a dilemma of the feudal internet. We seek protection from these companies because they can offer us security. But their business model is to make us more vulnerable, by getting us to surrender more of the details of our lives to their servers, and to put more faith in the algorithms they train on our observed behavior.

These algorithms work well, and despite attempts to convince us otherwise, it’s clear they work just as well in politics as in commerce. So in our eagerness to find safety online, we’ve given this feudal Internet the power to change our offline world in unanticipated and scary ways."

David. said...

Google now has visibility into your offline credit-card transactions:

"Google has begun using billions of credit-card transaction records to prove that its online ads are prompting people to make purchases – even when they happen offline in brick-and-mortar stores, the company said Tuesday.

The advance allows Google to determine how many sales have been generated by digital ad campaigns, a goal that industry insiders have long described as “the holy grail” of online advertising. But the announcement also renewed long-standing privacy complaints about how the company uses personal information.

To power its multibillion-dollar advertising juggernaut, Google already analyzes users’ Web browsing, search history and geographic locations, using data from popular Google-owned apps like YouTube, Gmail, Google Maps and the Google Play store. All that information is tied to the real identities of users when they log into Google’s services.

The new credit-card data enables the tech giant to connect these digital trails to real-world purchase records in a far more extensive way than was possible before."

David. said...

A newly discovered ad-blocker-aware malvertising campaign called RoughTed is in the wild:

"Traffic comes from thousands of publishers, some ranked in Alexa's top 500 websites. Contaminated domains accumulated over half a billion visits in the past three months alone, according to security firm Malwarebytes."

Details from Malwarebytes here.

David. said...

See also Is Privacy Still a Big Deal Today? by Kartik Hosanagar and Tai Bendit from the Wharton School:

"To address the [ad-blocker] issue, the Interactive Advertising Bureau (IAB), an advertising industry organization, has proposed the LEAN advertising program.

LEAN — an acronym for light, encrypted, ad choice supported, non-invasive ads — suggests a number of guidelines aimed at protecting user privacy and improving their overall experience with interactive ads. Among the guidelines is the expectation of compliance with the Digital Advertising Alliance’s consumer privacy program. It’s too soon to tell whether LEAN will be widely adopted, but the initiative shows how consumers can take control and get the industry to take action."


David. said...

At The Register, Thomas Claburn's In detail: How we are all pushed, filed, stamped, indexed, briefed, debriefed or numbered – by online biz all day discusses a report from Cracked Labs on surveillance capitalism:

"Among poker players, it's commonly understood that if you look around the table and you don't see the sucker, it's you. The situation is the same in the knowledge economy because you can't see anything. There's almost no transparency into how data gets bought, sold, and used to make decisions that affect people's lives. When people make decisions about the information they share, they seldom understand how that data will be used or how it might affect them."

David. said...

At The Register, John Leyden's Banking websites are 'littered with trackers' ogling your credit risk discusses a report from eBlocker:

"A new study has warned that third-party trackers litter banking websites and the privacy-invading tech is being used to rate surfers' creditworthiness.

Among the top 10 financial institution websites visited in the US and UK, there are 110 third-party trackers snooping on surfers each time they visit."

David. said...

Cory Doctorow at Boing Boing points me to Paul Farrell's The Medicare machine: patient details of 'any Australian' for sale on darknet:

"The price for purchasing an Australian’s Medicare card details is 0.0089 bitcoin, which is equivalent to US$22.

Guardian Australia has verified that the seller is making legitimate Medicare details of Australians available by requesting the data of a Guardian staff member.

The darknet vendor says they are “exploiting a vulnerability which has a much more solid foundation which means not only will it be a lot faster and easier for myself, but it will be here to stay. I hope, lol.”

The listing continues: “Purchase this listing and leave the first and last name, and DOB of any Australian citizen, and you will receive their Medicare patient details in full.”

The vendor said they would soon create a “mass batch requesting of details”.

The seller is listed as a highly trusted vendor on the site and has received dozens of positive sale reviews."

David. said...

Note that what is for sale in Australia is not a patient medical record, just the details on the Medicare card, which enable identity theft.