Tuesday, October 8, 2019

The Data Isn't Yours (updated)

Most discussions of Internet privacy, for example Jaron Lanier Fixes the Internet, systematically elide the distinction between "my data" and "data about me". In doing so they systematically exaggerate the value of "my data".

The typical interaction that generates data about an Internet user involves two parties, a client and a server. Both parties know what happened (a link was clicked, a purchase was made, ...). This isn't "my data", it is data shared between the client ("me") and the server. The difference is that the server can aggregate the data from many interactions and, by doing so, create something sufficiently valuable that others will pay for it. The client, who holds only "my data", cannot.

Below the fold, an update.

Update 26th October 2019: Lizzie O'Leary's interview with Yael Eisenstat is fascinating for many reasons. Eisenstat is ex-CIA, ex-national security adviser to Vice President Joe Biden, ex-head of Global Elections Integrity Operations for Facebook. The part that is relevant here is where she says:
"But none of the real core issues will be solved before 2020, which is, in my opinion, a business model that exploits human behavioral data in order to sell this idea to advertisers that they can so custom-target individuals, and show us each a different version of truth based on what they have figured out about us. I mean, I know I’m going on a whole thread here, but based on a business model whose entire metric is about user engagement and keeping your eyes on their screen so that they can Hoover up all this data so that they can sell this to advertisers. That is what is rewarding the most salacious content. That is what is rewarding the biggest clickbait stories. That is why—and I assume this is even happening in political advertising—the most salacious content is what’s going to grab the most people’s attention, and their algorithms are all about figuring out how to keep you engaged."
The point is that the data Facebook Hoovers up and sells isn't "my data". It is the record of the user's interaction with Facebook: links clicked, posts liked, images and text uploaded. This is all data about you that is necessarily shared with Facebook. In isolation, the data an individual user shares with Facebook isn't worth much. What makes Facebook so valuable and scary is that they aggregate this kind of data about everyone, including people who never use Facebook directly but just visit sites that work with Facebook. It is the aggregation that is valuable, not the individual data items.

Political advertisers won't come to me to buy my data. They aren't interested in me as an individual. They are interested in buying access to a set of people with specified attributes, which they can only do from an aggregation.
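
To make the asymmetry concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the user names, the attribute strings and the helper functions are invented for illustration and do not describe how Facebook or any real ad platform works. It only shows that a query of the form "everyone with these attributes" can be answered from the server's aggregate but not from any one client's records.

```python
from collections import defaultdict

# Hypothetical interaction log as the server sees it: (user, attribute) pairs.
# All names and attributes are made up for illustration.
EVENTS = [
    ("alice", "clicked:gardening"),
    ("alice", "liked:local-news"),
    ("bob", "clicked:gardening"),
    ("carol", "liked:local-news"),
    ("carol", "clicked:gardening"),
]


def aggregate(events):
    """Server side: fold many interactions into one attribute set per user."""
    profiles = defaultdict(set)
    for user, attribute in events:
        profiles[user].add(attribute)
    return profiles


def audience(profiles, required):
    """What an advertiser buys: the set of users with all required attributes."""
    required = set(required)
    return sorted(user for user, attrs in profiles.items() if required <= attrs)


def client_view(events, user):
    """Client side: a user only ever sees their own interactions."""
    return [event for event in events if event[0] == user]


if __name__ == "__main__":
    profiles = aggregate(EVENTS)
    print(audience(profiles, ["clicked:gardening", "liked:local-news"]))
    print(client_view(EVENTS, "bob"))
```

The audience query needs the profiles of every user at once. Bob's own view contains a single record about Bob, which is no use to someone who wants to reach a group.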

1 comment:

David. said...

I'm shocked, shocked to find surveillance going on here. In I Got Access to My Secret Consumer Score. Now You Can Get Yours, Too. by Kashmir Hill, the New York Times plays Captain Renault about companies collecting "your data" and trading it amongst themselves.

The instructions for getting a copy of what various services have about you are useful. As usual, Europe and California are leading the way:

"Most of the companies only recently started honoring these requests in response to the California Consumer Privacy Act. Set to go into effect in 2020, the law will grant Californians the right to see what data a company holds on them. It follows a 2018 European privacy law, called General Data Protection Regulation, that lets Europeans gain access to and delete their online data. Some companies have decided to honor the laws’ transparency requirements even for those of us who are not lucky enough to live in Europe or the Golden State."