Friday, March 3, 2017

The Amnesiac Civilization: Part 1

Those who cannot remember the past are condemned to repeat it
George Santayana: Life of Reason, Reason in Common Sense (1905)
Who controls the past controls the future. Who controls the present controls the past.
George Orwell: Nineteen Eighty-Four (1949)
Santayana and Orwell correctly perceived that societies in which the past is obscure or malleable are very convenient for ruling elites and very unpleasant for the rest of us. It is at least arguable that the root cause of the recent inconveniences visited upon ruling elites in countries such as the US and the UK was inadequate history management. Too much of the population correctly remembered a time in which GDP, the stock market and bankers' salaries were lower, but their lives were less stressful and more enjoyable.

Two things have become evident over the past couple of decades:
  • The Web is the medium that records our civilization.
  • The Web is becoming increasingly difficult to collect and preserve, which the future needs if it is to remember its past correctly.
This is the first in a series of posts on this issue. I start by predicting that the problem is about to get much, much worse. Future posts will look at the technical and business aspects of current and future Web archiving. This post is shorter than usual to focus attention on what I believe is an important message.

In a 2014 post entitled The Half-Empty Archive I wrote, almost as a throw-away:
The W3C's mandating of DRM for HTML5 means that the ingest cost for much of the Web's content will become infinite. It simply won't be legal to ingest it.
The link was to a post by Cory Doctorow in which he wrote:
We are Huxleying ourselves into the full Orwell.
He clearly understood some aspects of the problem caused by DRM on the Web:
Everyone in the browser world is convinced that not supporting Netflix will lead to total marginalization, and Netflix demands that computers be designed to keep secrets from, and disobey, their owners (so that you can’t save streams to disk in the clear).
Two recent developments got me thinking about this more deeply, and I realized that neither I nor, I believe, Doctorow comprehended the scale of the looming disaster. It isn't just about video and the security of your browser, important as those are. Here it is in as small a nutshell as I can devise.

Almost all the Web content that encodes our history is supported by one or both of two business models: subscription, or advertising. Currently, neither model works well. Web DRM will be perceived as the answer to both. Subscription content, not just video but newspapers and academic journals, will be DRM-ed to force readers to subscribe. Advertisers will insist that the sites they support DRM their content to prevent readers running ad-blockers. DRM-ed content cannot be archived.

Imagine a world in which archives contain no subscription and no advertiser-supported content of any kind.

Update: the succeeding posts in the series are:


  1. I'll take the other side of this bet.

    The difference is that Hollywood movies have always had DRM while academic works and random Web pages haven't.

  2. Wes, I'm not a betting man. But if I were, you would lose. Up to now, there hasn't been a good way to DRM journals and random Web pages, so the fact that they aren't DRM-ed now doesn't mean that they won't be.

    Academic journals are freaking out about Sci-Hub. Once DRM is available for the Web, academic journals will use it to kill off Sci-Hub.

    Once DRM for the Web is available, sites that sell space to advertising networks will have a huge incentive to DRM their pages. Right now they're losing income all the time from ad blockers. DRM-ing their pages blocks ad blockers.

  3. Dear David,

    While agreeing with you in general, I feel that you over-dramatize the problem a bit.

    First, anything can be archived somehow, especially if the archiving agent is legally authorized to do so. In the absence of such authorization, archiving might be patchy and outside the control of official archiving services – but it will still happen.

    Second, there is a countermeasure for any technical measure. Any DRM can and will be broken. This “arms race” will probably never stop :) Anything that reaches a computer’s screen or speakers can be captured – by an ordinary photo or video camera, if need be.

    With my best wishes to you,

  4. Natasha, experience with the LOCKSS system (which archives Web content with permission) and at the British Library (which archives Web content with authority) shows that these approaches do not scale.

    Neither do approaches based on breaking DRM (because institutions large enough to scale are legally vulnerable) or on using the equivalent of cameras (which also degrade the usefulness of captured content).

    The value of the Web does NOT lie in the individual Web pages. The value of the Web is based on network effects. Approaches that capture a few individual Web pages are not a way of remembering society's past.

  5. The Royal Library of Denmark has a legal mandate to collect software and has the right to remove any copy protection (DRM), so they are legally allowed to "crack" software.

    It seems that jurisdictions like Denmark will have an advantage here over jurisdictions like the US, where such an idea appears to be heresy. Let's hope Scandinavia won't geo-fence its web sites :)

  6. Cliff Lynch takes a much broader look at these issues in Stewardship in the "Age of Algorithms", a really important essay. Stay tuned for a post about it.