Thursday, October 15, 2015

A Pulitzer is no guarantee

Bina Venkataraman points me to Adrienne LaFrance's piece Raiders of the Lost Web at The Atlantic. It is based on an account of last month's resurrection of The Crossing, a 34-part, Pulitzer-winning newspaper investigation from 2007 into the aftermath of a 1961 railroad crossing accident in Colorado. The series vanished from the Web when The Rocky Mountain News folded, and survived only because Kevin Vaughan, the reporter, kept a copy on DVD-ROM.

Doing so likely violated copyright. And even though The Crossing was not an "orphan work", getting permission to republish it took years:
in 2009, the year the paper went under, Vaughan began asking for permission—from the [Denver Public] library and from E.W. Scripps, the company that owned the Rocky—to resurrect the series. After four years of back and forth, in 2013, the institutions agreed to let Vaughan bring it back to the web.
Four years, plus another two to do the work. Imagine how long it would have taken had the story actually been orphaned. Vaughan also just missed another copyright problem:
With [ex-publisher John] Temple’s help, Vaughan got permission from the designer Roger Black to use Rocky, the defunct newspaper’s proprietary typeface.
This is the orphan font problem that I've been warning about for the last six years. There is also a problem with the resurrected site:
It also relied heavily on Flash, once-ubiquitous software that is now all but dead. “My role was fixing all of the parts of the website that had broken due to changes in web standards and a change of host,” said [Kevin's son] Sawyer, now a junior studying electrical engineering and computer science. “The coolest part of the website was the extra content associated with the stories... The problem with the website is that all of this content was accessible to the user via Flash.”
It still is. Soon, accessing the "coolest part" of the resurrected site will require a virtual machine with a legacy browser.

There is a problem with the article. It correctly credits the Internet Archive with its major contribution to Web archiving, and analogizes it to the Library of Alexandria. But it fails to mention any of the other Web archives and, unlike Jill Lepore's New Yorker "Cobweb" article, doesn't draw the lesson from the analogy. Because the Library of Alexandria was by far the largest repository of knowledge in its time, its destruction was a catastrophe. The Internet Archive is by far the largest Web archive, but it is uncomfortably close to several major faults. And backing it up seems to be infeasible.


Ed Summers said...

Maybe I didn't read between the lines closely enough, but Lepore's article, as good as it is, doesn't seem to explicitly draw the lesson from the analogy. Not to toot my own horn (too loudly), but I did.

Nick Krabbenhoeft said...

The Atlantic article says that the Denver Library acquired all the print archives of the Rocky Mountain News, but other coverage of that transfer says the Library took custody of almost all content, except "the IP address and the logo."

That article includes this quote, "Now, the library staff will be busy indexing photos and clips, figuring out what to do with the content from the old website and sifting through computer files filled with photos and video," but so far only the photograph archives are available.

It's reminiscent of the LC Twitter archive. They have the content, but the financial, time, and staffing investment to curate it and make it accessible is enormous.