Saturday, October 16, 2010

The Future of the Federal Depository Libraries

Governments through the ages have signally failed to resist the temptation to rewrite history. The redoubtable Emptywheel pointed me to excellent investigative reporting by ProPublica's Dafna Linzer, which reveals a disturbing current example.

By being quick to notice, and take a copy of, a new entry in an on-line court docket, Linzer was able to reveal that the Obama administration, in the name of national security, forced a Federal judge to create and enter into the court's record a misleading replacement for an opinion he had earlier issued. ProPublica's comparison of the original opinion, which Linzer copied, and the later replacement reveals that in reality the government wanted to hide the fact that its case against an alleged terrorist was extraordinarily thin, and based primarily on statements from detainees who had been driven mad or committed suicide as a result of their interrogation. Although the fake opinion comes to the same conclusion as the real one, the arguments are significantly different. Judges in other cases could be misled into relying on witnesses whom this judge discredited, because his reasons for discrediting them were removed from the replacement.

Linzer's exposé of government tampering with a court docket is an example of the problem on which the LOCKSS Program has been working for more than a decade: how to make the digital record resistant to tampering and other threats. This case was detected only because Linzer created and kept a copy of the information the government published, and this copy was not under the government's control. Maintaining copies under multiple independent administrations (i.e. not all under control of the original publisher) is a fundamental requirement for any scheme that can recover from tampering (and in practice from many other threats). Techniques such as those developed by Stuart Haber can detect tampering without keeping a copy, but cannot recover from it.
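The difference between detecting tampering and recovering from it can be made concrete with a small Python sketch. It is an illustration only, not the LOCKSS protocol or Haber's time-stamping scheme: a stored SHA-256 digest stands in for any technique that can detect a change without keeping a copy, and a simple majority vote stands in for comparing copies held under independent administrations. The function names, the custodian names and the sample texts are all hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return the SHA-256 digest of a document as a hex string."""
    return hashlib.sha256(content).hexdigest()


def is_tampered(content: bytes, recorded_digest: str) -> bool:
    """Detect a change using only a digest recorded earlier.

    This can say that the document changed, but it cannot say what
    the original was: the digest is not a copy.
    """
    return digest(content) != recorded_digest


def recover_by_majority(copies: dict[str, bytes]) -> tuple[bytes, list[str]]:
    """Recover the version held by a strict majority of custodians.

    `copies` maps a (hypothetical) custodian name to the bytes it holds.
    Returns the majority content and the sorted names of custodians whose
    copy differs. Raises ValueError if no version has a strict majority.
    """
    votes = Counter(digest(content) for content in copies.values())
    winner, count = votes.most_common(1)[0]
    if count <= len(copies) // 2:
        raise ValueError("no version is held by a majority of custodians")
    good = next(c for c in copies.values() if digest(c) == winner)
    dissenters = sorted(
        name for name, c in copies.items() if digest(c) != winner
    )
    return good, dissenters


# Hypothetical scenario: the publisher replaces a document after
# two independent custodians have already taken copies.
original = b"opinion as first issued"
replaced = b"opinion as later replaced"
recorded = digest(original)

copies = {
    "publisher": replaced,
    "library_a": original,
    "library_b": original,
}

changed = is_tampered(copies["publisher"], recorded)
recovered, dissenters = recover_by_majority(copies)
```

With only `recorded` in hand, `is_tampered` can flag the publisher's version but the original text is gone. Recovery needs the independent copies, and the vote is only meaningful if the custodians really are independent: if the publisher controlled two of the three copies, the replaced text would win.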

In the paper world, the proliferation of copies in the Federal Depository Library Program made the system somewhat tamper-resistant. A debate has been underway for some time as to the future of the FDLP in the electronic world. The Clinton and Bush administrations were hostile to copies not under their control, but the Obama administration may be more open to the idea. The critical point that this debate has reached is illuminated by an important blog post by James Jacobs. He points out that the situation since 1993 has been tamper-friendly:
    But, since 1993, when The Government Printing Office Electronic Information Access Enhancement Act (Public Law 103-40) was passed, GPO has arrogated to itself the role of permanent preservation of government information and essentially prevented FDLP libraries from undertaking that role by refusing to deposit digital materials with depository libraries.
On the other hand, GPO does now make their documents available for bulk download and efforts are under way to capture them.

The occasion for Jacobs' blog post is that GPO has contracted with Ithaka S+R to produce a report on the future of the FDLP. The fact that GPO is willing to revisit this issue is a tribute to the efforts of government document librarians, but there are a number of reasons for concern that this report's conclusions will have been pre-determined:
  • Like Portico and JSTOR, Ithaka S+R is a subsidiary of ITHAKA. It is thus in the business of replacing libraries that hold their own collections, such as the paper FDLP libraries, with libraries that act instead as a distribution channel for ITHAKA's collections. Jacobs points out:
    You might call this the "libraries without collections" or the "librarians without libraries" model. This is the model designed by GPO in 1993. It is the model that ITHAKA, the parent organization of Ithaka S+R, has used as its own business model for Portico and JSTOR. This model is favored by the Association of Research Libraries, by many library administrators who apparently believe that it would be better if someone else took the responsibility of preserving government information and ensuring its long-term accessibility and usability, and by many depository librarians who do not have the support of their institutions to build and manage digital collections.
  • Ithaka S+R is already on record as proposing a model for the FDLP which includes GPO and Portico, but not the FDLP libraries. Jacobs:
    Ithaka S+R has already written a report with a model for the FDLP (Documents for a Digital Democracy: A Model for the Federal Depository Library Program in the 21st Century). In that report, it recommended that "GPO should develop formal partnerships with a small number of dedicated preservation entities -- such as organizations like HathiTrust or Portico or individual libraries -- to preserve a copy of its materials".
  • As Jacobs points out, the FDLP libraries are devoted to free, open access to their collections. By contrast GPO is allowed to charge access fees. Charging fees for access is the basis for ITHAKA's business models.
    Where private sector companies limit access to those who pay and GPO is specifically authorized in the 1993 law to "charge reasonable fees," FDLP libraries are dedicated to providing information without charging.
  • The process by which Ithaka S+R ended up with the contract is unclear to me. Were there other bidders? If so, was their position on the future of FDLP on the record as Ithaka S+R's was? If Ithaka S+R was the only bidder, why was this?
It is important to note that although a system for preserving government documents consisting of the GPO and "formal partnerships with a small number of dedicated preservation entities" might well improve the resistance of government documents to some threats, it provides much less resistance to government tampering than the massively distributed paper FDLP. The "small number of dedicated preservation entities" dependent on "formal partnerships" with the government in the form of the GPO will be in a poor position to resist government arm-twisting aimed at suppressing or tampering with embarrassing information.


David. said...

Yesterday, the ARL came out with a set of principles for the future of FDLP that, unsurprisingly, echo the report they paid Ithaka S+R to write. In particular:

"Federal Depository Libraries are not required by law to provide long-term storage for digital Federal documents. GPO should identify and have certified one or more trusted third party repositories that are not part of the Federal government for preservation of and, when necessary, access to digital Federal documents."

It seems the major research libraries don't want to be bothered with collections any more and are happy to trust governments to refrain from tampering with the record. No prizes for guessing which "one ... trusted third party repositor[y]" they are talking about.

David. said...

Of course, it isn't only governments that succumb to the temptation to tamper with the record when it is in their custody. Barry Ritholtz points to another possible example of material disappearing from the record when it conflicts with the current party line.

In this case a December 2000 Wall St. Journal article that casts an unflattering light on a judge the WSJ is currently defending can no longer be found in Dow Jones' Factiva database.