Tuesday, August 2, 2016

Cameron Neylon's "Squaring Circles"

Cameron Neylon's Squaring Circles: The economics and governance of scholarly infrastructures is an expanded version of his excellent talk at the JISC-CNI workshop. Below the fold, some extracts and comments, but you should read the whole thing.

Neylon starts by identifying the three possible models for the sustainability of scholarly infrastructures:
Infrastructures for data, such as repositories, curation systems, aggregators, indexes and standards are public goods. This means that finding sustainable economic models to support them is a challenge. This is due to free-loading, where someone who does not contribute to the support of the infrastructure nonetheless gains the benefit of it. The work of Mancur Olson (1965) suggests there are only three ways to address this for large groups: compulsion (often as some form of taxation) to support the infrastructure; the provision of non-collective (club) goods to those who contribute; or mechanisms that change the effective number of participants in the negotiation.
He points out the issues around club goods:
In the case of digital infrastructures a public good (such as an online article or dataset) can be converted to a club good (made excludable) by placing an authentication barrier around it to restrict access to subscribers (as is the case for online subscription journals and databases). Buchanan and those that further developed his 1965 paper on the economics of clubs have probed how club goods and club size relate (Buchanan, 1965). A core finding is that such sustainable clubs have an equilibrium size that depends on congestion in access to good (the extent to which it is purely non-rivalrous) and the value it provides. With digital resources congestion is low, and the club can therefore grow large. This creates a challenge. Digital resources are not natively excludable, a technical barrier has to be put in place. As the group size rises the likelihood of "leakage" (sharing, or piracy if you prefer) increases. Thus resources are expended on strengthening excludability which leads to both economic and political costs as seen in the Open Access debate.
Clearly, the waste of resources caused by implementing excludability for scholarly communication is immense. It starts with the many billions of dollars of excess rent extracted by the commercial publishers in the form of profit, but it doesn't stop there. To that must be added the resources these publishers spend on marketing, on sales, on maintaining their subscription management and enforcement systems, and on anti-piracy measures. All of these costs decrease the value to society of the underlying public goods.

But that isn't the end of the waste. Because subscriptions are so expensive, the publishers' customers spend large staff resources negotiating with the publishers to try to reduce them. In many countries entire organizations are devoted to this task. And the waste doesn't end when licenses are agreed. As Barbara Fister points out, the customers then spend resources acting as unpaid police enforcing the terms of the licenses.

Neylon elaborates:
Infrastructures, such as repositories for data, articles and code, are very close to the ideal of public goods. Mancur Olson in The Logic of Collective Action (1965) discusses how group size has a profound influence on the provision of public goods, in particular noting that provision is only possible for small groups, or where the public good is a byproduct of the provision of non-public goods that are provided to contributors. Indeed Olson's description of the groups that can and cannot provide public goods maps closely onto scholarly infrastructures. ... The transition from small to large is challenging and "medium" sized infrastructures struggle to survive, moving from grant to grant, and in many cases shifting to a subscription model.
The LOCKSS Program has sustained itself since 2007 partly by using the "Red Hat" model of free, open-source software and paid support, and partly by operating the CLOCKSS Archive under contract to a separate non-profit. Publishers pay for their content to be preserved and libraries pay to support this non-profit. Diversity in business models is important, and not something that Neylon discusses. While these models have sustained the LOCKSS Program, they have limited its ability to scale up to address the whole problem space. Alternate approaches have been equally unable to scale.

Neylon observes that:
Subscription and membership models such as those used for online subscription journals and for some data infrastructures have been our traditional model and are an example of the second approach. These models are breaking down as the technology of the web and the agenda for transparency and open access leads to unbundling, the separation of the different services being provided. This tends to mean commercial suppliers focus on club and private good provision and neglect public good provision. Addressing this will require the development of support models more like taxation. However systems of taxation require a shared - and ideally globally shared - sense of the principles of governance and resource distribution.
Our experience would suggest that although open access (if it uses Creative Commons licenses) significantly reduces the cost of preserving the scholarly literature, it reduces the motivation to subscribe to its preservation even more. This can be seen in the difficulty current archives have in preserving the output of the "long tail" of smaller publishers. To the extent that the open access model succeeds, breakdown is likely.

Neylon writes:
Membership models can work in those cases where there are club goods being created which attract members. Training experiences or access to valued meetings are possible examples. In the wider world this parallels the "Patreon" model where members get exclusive access to some materials, access to a person (or more generally expertise), or a say in setting priorities. Much of this mirrors the roles that Scholarly Societies play or at least could play.
Usenix is an example of a society that has successfully transitioned from charging for publications to running meetings, whose proceedings are available to attendees beforehand and to all afterwards. An annual membership is, in effect, buried in the meeting price.

Neylon argues that:
the focus on sustainability models prior to seeking a set of agreed governance principles is the wrong approach. Rather we need to understand how to navigate from club-like to public-like goods. We need to define the communities that contribute and identify club-like benefits for those contributors. We need interoperable principles of governance and resourcing to provide public-like goods and we should draw on the political economics of taxation to develop this.
One form of governance model exists already. Funding agencies can place conditions on the funds they supply to scholars, and they have increasingly been doing so. The UK government has mandated that all papers submitted for the next REF (Research Excellence Framework) be open access, and this has transformed UK scholars' attitudes to open access.
The core of the policy is that journal articles and conference proceedings must be available in an open-access form to be eligible for the next REF. In practice, this means that these outputs must be uploaded to an institutional or subject repository.
Imagine that the requirement for the succeeding round were that the paper, the data and the software all be freely available from a repository maintained by the university. Universities routinely tax their scholars for infrastructure, and if maintaining a repository and requiring scholars to deposit their work in it were a condition of future funding, universities would be highly motivated to tax for this purpose. As a result of its unhappiness with article processing charges, the Wellcome Trust recently established, in cooperation with Faculty of 1000, its own open access publishing platform. Conditioning its research grants on publication of the results via that platform would be feasible and revolutionary.

Neylon concludes:
First we can make a prediction to be tested: All sustainable scholarly infrastructures providing collective (public-like) goods to the research community will be funded on one of the three identified models (taxation, byproduct, oligopoly) or some combination of them.
Second, we can look at stable long standing infrastructures (Crossref, Protein Data Bank, NCBI, arXiv, SSRN) and note that in most cases governance arrangements are an accident of history and were not explicitly planned. Crises of financial sustainability (or challenges of expansion) for these organisations are often coupled to or lead to a crisis in governance, and in some cases a breakdown of community trust. Changes are therefore often made to governance in response to a specific crisis.

Where there is governance planning it frequently adopts a "best practice" model which looks for successful examples to draw from. It is not often based on "worst case scenario" planning. We suggest that this is a problem. We can learn as much from failures of sustainability and their relationship to governance arrangements as from successes.
The reference to SSRN is interesting given Neylon's earlier post, Canaries in the Elsevier Mine: What to watch for at SSRN. It asks what the history of governance changes at Mendeley after Elsevier purchased it tells us to look for at SSRN, now that Elsevier has purchased it too:
The elements that I have argued were lost after Mendeley was purchased.
  1. Advocacy: SSRN always occupied a quite different space in its disciplinary fields to that of Mendeley, and has never had a strong policy/advocacy stance. Nonetheless look for shifts in policy or narrative that align with the STM Article Sharing Policy or other policy initiatives driven from within Elsevier. Particularly in the light of recent developments with the Cancer Moonshot in the US look for efforts to use SSRN to show that "these disciplines are different to science/medicine".
  2. Data: SSRN doesn't have an API and access to data on usage is moderately restrictive already. One way for SSRN and Elsevier to prove me wrong is to make a covenant on releasing data or even better for them to build a truly open API. In the meantime watch for changes in terms of use on the pages that provides the rankings. When it is updated with the major site refresh that is almost certainly coming check for the tell tale signs of obfuscation to make it hard to scrape. These would be the signs of a gradual effort to lock down the data.
  3. Redirection: This is the big one. SSRN is a working paper repository. That is its central function. In that way it is different to Mendeley where you could always argue that the public access element was a secondary function. Watch very closely for when (not if) links to publisher versions of articles appear. Watch how those are presented and whether there is a move towards removing versions that might (perhaps) relate closely to a publisher version. Ask the question: is the fundamental purpose of this repository changing based on the way it directs the attention of users seeking content?
The last question seems already to have been answered by SSRN's removal of papers on bogus copyright grounds. Of course, both Mendeley and SSRN were for-profit operations. But both started out providing free infrastructure services to the research community. Clearly, doing so as a by-product of commercial operations poses very significant risks.