Friday, September 2, 2011

What's Wrong With Research Communication

The recent Dagstuhl workshop on the Future of Research Communication has to produce a report. I was one of those tasked with a section of the report entitled What's Wrong With The Current System?, intended to motivate later sections describing future activities aimed at remedying the problems. Below the fold is an expanded version of my draft of the section, not to be attributed to the workshop or any other participant. Even my original draft was too long; in the process of getting to consensus, it will likely be cut and toned down. I acknowledge a debt to the very valuable report of the UK House of Commons' Science & Technology Committee entitled Peer review in scientific publications.


The current research communication system is dysfunctional in many ways. One way to examine these dysfunctions is from the perspective of the various participants in the system:

  • The General Public
  • Researchers
  • Libraries, Archives & Repositories
  • Publishers
  • Software & Infrastructure Developers
The General Public

The general public needs to be able to extract reliable information from the deluge of mostly ill-informed, self-serving or commercial messages that forms their information environment. They have been educated to believe that content branded "peer-reviewed" is a gold standard on which they can rely. It would be in the public interest if it were reliable, but high-profile examples show this isn’t always the case. For example, it took 12 years before the notorious Wakefield paper linking the MMR vaccine to autism was retracted, and another 11 months before the full history of the fraud was revealed. The delay had serious effects on public health; UK MMR uptake went from about 90% to about 65%.

The additional quality denoted by the "peer-reviewed" brand has been decreasing:
“False positives and exaggerated results in peer-reviewed scientific studies have reached epidemic proportions in recent years.”
One major cause has been the advent of the Internet, which, by reducing the cost of distribution, encouraged publishers to switch libraries from subscriptions to individual journals to the "big deal", in which a library pays a single subscription for access to all of a publisher's content. In the world of the big deal, many publishers discovered the effectiveness of this Microsoft-like "bundling" strategy. By proliferating cheap, low-quality journals, thus inflating the perceived value of their deal to the librarians, they could grab more of the market. This intuitive conclusion is supported by detailed economic analysis of the "big deal":
"Economists are familiar with the idea that a monopoly seller can increase its profits by bundling. This possibility was discussed by W.J. Adams and Janet Yellen and by Richard Schamalensee. Hal Varian noted that academic journals are well suited for this kind of bundling. Mark Armstrong and Yannis Bakos and Erik Brynjolfsson demonstrated that bundling large collections of information goods such as scholarly articles will not only increase a monopolist's profits, but will also decrease net benefits to consumers."
Researchers cooperated with the proliferation of journals. They were seduced by extra opportunities to publish and extra editorial board slots. They did not see the costs, which were paid by their librarians or funding agencies. The big deal deprived librarians of their economic ability to reward high quality journals and punish low quality journals:
“Libraries find the majority of their budgets are taken up by a few large publishers,” says David Hoole, director of brand marketing and institutional relations at [Nature Publishing Group]. “There is [therefore] little opportunity [for libraries] to make collection decisions on a title-by-title basis, taking into account value-for-money and usage.”
The inevitable result of stretching the "peer-reviewed" brand in this way has been to devalue it. Almost anything, even commercial or ideological messages, can be published under the brand:
“BIO-Complexity is a peer-reviewed scientific journal with a unique goal. It aims to be the leading forum for testing the scientific merit of the claim that intelligent design (ID) is a credible explanation for life.”
Authors submit papers repeatedly, descending the quality hierarchy until they find a channel with lax enough reviewing to accept them. PLoS ONE publishes every submission that meets its criteria for technical soundness; despite this, 40% of the submissions it rejects as unsound are eventually published elsewhere. Even nonsense can be published if page charges are paid.

Researchers

Researchers play many roles in the flow of research communication, as authors, reviewers, readers, reproducers and re-users. The evaluations that determine their career success are, in most cases, based on their role as authors of papers (for scientists) or books and monographs (for humanists), to the exclusion of their other roles. In the sciences, credit is typically based on the number of papers and the "impact factor" of the journal in which they appeared. Journal impact factor is generally agreed to be a seriously flawed measure of the quality of the research described by a paper. Although impact factors are based on citation counts for their articles, journal impact factors do not predict article citation counts, which are in any case easily manipulated. For example, a citation pointing out that an article had been retracted acts to improve the impact factor of the journal that retracted it. A further problem in some areas of science is that experiments require large teams, and thus long lists of authors, making it hard to assign credit to individuals on the basis of their partial authorship.
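The impact factor calculation itself is simple enough to sketch. Here is a minimal illustration of the standard two-year formula (the counts are invented, not data from any real journal), showing why even a citation that merely notes a retraction nudges the factor upward rather than penalizing the journal:

```python
# Minimal sketch of the two-year journal impact factor.
# The counts below are invented for illustration only.

def impact_factor(citations_to_prev_two_years, citable_items_prev_two_years):
    """Citations received in year Y to items published in Y-1 and Y-2,
    divided by the number of citable items published in Y-1 and Y-2."""
    return citations_to_prev_two_years / citable_items_prev_two_years

citable_items = 200        # articles the journal published in Y-1 and Y-2
ordinary_citations = 500   # citations to those articles during year Y
retraction_notices = 10    # citations that merely point out a retraction

# A citation noting a retraction is still a citation, so it raises the factor.
print(impact_factor(ordinary_citations, citable_items))                       # 2.50
print(impact_factor(ordinary_citations + retraction_notices, citable_items))  # 2.55
```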

Peer review depends on reviewers, who are only very indirectly rewarded for their essential efforts. The anonymity of reviews makes it impossible to build a public reputation as a high-quality reviewer. If articles had single authors and averaged three reviewers, each author would need to do an average of three reviews per submission. Multiple authorship reduces this load, so if it were evenly distributed it would be manageable (a back-of-the-envelope sketch follows the quotation below). In practice, the distribution is heavily skewed, loading some reviewers enough to interfere with their research:
“Academic stars are unlikely to be available for reviewing; hearsay suggests that sometimes professors ask their assistants or PhD students to do reviews which they sign! Academics low down in the pecking order may not be asked to review. Most reviews are done by academics in the middle range of reputation and specifically by those known to editors and who have a record of punctuality and rigour in their reviews: the willing and conscientious horses are asked over and over again by overworked and—sometimes desperate—editors.”
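To put rough numbers on this, here is a back-of-the-envelope sketch in which every parameter is invented: even though the average burden per authorship is modest, concentrating requests on a minority of "willing horses" gives them many times the average load.

```python
# Back-of-the-envelope sketch of reviewer load; every parameter is invented.
import random

random.seed(1)
papers = 1000
reviewers_per_paper = 3
authors_per_paper = 4

# If the load were spread evenly, each author would owe this many reviews
# for every paper they co-author.
print("reviews owed per authorship if spread evenly:",
      reviewers_per_paper / authors_per_paper)  # 0.75

# Editors lean on a minority of "willing horses": assume 10% of a pool of
# 2,000 potential reviewers receive 70% of all requests.
pool = list(range(2000))
favourites, others = pool[:200], pool[200:]
load = {r: 0 for r in pool}
for _ in range(papers * reviewers_per_paper):
    group = favourites if random.random() < 0.7 else others
    load[random.choice(group)] += 1

average = sum(load.values()) / len(pool)
favourite_average = sum(load[r] for r in favourites) / len(favourites)
print(f"average load across the pool: {average:.1f} reviews")   # ~1.5
print(f"average load on the favourites: {favourite_average:.1f} reviews")  # ~10.5
```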
The cost is significant:
“In 2008, a Research Information Network report estimated that the unpaid non-cash costs of peer review, undertaken in the main by academics, is £1.9 billion globally each year.”
Reviewers rarely have access to the raw data and enough information on methods and procedures to be able to reproduce the results, even if they had adequate time and resources to do so. The lack of credit for thorough reviews means there is little motivation to perform them. Reviewers are thus in a poor position to detect falsification or fabrication. Experimental evidence suggests that they aren’t even in a good position to detect significant errors:
“Indeed, an abundance of data from a range of journals suggests peer review does little to improve papers. In one 1998 experiment designed to test what peer review uncovers, researchers intentionally introduced eight errors into a research paper. More than 200 reviewers identified an average of only two errors. That same year, a paper in the Annals of Emergency Medicine showed that reviewers couldn't spot two-thirds of the major errors in a fake manuscript. In July 2005, an article in JAMA showed that among recent clinical research articles published in major journals, 16% of the reports showing an intervention was effective were contradicted by later findings, suggesting reviewers may have missed major flaws.”
Peer review is often said to be the gold standard of science, but this is not the case. The gold standard in experimental science is reproducibility, ensuring that anyone repeating the experiment gets the same result. When even a New York Times op-ed points out that, in practice, scientists almost never reproduce published experiments, it is clear that there is a serious problem. Articles in high-impact journals are regularly retracted; there is even a blog tracking retractions. Lower-impact journals retract articles less frequently, but this probably reflects the lesser scrutiny that their articles receive rather than a lower rate of error. These retractions are rarely based on attempts to reproduce the experiments in question. Researchers are not rewarded for reproducing previous experiments; causing a retraction does not normally count as a publication, and it can be impossible to publish refutations:
“Three teams of scientists promptly tried to replicate his results. All three teams failed. One of the teams wrote up its results and submitted them to [the original journal]. The team's submission was rejected — but not because the results were flawed. As the journal’s editor [explained], the journal has a longstanding policy of not publishing replication studies. “This policy is not new and is not unique to this journal,” he said. As a result, the original study stands.”
The lack of recognition for reproducing experiments is the least of the barriers to reproducibility. Publications only rarely contain all the information an independent researcher would need in order to reproduce the experiment in question:
“The article summarises the experiment ... - the data are often missing or so emasculated as to be useless. It is the film review without access to the film.”
Quite apart from the difficulty of reproducing the experiment, this frequently prevents other researchers from re-using the techniques in future, related experiments. Isaac Newton famously "stood on the shoulders of giants"; it is becoming harder and harder for today's researchers to stand on their predecessors' shoulders.

Re-using the data that forms the basis of a research communication is harder than it should be. Even in the rare cases when the data is part of the research communication, it typically forms "supplementary material" whose format and preservation are inadequate. In other cases the data are in a separate data repository, tenuously linked to the research communication. Data submissions are only patchily becoming citable via DOIs.
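Where a dataset does have a DOI, its citation metadata can at least be retrieved mechanically. A minimal sketch, assuming the DOI resolver's content-negotiation service is available for the registry in question and using a placeholder identifier (substitute a real dataset DOI before running; the placeholder will simply return an error):

```python
# Minimal sketch of machine-actionable data citation via a DOI.
# Assumes the DOI resolver supports content negotiation for this DOI;
# the identifier below is a placeholder, not a real dataset.
import requests

DATASET_DOI = "10.1234/example-dataset"  # placeholder, replace with a real DOI

# Ask the resolver for citation metadata rather than the landing page.
resp = requests.get(
    f"https://doi.org/{DATASET_DOI}",
    headers={"Accept": "application/vnd.citationstyles.csl+json"},
    allow_redirects=True,
    timeout=30,
)
resp.raise_for_status()
metadata = resp.json()
print(metadata.get("title"), metadata.get("publisher"), metadata.get("issued"))
```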

Scholars have been complaining of information overload for more than a century. Online access provides much better discovery and aggregation tools, but these tools struggle against the fragmentation of research communication caused by the rapid proliferation of increasingly specialized and overlapping journals with decreasing quality of reviewing.

Libraries, Archives & Repositories

Libraries used to play an essential role in research communication. They purchased and maintained local collections of journals, monographs and books, reducing the latency and cost of access to research communications for researchers in the short term. As a side effect of doing so, they safeguarded access for scholars in the long term. A large number of identical copies in independently managed collections provided a robust preservation infrastructure for the scholarly record.

The transition to the Web as the medium for scholarly communication has ended the role of local library collections in the access path to the flow of research communication in the short term. In many countries, such as the US, libraries (sometimes in consortia) retain their role as the paying customers of the publishers. In other countries, such as the UK, negotiations as to the terms of access and payment for it are now undertaken at a national level. But neither arrangement provides librarians much ability to be discriminating customers of individual journals, because both are subject to the "big deal". Libraries bought into the big deal despite warnings from a few perceptive librarians who saw the threat:
"Academic library directors should not sign on to the Big Deal or any comprehensive licensing agreement with commercial publishers ... the Big Deal serves only the Big Publishers ... increasing our dependence on publishers who have already shown their determination to monopolize the marketplace"
Libraries and archives have been forced to switch from purchasing a copy of the research communications of interest to their readers, to leasing access to the publisher's copy. Librarians did not find publishers’ promises of “perpetual access” to the subscribed materials convincing as a replacement for libraries’ role as long-term stewards of the record. Two approaches to this problem of long-term access have emerged:
  • A single third-party subscription archive called Portico. Portico collects and preserves a copy of published material. Libraries subscribe to Portico and, as long as their subscription continues, retain access to material they previously subscribed to but have since cancelled. Portico has been quite widely adopted, despite not actually implementing a solution to the problem of post-cancellation access (logically, it is a second instance of the same problem), but has yet to achieve economic sustainability.
  • A distributed network of local library collections called LOCKSS (Lots Of Copies Keep Stuff Safe), modeled on the way libraries work in the paper world. Publishers grant permission for LOCKSS boxes at subscribing libraries to collect and preserve a copy of the content to which they subscribe. Fewer libraries are using the LOCKSS system to build collections than subscribe to Portico for post-cancellation access. Despite this, the LOCKSS program has been financially sustainable since 2007.
Bereft of almost all their role in the paper world, libraries are being encouraged both to compete in the electronic publishing market and to take on the task of running "institutional repositories", in effect publishing their scholars' data and research communications. Both tasks are important; neither has an attractive business model. Re-publishing an open access version of their scholars' output may seem redundant, but it is essential if the artificial barriers that intellectual property restrictions have erected against data-mining and other forms of automated processing are to be overcome.

In many fields the volumes of data to be published, and thus the costs of doing so, are formidable:
“Adequate and sustained funding for long-lived data collections ... remains a vexing problem ... the widely decentralized and nonstandard mechanisms for generating data ... make this problem an order of magnitude more difficult than our experiences to date ...”
In many cases these vast collections of data are the output of scholars at many institutions, so the motivation for any individual institution to expend the resources needed to publish them is weak. The business models for subject repositories are fragile; the UK's Arts and Humanities Data Service failed when its central funding was withdrawn, and arXiv.org's finances are shaky at best. A “Blue Ribbon Task Force” recently addressed the issue of sustainable funding for long-term access to data; its conclusions were not encouraging.

Publishers

Academic publishing is a multi-billion dollar business. For at least some of the large publishers, both for-profit and not-for-profit, it is currently extraordinarily lucrative:
  • Reed Elsevier's academic and medical division's 2010 results show revenues of $3160M and pre-tax profits of $1130M, which is a gross margin of 36%. The parent company's tax rate was 24%. Assuming that this applies uniformly across divisions, the Elsevier division made $868M net profit. In other words, 27.5 cents of every dollar in subscription payments went directly to Reed Elsevier's shareholders (the arithmetic is sketched after this list).
  • Wiley is smaller but just as lucrative. Their 2010 results show their academic publishing division had revenues of $987M and pre-tax profit of $405M, a gross margin of 41%. The parent company's tax rate is 31%. On the same assumption the net profit is $280M; 28 cents of every dollar of subscription revenue goes directly to Wiley's shareholders (see note at end).
  • Springer's 2008 results (the most recent available) are harder to interpret but by my computation the corresponding numbers are revenue $949M, pre-tax profit $361M, gross margin 38%, tax rate 11%, net profit $328M. About 34 cents of their subscription dollar flows to shareholders.
  • The American Chemical Society is not for profit, but so lucrative that working chemists are annoyed. It had 2009 revenues of $460M, and paid its executives lavishly, at least compared to the salaries of chemists.
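The arithmetic behind these bullets follows one pattern. Here is a minimal sketch using the Reed Elsevier figures quoted above; the uniform tax-rate assumption is the one stated in the text, and because the 24% rate is itself rounded the result lands slightly below the $868M quoted.

```python
# Sketch of the margin arithmetic used for the publishers above, using the
# Reed Elsevier figures quoted in the first bullet. The uniform tax-rate
# assumption is the one stated in the text; the 24% rate is rounded, so the
# net profit here differs slightly from the $868M figure in the post.
revenue_m = 3160          # academic & medical division revenue, $M (2010)
pretax_profit_m = 1130    # division pre-tax profit, $M
parent_tax_rate = 0.24    # parent company's (rounded) overall tax rate

gross_margin = pretax_profit_m / revenue_m               # ~0.36
net_profit_m = pretax_profit_m * (1 - parent_tax_rate)   # ~$859M
shareholder_cents = 100 * net_profit_m / revenue_m       # ~27 cents per dollar

print(f"gross margin: {gross_margin:.0%}")
print(f"net profit:   ${net_profit_m:.0f}M")
print(f"cents of each subscription dollar to shareholders: {shareholder_cents:.1f}")
```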
Despite this cornucopia of cash, the big publishers' search for additional revenue has become extreme:
“At the beginning of 2011, researchers in Bangladesh, one of the world’s poorest countries, received a letter announcing that four big publishers would no longer be allowing free access to their 2500 journals through the Health InterNetwork for Access to Research Initiative (HINARI) system. It emerged later that other countries are also affected.”
The world's research and education budgets pay these three companies about $3.2B/yr for management, editorial and distribution services. Over and above that, the world's research and education budgets pay the shareholders of these three companies almost $1.5B/yr for the privilege of reading the results of research whose performance, writing and reviewing these budgets already paid for.

The over three billion dollars a year might be justified if the big publishers' journals were of higher quality than those of competing not-for-profit publishers, but:
"Surveys of [individual] journal [subscription] pricing ... show that the average price per page charged by commercial publishers is several times higher than that which is charged by professional societies and university presses. These price differences do not reflect differences in quality. If we use citation counts as a measure of journal quality ... we see that the prices charged per citation differ by an even greater margin."
It is hard to justify the one and a half billion dollars a year on any basis. These numbers demonstrate that the three big publishers have effective monopoly power in their market:
"... despite the absence of obvious legal barriers to entry by new competing journals. Bergstrom argues that journals achieve monopoly power as the outcome of a “coordination game” in which the most capable authors and referees are attracted to journals with established reputations. This market power is sustained by copyright law, which restricts competitors from selling “perfect substitutes” for existing journals by publishing exactly the same articles. In contrast, sellers of shoes or houses are not restrained from producing nearly identical copies of their competitors' products."
Publishers' major customers, libraries, are facing massive budget cuts and are thus unlikely to be a major source of additional revenue:
"The Elsevier science and medical business ... saw modest growth reflecting a constrained customer budget environment."
The bundling model of the big publishers means that, in response to these cuts, libraries typically cancel their subscriptions to smaller and not-for-profit publishers, so that tough times increase the market dominance of the big publishers.

The business of academic publishing has been slower to encounter, but is not immune from, the disruption the Internet has wrought on other content industries. The combination of cash-strapped customers, publishers desperate for more revenue, and the Internet's effect of greatly reducing the costs of publishing means that disruption is inevitable. Fighting tooth and nail against this disruption, as the music business did, would be even more counter-productive in this case. The people the publishers would sue are the very people who create and review the content whose monetization the publishers would be defending.

To sum up, the advent of the Internet has greatly reduced the monetary value that can be extracted from academic content. Publishers who have depended on extracting this value face a crisis. The crisis is being delayed only by the inaction of universities and research funders; they have the power to insist on alternative models for access to the results of research, such as self-archiving, but have in most cases been reluctant to do so.

Software & Infrastructure Developers

A large and active movement is developing tools and network services intended to improve the effectiveness of research communication, and thus the productivity of researchers. These efforts are to a great extent hamstrung by two related problems, access to the flow of research communication, and the formats in which research is communicated.

Both problems can be illustrated by the example of mouse genetics. Researchers in the field need a database allowing them to search for experiments that have been performed on specific genes, and their results. The value of this service is such that it has been developed. However, because the format in which these experiments are reported is the traditional journal paper, this and similar databases are maintained by a whole class of scientists, generally post-Ph.D. biologists funded by the NIH, who curate genetic and genomic information from published papers into the databases. These expensive people are spending time on tasks that could in principle be automated, time they should instead be spending on research.

Automating this process would require providing software with access to the journal papers, replacing the access the curators get via their institutions' journal subscriptions. Unfortunately, these papers are under copyright, and the copyright is fragmented among a large number of publishers. The developers of such an automated system would have to negotiate individually with each publisher. If even a single publisher refused to permit access, the value of the automation would be greatly reduced, and human curators would still be needed.

Equally, because the mechanism that enforces compliance with the current system of research communication attaches value only to publications in traditional formats, vast human and machine efforts are required to extract the factual content of the communication from the traditional format. Were researchers to publish their content in formats better adapted to information technology, these costs could be avoided.
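To illustrate what "formats better adapted to information technology" might look like, here is a hypothetical sketch contrasting the free text a curator must read with a structured record that software could index directly. The gene, values, and field names are invented and follow no existing database's schema.

```python
# Hypothetical contrast between a free-text claim and a structured record.
# The gene, phenotype, values, and field names are invented for illustration
# and do not follow any particular database's schema.
import json

free_text = ("Homozygous knockout of GeneX in C57BL/6 mice produced a "
             "30% reduction in body weight at 8 weeks (n=12, p<0.01).")

structured_record = {
    "organism": "Mus musculus",
    "strain": "C57BL/6",
    "gene": "GeneX",
    "perturbation": "homozygous knockout",
    "phenotype": "body weight",
    "effect": {"direction": "decrease", "magnitude_percent": 30},
    "age_weeks": 8,
    "sample_size": 12,
    "p_value": 0.01,
    "source": {"doi": "10.1234/placeholder"},
}

# A curator must read and interpret free_text; software can query the
# structured record directly, e.g. "all knockouts affecting body weight".
print(json.dumps(structured_record, indent=2))
```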

Summary

Generalizing, we can say that improving the current system requires:
  • more information to be published,
  • in formats more suited to information technology,
  • less encumbered with intellectual property restrictions,
  • more cheaply,
  • with better discovery and aggregation mechanisms,
  • better quality metrics,
  • better mechanisms for improving quality,
  • and sustainably preserved for future scholars.
Note: edited 4 Sep 2011 to correct a typo and some links, Wiley's numbers edited 14 Oct 2011 to reflect analysis by Sami Kassab - see this blog post.

11 comments:

David. said...

The problems of the publishers' business models are a hot topic right now, especially in The Guardian. They featured George Monbiot weighing in and now Ben Goldacre commenting on his piece.

Dr. Frances Pinter (YouTube), the founder of Bloomsbury Press, asks how to make library budgets go further in the academic book market. Her suggestion is that a worldwide consortium of libraries should pay the fixed costs, and get in return an online Creative Commons licensed version. The publisher could sell print and e-books. The idea is interesting, but the problem is that the libraries paying the fixed costs don't actually end up owning anything. Combining this with the LOCKSS system, enabled by the Creative Commons license, would allow the libraries to add these books to their electronic collections.

The mismatch between the marketing power of Elsevier and the librarians is well illustrated by this YouTube video of librarians thanking Elsevier for hosting a party in Petco Park as part of the ALA meeting. They seem to be blissfully ignorant that every penny Elsevier is spending on the party comes straight out of their cash-strapped budgets.

Eric F. Van de Velde said...

Great post DSHR!
I also recently started blogging on this topic. The "investment" in site licenses is foolish and universities should end it. I am covering my perspective in a series of blogs at http://scitechsociety.blogspot.com
Currently, four blogs on this topic, working on more.
--Eric.

David. said...

The Economist reports on major misconduct in cancer research at Duke that was enabled by inadequate peer review. Despite both internal and external reviews, the problem was only detected by persistent and time-consuming attempts at replication at two other institutions. Just how time-consuming those attempts were is revealing:

"Dr McShane estimates she spent 300-400 hours reviewing the Duke work, while Drs Baggerly and Coombes estimate they have spent nearly 2,000 hours."

The replicators faced difficulties both in performing and publishing their work:

"He noted that in addition to a lack of unfettered access to the computer code and consistent raw data on which the work was based, journals that had readily published Dr Potti’s papers were reluctant to publish his letters critical of the work. Nature Medicine published one letter, with a rebuttal from the team at Duke, but rejected further comments when problems continued."

In the end, failure by at least two teams to replicate the results was not enough to discredit the work. That had to wait for Cancer Letter to notice obvious lies in documents and grant applications. One of the replicators commented:

"I find it ironic that we have been yelling for three years about the science, which has the potential to be very damaging to patients, but that was not what has started things rolling."

Even had they had the resources needed to characterize the problems, the internal and external review committees also faced other difficulties:

"the internal committees responsible for protecting patients and overseeing clinical trials lacked the expertise to review the complex, statistics-heavy methods and data produced by experiments involving gene expression."

"The review committee, however, had access only to material supplied by the researchers themselves, and was not presented with either the NCI’s exact concerns or the problems discovered by the team at the Anderson centre."

A board of the Institute of Medicine is investigating, but some points seem clear. Like the Wakefield case, the misconduct was driven by financial conflicts of interest:

"potential financial conflicts of interest declared by Dr Potti, Dr Nevins and other investigators, including involvement in Expression Analysis Inc and CancerGuide DX, two firms to which the university also had ties."

It was covered up by the fact that the raw data, and details of the methods, were not published. And it was prolonged by the refusal of journals to publish refutations of previous papers.

David. said...

I apologize for not also linking to the study by Diane Harley and Sophia Krzys Acord of Berkeley's Center for the Study of Higher Education entitled PEER REVIEW IN ACADEMIC PROMOTION AND PUBLISHING: ITS MEANING, LOCUS, AND FUTURE, which identifies many of the same problems as the House of Commons report. I believe this report's view of the effectiveness of peer review is somewhat complacent, but its coverage of alternative forms of peer review is comprehensive and useful.

David. said...

Those pushing ideological agendas find the "peer-reviewed" brand very useful when applied to articles that match their agenda, and very threatening when applied to articles that conflict with it. This is an expression both of the brand's value, which this post for obvious reasons under-plays, and of the risks posed by its current state of disrepair, if not outright corruption.

Tip of the hat to Slashdot.

David. said...

Researchers at Stanford just made a discovery and the DOE announced it:

"A group of scientists recently sandwiched two non-magnetic materials together and discovered a startling result: The layer where the two materials meet has both magnetic and superconducting regions—two properties that normally can’t co-exist. Technologists have long hoped to find a way to engineer magnetism in this class of materials, called complex oxides, as a first step in developing a potential new form of computing memory for storage and processing."

The immediate response on a mail list I'm on:

"We discovered a material that behaves this way back around 1987, with muon spin rotation, at Brookhaven National Laboratory. We weren't able to get it published in a top=rated journal because reviewers didn't believe the results."

You see these examples all the time.

David. said...

Not exactly about research communication as it's generally thought of, but Eric Hellman has a great post on disrupting the business model for textbooks which, if you think about it, is where research communication goes to die. He makes a powerful case for the advantages of his unglued e-books model in the textbook space.

David. said...

Dan Wallach has an interesting and well-argued proposal for transforming publication in Computer Science in Communications of the ACM. His discussion of the problems of the current system and the ways in which his proposal addresses them is particularly valuable, especially appearing in CACM which is a representative of the current system.

David. said...

The BMJ has published more of Brian Deer's research into the fraud behind the Wakefield MMR scare. It implicates Wakefield's co-authors in the fraud, and the BMJ is calling for an independent review.

Prof. Ingvar Bjarnason and colleagues and Prof. Karel Geboes examined the histology grading sheets that were the basis for a Wakefield et al paper that the American Journal of Gastroenterology retracted in May 2010. They found no evidence of disease except constipation.

David. said...

A very relevant analysis of this problem as it applies to neuroscience, and a proposal for improvement, is in a paper by Kravitz & Baker discussed in this comment. Here are the DOI link and the link to the abstract in Frontiers in Computational Neuroscience.

David. said...

Yves Smith links to an article in Nature discussing a paper in Science that reports a survey of over 6,000 respondents, providing evidence that journal editors are gaming their impact factors, as discussed above, by making self-citation a condition of acceptance.