Monday, March 12, 2012

What Is Peer Review For?

Below the fold is an edited version of a talk I prepared for some discussion of the future of research communication at Stanford, in which I build on my posts on What's Wrong With Research Communication? and What Problems Does Open Access Solve? to argue that the top priority for reform should be improving peer review.

What distinguishes academic publishing from other kinds of publishing on the Web? Peer review. But despite the changes in other aspects of academic publishing, there has been little change in peer review. This is probably because, to most academics, having their work successfully peer reviewed is a life-and-death matter, so changes to the system appear very threatening.

I'm an engineer. Peer reviewed publishing is important to engineers, but it is less important than in other fields. It is nice that the LOCKSS Program is the only digital preservation program to have spawned prize-winning peer reviewed papers, but it is far more important that the LOCKSS system is in production use at libraries world-wide, and that we are the only financially self-sustaining digital preservation program, having been cash flow positive for 5 years with no soft money.

Thus I'm in a better position than most to ask the fundamental question, what is peer review for? It is estimated to consume about $3B a year in unpaid work by academics. This is double the profits of Elsevier, Springer and Wiley combined, so it is likely that, were reviewers to be paid for their work, the entire industry would be economically unsustainable. But what do we get for this enormous expenditure? Here I should acknowledge a debt to the fine work of the UK House of Commons Science and Technology Committee, on which some of the following is based.

The purpose of peer review used to be clear: to prevent bad science being published. That is no longer possible; everything will be published, and the only question now is where. For example, PLoS ONE publishes every submission that meets its criteria of technical soundness. Nevertheless, over 40% of the papers it rejects as unsound are published in other peer reviewed journals.

So, is the purpose of peer review to ensure that journals maintain their quality standards, so that where a paper is published indicates its quality? If so, peer review is failing miserably. The graphs in this paper show that the dispersion in article quality, as measured by citation count, among the articles in a journal is so great that the journal conveys no information as to the article quality.

Is the purpose to improve the article's presentation of its science? Why would one imagine that peer scientists would be any better at English or graphic design than the authors?

Or is it to improve the scientific content of the article? If so, once again peer review is failing miserably. It has repeatedly proven to be incapable of detecting not merely rampant fraud, but even major errors.

Why is the peer review system such poor value for $3B/yr? Clearly, because the entire publishing system is designed to exploit the reviewers. They are not paid, they are not credited, they cannot develop a reputation for excellence, they are not given tools to assist their work, they are not given access to the data, software, materials, experimental setups and so on to be able to check the work. And in many cases they have severe, undeclared conflicts of interest hidden behind reviewer anonymity. Which, in any case, is mostly illusory. Most fields are now so specialized that the pool of reviewers is small enough that reviewer anonymity is ineffective. Most fields now have such good informal communication that blinding papers for review is also ineffective.

There have been experiments in moving from pre-publication to post-publication review. PLoS ONE's restriction of pre-publication review to establishing technical soundness, leaving quality to be determined by post-publication review, has been very successful in one sense, but it has not really transformed the system. Other experiments have mostly been failures. This is likely because of their tentative nature, and lack of a clear vision as to what reviewing is for.

Any attempt to reform the system of research communication should start by answering the question of what reviewing is for. The primary goal of review should be to improve the quality of the research, rather than to improve the quality of the publication. Given that goal, we are faced with the following questions:
  • Who should review?
  • How should they review?
  • What should the reward for reviewing be?
Who? Given the goal of improving the science, it seems clear that anyone with constructive criticisms should be able to provide review. This is inherent in the idea of post-publication review.

How? Just as reviewers' opinions as to the quality of an article vary, so will opinions as to the constructive nature of reviews. It won't be possible to prevent non-constructive comments being published, so the nature of comments will also have to be determined post-publication. This is a problem that many existing web-based communities, such as Reddit, Stack Overflow and Slashdot, have already solved using reputation systems linked to real or persistently pseudonymous identities. Possible reviewing system architectures are:
  • Per-publishing platform systems (cf. Slashdot comments) each with its own reputation system. Diverse reputation systems are not likely to be effective in enhancing careers.
  • Publisher-independent commenting and reputation systems (cf. Disqus comments). In practice there would be several competing systems, which would hamper their effectiveness.
  • A general Web annotation system with inter-operable reputation infrastructure allowing multiple reputation systems to exchange data. A prototype of the annotation part of such a system is available here.
What? Reputation systems are effective in rewarding constructive and discouraging disruptive participation, although they cannot be fool-proof. Linking them to real-world career rewards would make them much more effective. This requires giving up reviewer anonymity, which is in any case less effective than it once was.

The content to be reviewed will be much more mutable and executable than currently (cf. myexperiment). The advent of HTML5 means the Web is transitioning from a document model to a programming environment reliant on Web services. For post-publication review to be effective in improving the science:
  • Reviewers must have access to the content, the code and the data.
  • The content, the code and the data must evolve in response to the reviews.
  • Reviews that caused evolution must redound to the credit of the reviewers.
Systems built around a static or append-only content model cannot do this. Neither can systems whose pipeline contains single points of delay. Nor can systems whose goal is to restrict review to pre-selected reviewers.
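
To make the three requirements above concrete, here is a minimal sketch of a content model in which a work evolves through revisions and each revision records the reviews that prompted it. This is only an illustration under stated assumptions, not a design for any existing system: the names Work, Revision, Review and reviewer_credit, and the rule of one credit point per credited review, are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Review:
    reviewer: str  # real or persistently pseudonymous identity
    comment: str


@dataclass
class Revision:
    content: str
    # Reviews that caused this revision (empty for the initial version).
    prompted_by: list = field(default_factory=list)


@dataclass
class Work:
    """A published work (content, code or data) that evolves under review."""
    revisions: list = field(default_factory=list)
    reviews: list = field(default_factory=list)

    def add_review(self, reviewer, comment):
        # Anyone may review; the review is stored alongside the work.
        review = Review(reviewer, comment)
        self.reviews.append(review)
        return review

    def revise(self, content, prompted_by=()):
        # Record which reviews led to the change so credit can be traced.
        self.revisions.append(Revision(content, list(prompted_by)))

    def reviewer_credit(self):
        # Hypothetical rule: one point per review credited on a revision.
        credit = Counter()
        for revision in self.revisions:
            for review in revision.prompted_by:
                credit[review.reviewer] += 1
        return credit


work = Work()
work.revise("v1: initial analysis")
helpful = work.add_review("alice", "The control group is missing from table 2.")
work.add_review("bob", "Nice paper.")
work.revise("v2: control group added", prompted_by=[helpful])
print(dict(work.reviewer_credit()))  # prints {'alice': 1}
```

In this sketch only the review that led to a revision earns credit, which is the sense in which reviews that caused evolution redound to the reviewer. A real system would also need identity, access to code and data, and a way for the community to rate the reviews themselves.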

The developers of successful on-line communities such as Reddit all say that the route to success is to give the community the tools they need to create the system that matches their needs, rather than design and build the system you think they need. Supply the infrastructure, let the community define the experience.


John Mark Ockerbloom said...

The best online discussion forums I follow don't bother with any sort of complex, automated reputation system. They simply have good moderators, who set the tone and ground rules, remove disruptive or unproductive comments (or deprecate them in various ways), and help direct the conversation in useful ways. They often are also the ones who write or accept the original postings that are then made available for comments.

There's some overhead involved with this, but nothing that should be overwhelming. Generally speaking, it takes less effort to review papers than to write them, and in turn less effort to screen reviews/comments than to write them. Basically, the moderators correspond to the "editors" of the traditional system, just as commenters correspond to the "reviewers" of the traditional system.

Bjorn Roche said...

Why limit ourselves to pre/post publication? Many recent scientific frauds could have been caught if scientists and companies that funded the research were required to submit their hypotheses and methods before they even conducted the experiments. It is only after seeing the results that bad scientists jump through hoops to adjust interpretations, or leave out or change results, to make their failed experiments look better. Obviously, in the face of explicit fraud this won't solve the problem, but it's certainly another tool for suspicious reviewers.

Alex said...

"PLoS ONE publishes every submission that meets its criteria of technical soundness. Nevertheless, over 40% of the papers it rejects as unsound are published in other peer reviewed journals". Where does that 40% number come from? thanks

caseybergman said...

I'd be interested to know what evidence supports the claim that "over 40% of the papers [PLoS ONE] rejects as unsound are published in other peer reviewed journals"

Daniel Mietchen said...

In your last two paragraphs, you describe how the published content is getting more and more dynamic, which also means that there will be some sort of (or at least option for) review of the intermediate steps.

If the publishing of new research findings were then to be embedded into an encyclopedic context, a publication would effectively be just an update to a set of interrelated encyclopedic articles. As demonstrated in this talk (and later in its discussion), that would actually be easier to review than the 10-page items that we now routinely inject into the publishing system.

I am also working on a platform that would provide such an encyclopedic context to (BOAI-compatible) publishing - a brief sketch of the idea is here.

Mike Taylor said...

"PLoS ONE publishes every submission that meets its criteria of technical soundness. Nevertheless, over 40% of the papers it rejects as unsound are published in other peer reviewed journals."

That is a fascinating statistic! What is its source?

Mike Taylor said...

Sorry, ignore that question. I found it on page 138 of the PDF you referenced.

Irene Hames said...

"Here I should acknowledge a debt to the fine work of the UK House of Commons Science and Technology Committee, on which some of the following is based."

Hi, glad that you found the report useful (declaration: I was the specialist adviser to the inquiry/report). Can I just clarify the following from your post:

"... PLoS ONE publishes every submission that meets its criteria of technical soundness. Nevertheless, over 40% of the papers it rejects as unsound are published in other peer reviewed journals."

My understanding is that the 40% of rejected papers referred to weren’t all rejected by PLoS ONE for being unsound. They don't end up being published in PLoS ONE for a number of reasons. I've copied below the relevant two questions (164 and 165, 23 May 2011 session) from the UK House of Commons Science and Technology Committee report ‘Peer Review in Scientific Publications’ (link in your post above) and the responses from Mark Patterson, who was at that time the Director of Publishing at PLoS.

"Q164 Roger Williams: Thank you very much. These questions are directed very much to Dr Patterson but not solely to him if others want to come in. I think your journal [PLoS ONE] publishes 69% of all submitted articles. Does that mean the other 31% are technically unsound?
Dr Patterson: You are correct that it is about 69%, but that doesn't really mean we reject the other 31%. Some of them are "lost" in the sense that they may be sent back for revision—maybe 5% to 10% are sent back for revision[1]—and the others are rejected, as they should be, on the grounds that they don't satisfy technical requirements. We have done some work to look at the fate of those manuscripts. We did some author research in the last couple of years and we have seen that, in both cases, according to the authors' responses, about 40% of rejected manuscripts have been accepted for publication in another journal. There are probably several reasons for that. One is that some of them will have been rejected by PLoS ONE because they are hypotheses or perspectives and are out of scope, or something like that. We publish original research in PLoS ONE, so that is fair enough. They end up being published somewhere because there are appropriate venues. Other authors may have gone away and chanced their arm at another journal and got through their peer review process.

Q165 Roger Williams: Is that without being refined?
Dr Patterson: That we don't know. They may have been revised. As the PLoS ONE process isn't perfect, another chunk will have been rejected inappropriately. We know there are some such articles. The academic world reviewers tend to get in the mode of peer review but we are doing something different and we have to try to get that message across. So there will be a small batch that is rejected inappropriately.
[1] Note by witness: ... and are not resubmitted. This is an important clarification because the vast majority of articles are sent for revision and are ultimately resubmitted."

The evidence as given is slightly ambiguous. Maybe someone from PLoS ONE can step in and provide clarification or updated stats?

David. said...

Irene, thank you. Blogging time is scarce for me right now but I will try to get a clarification from my contacts at PLoS.

While the details may be open to clarification, it seems that the big picture is correct: the real question is not whether an article will get published, but where it will be published. And if it is, then the problem of effective research communication is a search engine optimization problem. In other words, the problem is not, as in the past, to prevent bad research being published, but rather to help both humans and, more importantly, search engines understand the quality of the published research they are reading.

Binfield said...

There seem to be some misunderstandings and concerns about the statement that “about 40% of rejected manuscripts have been accepted for publication in another journal”. The worry, of course, is that papers have been rejected from PLoS ONE as being unpublishable but end up getting published anyway. As the publisher of PLoS ONE, I would like to clarify.

The data referred to originated from our annual survey of PLoS ONE authors. Specifically, in June 2011 we asked those authors whom we had rejected throughout 2010 what the status was of their rejected paper. We had 411 respondents to this question. 183 (44%) of them told us that their paper had been accepted by another journal (and this was a similar percentage to prior years). In that survey we did not go on to ask further questions on this topic. We did not, for example, ask whether it was the identical paper which went on to be published; or whether the authors felt that the initial rejection was fairly made against our criteria.

Thus, of the approximately 31% of authors who submitted to PLoS ONE in 2010 and had been rejected by the time of the survey, approximately 44% went on to be published elsewhere, i.e. approximately 14% of our total submissions.
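
As a quick check of that arithmetic, here is a minimal sketch using only the figures quoted above (about 31% rejected, and 183 of 411 survey respondents accepted elsewhere). The variable names are illustrative, and the calculation assumes the survey respondents are representative of all rejected authors, which the survey itself does not establish.

```python
# Figures quoted in the comment above; variable names are illustrative.
rejected_share = 0.31      # approx. share of 2010 submissions rejected by survey time
respondents = 411          # rejected authors who answered the survey question
accepted_elsewhere = 183   # respondents whose paper was accepted by another journal

# Share of surveyed rejected authors whose paper was later accepted elsewhere.
accepted_rate = accepted_elsewhere / respondents

# Share of all submissions that were rejected and then accepted elsewhere,
# ASSUMING the respondents are representative of all rejected authors.
share_of_all_submissions = rejected_share * accepted_rate

print(f"{accepted_rate:.1%} of surveyed rejected authors")    # prints 44.5% ...
print(f"{share_of_all_submissions:.1%} of total submissions")  # prints 13.8% ...
```

The second figure rounds to the roughly 14% quoted above.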

Submissions might be rejected from PLoS ONE for a number of reasons, only one of which might be that they are deemed unpublishable.

For example, they may be rejected because they are article types that we do not publish (e.g. they might be opinion pieces, or review articles); or they might simply be out of scope (e.g. they might be in the humanities or the social sciences). It is entirely reasonable that articles in these 2 categories might end up being published elsewhere.

Another category of rejection includes papers which might need more work to become publishable. There are several reasons this might happen – for example the English language might have been too poor; or additional data analysis might have been required; or additional controls might have been needed to make the conclusions robust etc. It is entirely possible that authors who received a rejection on those grounds chose to do the work required to make the article publishable and then submitted the paper to another title (this would be the natural behaviour for those people as, after all, they had been rejected by PLoS ONE and we did not invite them to revise the work and resubmit it to us).

Another category of papers includes papers which are inappropriately rejected from PLoS ONE. No system is perfect, and it is entirely possible that some of our rejection decisions were not correct. In these situations, again, it is reasonable for that class of paper to then go on to be accepted elsewhere.

Finally, there is of course the category of paper which PLoS ONE deemed unpublishable due to some serious concern about the quality of the study, which was not then revised (or corrected) by the authors, and which went on to be published in another journal. We do not know the proportion of the 44% which this category represents; however, we are in the middle of a project to find this out on a representative sample.