The fundamental problem is that all participants have bad incentives. Follow me below the fold for some recent examples that illustrate their corrupting effects.
Publishers tend to choose reviewers who are prominent and in the mainstream of their subject area. This hands the field's establishment a powerful mechanism for warding off threats to its conventional wisdom. Ian Leslie's The Sugar Conspiracy is a long and detailed examination of how prominent nutritionists used this and other mechanisms to suppress for four decades the evidence that sugar, not fat, was the cause of obesity. The result was illustrious careers for the senior scientists, wrecked lives for the dissidents and, most importantly, a massive, world-wide toll of disease, disability and death. I'm not quoting any of Leslie's article because you have to read the whole of it to understand the disaster that occurred.
At Science Translational Medicine, Derek Lowe's From the Far Corner of the Basement has more on this story, with a link to the paper in BMJ that re-evaluated the data from the original, never fully published study:
It’s impossible to know for sure, but it seems likely that Frantz and Keys may have ended up regarding this as a failed study, a great deal of time and effort more or less wasted. After all, the results it produced were so screwy: inverse correlation with low cholesterol and mortality? No benefit with vegetable oils? No, there must have been something wrong.

Dahlia Lithwick's Pseudoscience in the Witness Box, based on a Washington Post story, describes another long-running disaster based on bogus science. The bad incentives in this case were that the FBI's forensic scientists were motivated to convict rather than exonerate defendants:
This study was launched after the Post reported that flawed forensic hair matches might have led to possibly hundreds of wrongful convictions for rape, murder, and other violent crimes, dating back at least to the 1970s. In 90 percent of the cases reviewed so far, forensic examiners evidently made statements beyond the bounds of proper science. There were no scientifically accepted standards for forensic testing, yet FBI experts routinely and almost unvaryingly testified, according to the Post, “to the near-certainty of ‘matches’ of crime-scene hairs to defendants, backing their claims by citing incomplete or misleading statistics drawn from their case work.”

The death toll is much smaller:
"the cases include those of 32 defendants sentenced to death.” Of these defendants, 14 have already been executed or died in prison.Via Dave Farber's IP list and Pascal-Emmanuel Gobry at The Week I find William A. Wilson's Scientific Regress.Wilson starts from the now well-known fact that many published results are neither replicated nor possible to replicate, because the incentives to publish in a form that can be replicated, and to replicate published results, are lacking:
suppose that three groups of researchers are studying a phenomenon, and when all the data are analyzed, one group announces that it has discovered a connection, but the other two find nothing of note. Assuming that all the tests involved have a high statistical power, the lone positive finding is almost certainly the spurious one. However, when it comes time to report these findings, what happens? The teams that found a negative result may not even bother to write up their non-discovery. After all, a report that a fanciful connection probably isn’t true is not the stuff of which scientific prizes, grant money, and tenure decisions are made.
And even if they did write it up, it probably wouldn’t be accepted for publication. Journals are in competition with one another for attention and “impact factor,” and are always more eager to report a new, exciting finding than a killjoy failure to find an association. In fact, both of these effects can be quantified. Since the majority of all investigated hypotheses are false, if positive and negative evidence were written up and accepted for publication in equal proportions, then the majority of articles in scientific journals should report no findings. When tallies are actually made, though, the precise opposite turns out to be true: Nearly every published scientific article reports the presence of an association. There must be massive bias at work.
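Wilson's argument is easy to make concrete. Below is a minimal Monte Carlo sketch (my illustration, not Wilson's; the base rate of true hypotheses, the statistical power and the false-positive rate are assumed values chosen for the example) of a literature in which only positive results get written up:

```python
import random

# Toy model of publication bias. All numbers are assumptions for
# illustration: 10% of investigated hypotheses are true, tests have
# 80% power and a 5% false-positive rate, and only positive results
# are written up and published.
random.seed(42)
N_HYPOTHESES = 100_000
P_TRUE = 0.10      # base rate of true hypotheses
POWER = 0.80       # P(positive result | hypothesis is true)
ALPHA = 0.05       # P(positive result | hypothesis is false)

published_true = published_false = 0
for _ in range(N_HYPOTHESES):
    is_true = random.random() < P_TRUE
    positive = random.random() < (POWER if is_true else ALPHA)
    if positive:   # negative results never reach a journal
        if is_true:
            published_true += 1
        else:
            published_false += 1

total = published_true + published_false
print(f"published positive findings: {total}")
print(f"spurious share of the published literature: "
      f"{published_false / total:.0%}")
```

Every paper in this toy literature reports an association, and with these inputs roughly a third of them are spurious; a lower base rate of true hypotheses, or lower power, pushes the spurious share higher still.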
He points out the ramifications of this problem:
If peer review is good at anything, it appears to be keeping unpopular ideas from being published. Consider the finding of another (yes, another) of these replicability studies, this time from a group of cancer researchers. In addition to reaching the now unsurprising conclusion that only a dismal 11 percent of the preclinical cancer research they examined could be validated after the fact, the authors identified another horrifying pattern: The “bad” papers that failed to replicate were, on average, cited far more often than the papers that did! As the authors put it, “some non-reproducible preclinical papers had spawned an entire field, with hundreds of secondary publications that expanded on elements of the original observation, but did not actually seek to confirm or falsify its fundamental basis.”

And, as illustrated by The Sugar Conspiracy, this is a self-perpetuating process:
What they do not mention is that once an entire field has been created—with careers, funding, appointments, and prestige all premised upon an experimental result which was utterly false due either to fraud or to plain bad luck—pointing this fact out is not likely to be very popular. Peer review switches from merely useless to actively harmful. It may be ineffective at keeping papers with analytic or methodological flaws from being published, but it can be deadly effective at suppressing criticism of a dominant research paradigm. Even if a critic is able to get his work published, pointing out that the house you’ve built together is situated over a chasm will not endear him to his colleagues or, more importantly, to his mentors and patrons.

Science is supposed to provide a self-correcting mechanism to handle this problem, and The Sugar Conspiracy actually shows that in the end it works, but:
even if self-correction does occur and theories move strictly along a lifecycle from less to more accurate, what if the unremitting flood of new, mostly false, results pours in faster? Too fast for the sclerotic, compromised truth-discerning mechanisms of science to operate? The result could be a growing body of true theories completely overwhelmed by an ever-larger thicket of baseless theories, such that the proportion of true scientific beliefs shrinks even while the absolute number of them continues to rise.

The four-decade reign of the fat hypothesis illustrates this problem.
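The dynamics Wilson describes are just a race between two growth rates. A back-of-the-envelope sketch (my own, with invented numbers) shows how the absolute count of true results can keep rising while their share of the literature collapses:

```python
# Toy model, with made-up numbers, of Wilson's "flood" scenario:
# sound results accumulate at a steady rate while false results
# pour in at a rate that grows every year.
true_total, false_total = 100, 100
TRUE_PER_YEAR = 50          # steady output of sound results
false_per_year = 50.0
FALSE_GROWTH = 1.15         # false output grows 15% per year

for year in range(1, 41):
    true_total += TRUE_PER_YEAR
    false_total += int(false_per_year)
    false_per_year *= FALSE_GROWTH
    if year % 10 == 0:
        share = true_total / (true_total + false_total)
        print(f"year {year:2d}: true results {true_total:6,}  "
              f"share of literature {share:.0%}")
```

With these numbers the stock of true results quadruples over forty years, yet its share of the literature falls from roughly a third to a few percent: more truth in absolute terms, ever harder to find.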
In The Prevalence of Inappropriate Image Duplication in Biomedical Research Publications, Elisabeth Bik, Arturo Casadevall and Ferric Fang report:
This study attempted to determine the percentage of published papers containing inappropriate image duplication, a specific type of inaccurate data. The images from a total of 20,621 papers in 40 scientific journals from 1995-2014 were visually screened. Overall, 3.8% of published papers contained problematic figures, with at least half exhibiting features suggestive of deliberate manipulation. The prevalence of papers with problematic images rose markedly during the past decade. Additional papers written by authors of papers with problematic images had an increased likelihood of containing problematic images as well. As this analysis focused only on one type of data, it is likely that the actual prevalence of inaccurate data in the published literature is higher. The marked variation in the frequency of problematic images among journals suggests that journal practices, such as pre-publication image screening, influence the quality of the scientific literature.

At least this is one instance in which some journals are adding value. But let's look at the set of journal value-adds Marcia McNutt, the editor-in-chief of Science, cites in her editorial attacking Sci-Hub (quotes in italics):
- [Journals] help ensure accuracy, consistency, and clarity in scientific communication. If only. Many years ago, the peer-reviewed research on peer review showed conclusively that only the most selective journals (such as McNutt's Science) add any detectable value to their articles. And that is before adjusting for the value their higher retraction rate subtracts.
- editors are paid professionals who carefully curate the journal content to bring readers an important and exciting array of discoveries. This is in fact a negative. The drive to publish and hype eye-catching, "sexy" results ahead of the competition is why top journals have a higher rate of retraction. Competing on the bogus "impact factor" metric, which is easily gamed, leads to many abuses. But more fundamentally, any ranking of journals, as opposed to the papers they publish, is harmful.
- They make sure that papers are complete and conform to standards of quality, transparency, openness, and integrity. Clearly, if the result is a higher rate of retraction, the claim that papers conform to these standards is bogus.
- There are layers of effort by copyeditors and proofreaders to check for adherence to standards in scientific usage of terms to prevent confusion. This is a task that can easily be automated; we don't need to pay layers of humans to do it (see the sketch after this list).
- Illustrators create original illustrations, diagrams, and charts to help convey complex messages. Great, the world is paying the publishers many billions of dollars a year for pretty pictures?
- Scientific communicators spread the word to top media outlets so that authors get excellent coverage and readers do not miss important discoveries. And the communicators aren't telling the top media outlets that the "important discoveries" are likely to get retracted in a few years.
- Our news reporters are constantly searching the globe for issues and events of interest to the research and nonscience communities. So these journals are just insanely expensive versions of the New York Times?
- Our agile Internet technology department continually evolves the website, so that authors can submit their manuscripts and readers can access the journals more conveniently. Even if we accept the ease-of-submission argument, the ease-of-access argument is demolished by, among others, Justin Peters and John Dupuis. It's obviously bogus; the whole reason people use Sci-Hub is that it provides more convenient access! Also, let's not forget that the "Internet technology department" spends most of its effort the way other Web media do: monetizing readers and contributing to the Web obesity crisis. Eric Hellman's study, 16 of the top 20 Research Journals Let Ad Networks Spy on Their Readers, gave Science a D because:
10 Trackers. Multiple advertising networks.

To be fair, Eric also points out that Sci-Hub uses trackers and Library Genesis sells Google ads too.
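Returning to the copyediting bullet above: here is a minimal sketch (my example, with a hypothetical style guide; not any publisher's actual toolchain) of automating the "consistent terminology" check those layers of humans perform:

```python
import re

# Hypothetical style guide: preferred spelling -> variants to flag.
STYLE_GUIDE = {
    "data set": ["dataset", "data-set"],
    "E. coli": ["E.coli", "e. coli"],
    "p value": ["p-value", "P value"],
}

def check_usage(text: str) -> list[str]:
    """Flag style-guide violations in a manuscript."""
    problems = []
    for preferred, variants in STYLE_GUIDE.items():
        for variant in variants:
            for match in re.finditer(re.escape(variant), text):
                problems.append(
                    f"'{variant}' at offset {match.start()}: "
                    f"prefer '{preferred}'")
    return problems

manuscript = "We built a dataset of E.coli strains; the P value was 0.03."
for problem in check_usage(manuscript):
    print(problem)
```

A production pipeline would need word-boundary handling and a much larger glossary, but the point stands: this layer of checking is mechanical.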
a trend publishers themselves started many years ago of stretching the "peer reviewed" brand by proliferating journals. If your role is to act as a gatekeeper for the literature database, you better be good at being a gatekeeper. Opening the gate so wide that anything can get published somewhere is not being a good gatekeeper.

The wonderful thing about Elsevier's triggering of the Streisand Effect is that it has compelled even Science to advertise Sci-Hub, and to expose the flimsy justification for the exorbitant profits of the major publishers.