Science has always been “a flawed, profoundly contingent, culturally relative, and all-too-human process.”
A paper published in Science has found that the results of only a third of nearly 100 studies published in top psychology journals could be replicated in repeat experiments. What’s surprising about this is that it has taken this long for the problem to come to light. These results remind us that the scientific method—often idealised as a sleek engine for turning observations into dependable facts and theories—is just an ad hoc collection of procedures dependent on fallible human sensation and reason.
The new study was conducted by 270 scientists working in psychology, the social sciences and related areas: a collaboration called the Reproducibility Project, led by the psychologist Brian Nosek of the University of Virginia. Nosek heads an institute called the Center for Open Science in Charlottesville, which aims to improve the way science is done—not least, to ensure that what it reports is more reliable. The researchers looked at 98 studies selected from three journals, and conducted experiments to see whether the main claims those studies made about human behaviour stood up to scrutiny, using methods as close as possible to those in the original papers. Whereas 97 per cent of the original papers reported a significant effect of some sort—say, a difference in the way children and adults respond to fear—only 36 per cent of the repeat experiments found the same effect at a statistically significant level.
Those differences could have various explanations. Perhaps the original work was simply badly conducted—the data could have been wrongly analysed, for example. Or the original result might have been genuine but spurious, a “false positive” produced by chance in a group of subjects too small to give reliable findings. Or the two sets of experiments might have differed in ways that went unnoticed (although the repeat experiments generally involved collaboration with the researchers who did the original study). Or the repeat experiment might itself be flawed. Or something else.
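The "false positive" scenario is worth pausing on, because it is a matter of simple arithmetic rather than misconduct. A quick simulation (a hypothetical sketch, not drawn from any of the studies discussed here) shows that when there is no real effect at all, the conventional p < 0.05 threshold still flags a "significant" difference in roughly one comparison in twenty, purely by chance:

```python
import math
import random

def false_positive_rate(n_per_group=20, trials=10_000, z_crit=1.96, seed=1):
    """Simulate repeated two-group comparisons where the true effect is zero.

    Each trial draws two groups from the same normal distribution and applies
    a z-test with known unit variance; |z| > 1.96 corresponds to the
    conventional p < 0.05 significance threshold.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        diff = sum(a) / n_per_group - sum(b) / n_per_group
        # Standard error of a difference of two means, each with variance 1/n
        z = diff / math.sqrt(2 / n_per_group)
        if abs(z) > z_crit:
            hits += 1
    return hits / trials

print(false_positive_rate())  # roughly 0.05: one spurious "discovery" in twenty
```

Run enough small studies and some will clear the significance bar by luck alone; those are exactly the results a repeat experiment would fail to find.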
What the finding does mean is that we have to view the psychology literature with caution; on this evidence, over half of what it contains may well be wrong. (The joke doing the rounds, inevitably, is whether this study too will be reproducible.) Some have taken the opportunity to dismiss psychology itself as a pseudo-science—as mere “psychobabble”—but Nosek doubts that its reproducibility problems are unique to the field. Retractions of papers, as problems come to light after publication, are now rather common in many fields, including the “harder” sciences: the life sciences are particularly prone, and one analysis suggested that reproducibility is shockingly low in cancer biology, for example.
The challenges of replication have long been apparent to philosophers of science; it’s just that many scientists have preferred to ignore them. I spoke to Nosek several months ago about the problem of cognitive bias in science: the tendency, which we all share in everyday life, to unconsciously seek out “facts” that confirm our preconceptions. Psychologists (oh the irony) have shown that “most of our reasoning is in fact rationalisation,” Nosek told me. Science tries hard to be more objective, but it’s up against some deeply ingrained habits.
Not all of the problems of “reproducibility” are due to scientists fooling themselves about their data, though. Reproducibility isn’t valued very highly in science: everyone agrees that it’s important, but no one is keen to repeat a study that someone else has already done, because doing so earns no kudos (and, in all likelihood, no funding), and top journals won’t want to publish the “second” discovery of anything in any case.
But partly, it’s just that science is hard. It’s really tough to get many experiments to work the same way twice. Biologists have long known that the messy, sensitive nature of living things poses problems for replication, and sometimes they need to get in touch with the people who did the original research before they can get the same method to work: what’s reported in the paper itself isn’t enough. I’ve spoken to top chemists who doubt that much of what gets published in their journals, too, could be reproduced easily, if at all. Scientists have always known this; now they have to start talking about it.
This doesn’t mean we can’t trust science. But it does mean we mustn’t be naïve about it (or encourage non-scientists to be naïve about it). Figuring stuff out is really tough. Science can be proud that it does so at all. As science historian David Wootton has said in a recent survey of its 17th-century origins, the task of his peers is “to understand how reliable knowledge and scientific progress can and do result from a flawed, profoundly contingent, culturally relative, and all-too-human process.” Scientists themselves would benefit from understanding that, too.