Expert Voices

Does 'Failure to Replicate' Mean Failed Science? (Op-Ed)

A pipette and test tubes. (Image credit: Anton Prado PHOTO, Shutterstock)

David Funder, a psychology professor at the University of California, Riverside, is president of the Society for Social and Personality Psychology. He contributed this article to LiveScience's Expert Voices: Op-Ed & Insights.

A lot of scientists wear worried frowns these days. Science seems under attack from numerous directions. Some of the attackers are wearily familiar. Peddlers of dogma have been sworn enemies of science since the Dark Ages. People whose political beliefs are challenged by research seek to shut it down. And nobody is much surprised when scientists whose findings threaten the basis of a person's or corporation's wealth find themselves facing well-financed opposition and even personal attacks. Scientists who study astronomy, evolution, discrimination and global warming — to name a few — are used to this situation, and while they surely don't enjoy it, it's really nothing new.

However, scientists now have something else to worry about. The very foundation of science is suddenly being brought into question. The issue concerns "replicability," the assumption that valid scientific studies can be repeated by anybody with the necessary skills and will yield the same results.

In 2005, the distinguished medical researcher John Ioannidis wrote an article titled "Why Most Published Research Findings Are False," and its publication seemed to mark some kind of turning point. In the years since, serious concerns about the trustworthiness of research findings have been voiced in major journals and at professional meetings of fields as diverse as medicine, physics, cell biology, economics, and my own field, social psychology.

Across all of those disciplines, the concern has been the same: Findings garnered in one lab, sometimes important and famous findings, have turned out to be difficult, if not impossible, to reproduce anywhere else. When that happens, it's called a "failure to replicate" — a phrase that strikes a chill in the heart of any scientist who hears it.

Why do findings sometimes fail to replicate? There are many possible reasons. In a few cases — which have become infamous — researchers committed fraud and literally made up their data. One of the most notorious instances involved Dutch psychologist Diederik Stapel, the subject of a recent profile in The New York Times, who fabricated data for dozens of studies over a period of years. Other cases of data fraud have been reported recently in oncology, genetics and even dentistry.

But while these egregious cases justly cause widespread alarm, focusing too tightly on them can be misleading. Such fraud is actually rare, and the typical reasons for failures to replicate are different. To list just a few: the replication study might not follow exactly the same methods as the original, or the new investigators may lack the skills needed to repeat a complex experimental procedure; the finding in question might have undiscovered "moderator variables," factors that make it stronger or make it disappear; or the original finding might simply have been a "lucky" accident.

The mechanisms of nature are complicated, sometimes almost chaotic. Scientists work hard to find signal amid all that noise, and when they think they have found something, they are eager to report it to their colleagues and to the world. In some cases, they might be a little too eager. After all, research dollars, reputations and careers are all on the line, and it would be surprising indeed if these incentives did not lead scientists — who are as human as anyone else — to do what they can to convince themselves and their colleagues that they have found something important.

Because these pressures are, at root, matters of human behavior (psychology's own subject matter), it is only natural that psychology is leading the way in dealing with replicability issues and in developing prescriptions for improvement that are relevant to all areas of science. Special articles, and even complete special issues, with specific recommendations have recently been published by Perspectives on Psychological Science, Psychological Inquiry and the European Journal of Personality. Social psychologist Brian Nosek and his colleagues have launched the Open Science Framework, an online platform that makes it easier for researchers to share methods and data. And recently, a task force of the Society for Personality and Social Psychology formulated further recommendations to improve the conduct and reporting of research, and to reconsider the incentives that shape the behavior of research scientists.

The recommendations are numerous, and some are rather technical (involving, for example, new statistical standards). But the recommendation that might be the most important is also the simplest: Do more research.

Because nature is complicated and reliable findings are hard to come by, we need to examine it with more powerful methods. For astronomy, this might mean a bigger telescope; for microbiology, a stronger microscope. For all fields of science, including psychology, it means simply more data.

Studies need to get bigger. Small studies are useful for trying out new ideas, but only replications can sort genuine discoveries from false starts, and replication studies need to be large to be conclusive. A finding based on 100 rats will be more reliable than one based on 10; a treatment outcome evaluated with 1,000 patients will be more reliably assessed than one that looks at only 100; and, in general, the larger the number of research subjects in a study, the more reliable the finding.
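One way to see why, assuming the standard textbook formula for the standard error of a mean: the precision of an estimated effect improves only with the square root of the number of subjects,

\[
\mathrm{SE} = \frac{\sigma}{\sqrt{n}},
\]

where \(\sigma\) is the variability among individual measurements and \(n\) is the number of subjects. Moving from 10 rats to 100, or from 100 patients to 1,000, shrinks the standard error by a factor of \(\sqrt{10} \approx 3.2\), so tenfold-larger studies give noticeably steadier answers, though each additional subject buys a little less precision than the last.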

But big studies are expensive and time-consuming. The typical scientist works under conditions of scarce resources and intense time pressure, and replication studies are not conducted or reported as often as they should be. Changing this state of affairs will require some behavior change by some scientists — a challenge we in social psychology are eager to tackle — but also more resources. Specific replication studies may be deemed successes or failures, but firm conclusions emerge only over time. What matters most is that scientists continue to work hard to determine which exciting preliminary findings stand up under repeated research.

The views expressed are those of the author and do not necessarily reflect the views of the publisher. This article was originally published on LiveScience.com.
