There’s no excuse for the shoddy practice of allowing researchers to switch outcomes and goals without saying so, says Ben Goldacre.
Science is in flux. The basics of a rigorous scientific method were formulated long ago, but there is now growing concern about systematic structural flaws: selective publication, inadequate description of study methods that blocks replication efforts, and the unannounced use of multiple analytical strategies, otherwise known as data dredging. Such problems undermine the integrity of published data and increase the risk of exaggerated or false-positive findings, and collectively they have led to talk of a ‘replication crisis’.
With academic papers documenting the scale of these problems, we have seen a rise in ‘technological activism’: groups building tools and services to help find solutions. These include the Reproducibility Project, which crowdsources the work of replicating hundreds of published psychology papers, and Registered Reports, in which researchers specify their methods and analytical strategy before starting a study.
These initiatives can generate conflict, because they hold individuals to account. Most researchers maintain a public posture that science is built on healthy, mutual critical appraisal. But when you repeat someone’s methods and get inconsistent results, there is an inevitable risk of friction.
Our team at the Centre for Evidence-Based Medicine at the University of Oxford, UK, is now facing the same challenge. We are targeting the problem of selective outcome reporting in clinical trials.
Before a trial begins, the researchers running it are expected to declare publicly which measurements they will take to assess the relative benefits of the treatments being compared. This is long-standing best practice, because an outcome such as ‘heart health’ can be measured in many ways. Researchers are therefore expected to list, for example, the specific blood tests and symptom-rating scales they will use, the time points at which measurements will be taken, and any cut-off values they will apply to convert continuous data into categorical variables.
All this is done to prevent researchers from ‘data-dredging’ their results. If researchers switch away from these pre-specified outcomes without declaring that they have done so, they break the assumptions of their statistical tests. That carries a significant risk of exaggerated or outright false findings, which in turn helps to explain why so many trial results eventually turn out to be wrong.
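To see why this is a statistical problem rather than a bureaucratic formality, consider a minimal simulation sketch (our illustration only, not part of the COMPare methodology; the trial sizes and outcome counts are assumptions). It models trials with no true treatment effect that measure ten candidate outcomes and then report whichever looks best:

```python
# Minimal sketch: why undeclared outcome-switching inflates false positives.
# Assumption: each simulated trial has NO true treatment effect, measures
# ten candidate outcomes, and reports only the most favorable one.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_trials = 2000    # simulated trials, all with no real effect
n_patients = 100   # patients per arm
n_outcomes = 10    # candidate outcomes measured in each trial

false_positives = 0
for _ in range(n_trials):
    p_values = []
    for _ in range(n_outcomes):
        treatment = rng.normal(0, 1, n_patients)  # both arms drawn from the
        control = rng.normal(0, 1, n_patients)    # same distribution
        _, p = stats.ttest_ind(treatment, control)
        p_values.append(p)
    # Outcome-switching: report only the single most favorable outcome
    if min(p_values) < 0.05:
        false_positives += 1

print(f"False-positive rate with outcome-switching: "
      f"{false_positives / n_trials:.0%}")
```

With ten outcomes and a 5% significance threshold per test, roughly 1 − 0.95¹⁰ ≈ 40% of these null trials report a ‘significant’ result, against the nominal 5% that a single pre-specified outcome would give.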
You might think that this problem is so obvious that it would already be competently managed by researchers and journals. That is not the case. Time and again, academic papers have shown that outcome-switching is highly prevalent, and that the switched-in outcomes are more likely to be favorable and statistically significant.
This is despite several codes of conduct established to prevent such switching, notably the widely respected CONSORT guidelines, which require the reporting of all pre-specified outcomes and an explanation for any changes. Almost all major medical journals endorse these guidelines, and yet we know that undeclared outcome-switching continues.
Auditing and accountability are the bread and butter of good medicine and good science.
In an attempt to fix this problem, our group has taken a new approach. Since last October, we have been checking the outcomes reported in every trial published in five top medical journals against the outcomes pre-specified in registry entries or protocols. Most trials had discrepancies, many of them substantial.
Then, crucially, we submitted a correction letter to the journal in question for every trial that misreported its outcomes. (All of our raw data, methods and correspondence with journals are available on our website at compare-trials.org.)
We expected journals to take these discrepancies seriously, because trial results are used by clinicians, researchers and patients to make informed decisions about treatment. Instead, we have seen a wide range of reactions. Some journals have demonstrated best practice: the BMJ, for example, promptly published a correction to a misreported trial within days of receiving our letter.
Other journals have not followed the BMJ’s lead. The editors of Annals of Internal Medicine, for example, have responded to our correction letters with an unsigned rebuttal that, in our view, raises serious questions about their commitment to managing outcome-switching.