Defining Reproducibility in Clinical Trials (Challenges and Opportunities)

Michael Kane

Yale University


Lack of reproducibility is currently a big problem in clinical trials.


The most significant obstacles to adoption are social.


Reproducible research provides better trials and new opportunities to produce findings more quickly.

How big a problem is lack of reproducibility?

S. Ebrahim et al. (2014) put an upper bound of 35% on the irreproducibility of the final published analysis, based on a set of 37 reanalyses.




- "Reanalyses of randomized clinical trial data." JAMA

But it's worse than that...

\begin{aligned} \mathbb{P} [ & \text{irreproducible trial}] \\ & \leq 1 - \mathbb{P} [\text{reproducible study design}] \cdot \mathbb{P}[\text{reproducible data analysis}] \\ & = 1-(1-0.35)^2 \\ & \approx 0.58. \end{aligned}

A trial consists of a study design and data analysis...
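
The arithmetic behind this bound can be checked with a minimal sketch. It assumes, as the slide does, that the study design and the data analysis are independent and that each is irreproducible with probability at most 0.35. The function name `irreproducible_bound` and its parameters are hypothetical, introduced only for illustration.

```python
def irreproducible_bound(p_design: float, p_analysis: float) -> float:
    """Bound on P[irreproducible trial].

    p_design and p_analysis are the probabilities that the study design
    and the data analysis, respectively, are irreproducible. Independence
    of the two components is an assumption, not an established fact.
    """
    for p in (p_design, p_analysis):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # 1 - P[reproducible design] * P[reproducible analysis]
    return 1.0 - (1.0 - p_design) * (1.0 - p_analysis)


# Both components use the 35% figure, as on the slide.
bound = irreproducible_bound(0.35, 0.35)
print(f"{bound:.4f}")  # 0.5775
```

The bound is only as good as the 35% input: a different rate for either component changes the result accordingly.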

and even worse than that...

"We completed an electronic search of MEDLINE from inception to March 9, 2014, to identify all published studies that completed a reanalysis of individual patient data from previously published RCTs addressing the same hypothesis as the original RCT."


"We identified 37 eligible reanalyses in 36 published articles, 5 of which were performed by entirely independent authors."

and even worse than that...

"2 [studies were] based on publicly available data and 2 on data that were provided on request; data availability was unclear for 1."

and even worse than that...

The supplemental material was in PDF.

and even worse than that...

The study did not include the reanalyses that attempted to reproduce the results.

What can we draw from the paper?

Authors find a "lower-upper" bound of 35% irreproducibility.


Availability of data is separated from the ability to analyze it.


Clinicians have a very different conception of reproducibility.

Why is reproducibility difficult for statisticians in medicine?

We are often not the PI.


Clinicians don't understand the extra effort needed for reproducibility.


The extra effort is not budgeted for.


Why is reproducibility difficult for clinicians in medicine?

They are often not aware of what we mean by reproducible.


They barely understand what we are doing to begin with.


A fully reproducible analysis may increase liability (Baggerly and Coombes 2009).


What do we get from reproducible clinical trials?

- More sophisticated inclusion/exclusion criteria based on similar trials.

- Better prognostic factors (cancer naivety as an example).

- Better contextualization of trial results: trial comparison goes from a scarcity problem to an abundance problem.

- Better understanding of histological heterogeneity.


