What are the benefits of calculating the R-factor?
Perhaps the best way to begin answering this question is by mentioning a
recent study \cite{Benjamin_2017} in which 196 cancer researchers, including 138
experts, were asked to predict whether the reports that the reproducibility
project was set to verify, including two studies we have analyzed \cite{Sugahara_2009,Willingham_2012}, would be confirmed. The authors concluded that the scientists were poor
forecasters who overestimated the validity of the studies \cite{Benjamin_2017}.
We would like to suggest that the scientists would do much better if
they could see the R-graphs of the studies in question (Fig. \ref{900585}). For
example, knowing that 8 out of 9 studies that tested a claim confirmed
it (Fig. \ref{900585}, middle, year 2015, when the study by Benjamin et al. began)
would not only make the prediction more accurate, if not easy, but would
also raise the question of whether the tenth attempt to verify this claim
is justified and, once the replication study found the claim to be
irreproducible, whether this conclusion itself needs an independent
review. Instead, scientists outside the narrow field had to rely
on their intuition because the required information was not readily
available. This is the deficiency that calculating the R-factor and
making the results freely available can correct.
The R-factor is relatively easy to calculate, as the process requires no
laboratory equipment, laboratory animals, or reagents, and can be done
by anyone with general expertise in biomedical research. The
calculation is also much faster than experimental replication: all three
studies (Fig. \ref{900585}) were evaluated by one person within a week.
Since the R-factor uses not one, but all reports that have evaluated a
claim (10 to 18 in the examples we used), one can argue that the
confidence level that the R-factor provides is at least as valid as that
provided by a replication study, unless no reports citing the claim of
interest are available, in which case a replication study is in order.
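To make this explicit, one way to formalize the calculation, consistent
with the examples above (the notation $N_{\mathrm{confirm}}$ and
$N_{\mathrm{refute}}$ is introduced here only for illustration), is
\[
R \;=\; \frac{N_{\mathrm{confirm}}}{N_{\mathrm{confirm}} + N_{\mathrm{refute}}},
\]
where $N_{\mathrm{confirm}}$ and $N_{\mathrm{refute}}$ count the citing
reports that confirmed or refuted the claim, and citations that merely
mention the claim are excluded. On this reading, a claim confirmed by 5
of its 10 evaluating reports would have $R = 5/10 = 0.5$, and a claim
confirmed by all of them would have $R = 1.0$.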
The R-factor is universal in that it is applicable to any scientific
claim, based on either experimental or theoretical work, and, by
extension, to individual researchers, laboratories, institutions,
countries, or any other group, with no inherent constraints on how many
reports produced by these groups can be evaluated. Likewise, the
R-factor can be calculated for each claim made in a report, should it
make more than one.
Since the R-factor can be anywhere between 0 and 1, it reflects the
realities of experimental science, where a binary scale of right and
wrong is not always applicable, especially at the initial stages of
developing an idea, or when the complexity of the experimental system
means that a definitive answer takes time to emerge. For example, the
R-factor of 1.0 for the claim by Ward et al. can be explained by the
fact that the claim can be verified unambiguously by measuring the
activity of the IDH mutants with an established approach. The R-factor
of 0.88 for the claim
by Willingham et al. may reflect the debate on whether the mechanisms
underlying the effect of CD47 antibodies are more complex than initially
envisioned (reviewed in \cite{Matlung_2017}).
The R-factor of 0.5 for the claim by Sugahara et al. warns that the
claim might be untrue, which may surprise a reader who relies on
citation indexes and impact factors, as the article has been cited 405
times and was published in \textit{Science}, a top journal. However,
the R-factor of 0.5 also leaves open the possibility that the claimed
approach is applicable to some systems and suggests that further
testing is needed, which is where the replication initiatives can be
very helpful. Cases like that of Sugahara et al., and the opportunity
to contribute to evaluating them through the R-factor, might invite
researchers to report unsuccessful attempts to test reported claims, as
so-called negative results often go unpublished because they are
considered inconsequential.
Because the R-factor relies on experimental reports from experts in the
field, this approach alleviates or bypasses the concerns associated with
replication initiatives \cite{Bissell_2013}, such as the lack of
technical expertise or of suitable experimental models in a laboratory
specialized in replicating prior studies. This approach also bypasses
the debate on what it means to replicate a study, as it merely asks
whether the main claim of a study, typically formulated in the title of
the report, is confirmed or not. For example, the ongoing clinical
trials of CD47 antibodies \cite{Matlung_2017} cannot, in principle, replicate the study by Willingham et al., as
that study used mice, but the trials would confirm or refute its main claim.
Finally, the R-factor and the information that comes with it (Fig.
\ref{900585},
Data) allow a researcher to focus on the articles that tested the
claim, an opportunity that can be especially valuable for highly
cited reports, as the majority of these citations (97.7\%, 92\%, and
97.5\% for cases 1--3) merely mention the cited report without evaluating
it experimentally. As previous studies have illustrated
\cite{Greenberg_2009}, the sheer number of
mentioning citations, aided by their skillful use, can make a field
accept a dubious claim as fact.
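For a rough sense of scale implied by these numbers: if, say, 97.5\% of
a report's 405 citations merely mention it, then only about
\[
405 \times (1 - 0.975) \approx 10
\]
citing reports actually evaluated the claim, consistent with the 10 to
18 evaluating reports per claim noted above.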