Publication bias evaluations are not routinely conducted in clinical oncology systematic reviews


Background: Publication bias (PB) can produce exaggerated estimates of summary effects in systematic reviews (SRs). The extent to which SRs in oncology journals assess PB remains to be determined.

Methods: We identified SRs published in high-impact-factor oncology journals between 2007 and 2015 using a PubMed search. Articles were screened and coded for PB assessment. For SRs that did not evaluate PB, we performed an additional assessment using Egger’s regression and the trim-and-fill method.

Findings: Of 182 included SRs, 52 performed a PB assessment. The most common form of assessment was a funnel plot supplemented by Egger’s regression or Begg’s test (44%, 23/52). PB was detected in 10 of these SRs (19%, 10/52). SRs that claimed to follow a reporting guideline frequently failed to do so with regard to assessing PB. In our independent assessments of the SRs in our sample that did not evaluate PB, the magnitude of effect sizes generally decreased.

Interpretation: Our study shows that PB assessments are underutilized by SRs in clinical oncology. Additionally, the methodological validity of SRs can be improved by adhering to reporting guidelines and by searching grey literature and clinical trial registries.

Funding: No external source of funding

Research in context

Evidence before this study: Publication bias has been widely researched as a threat to the validity of systematic reviews. It is described in depth in the Cochrane Handbook and numerous other studies. Onishi et al. found that publication bias was underreported among high-impact-factor journals across all fields of medicine, raising the question of how journals within clinical specialties compare.

Added value of this study: Our study gives valuable insight into the extent to which publication bias is assessed by systematic reviews in clinical oncology. It is also one of the first articles of this sample size to demonstrate the effect publication bias has on effect sizes in systematic reviews that did not evaluate it.

Implications of all the available evidence: Our study, in combination with previous efforts, calls attention to the problem of publication bias in systematic reviews. It also reveals deficiencies in searching grey literature and clinical trial registries to discover unpublished studies.


Publication bias (PB) can arise when the sample of studies that authors consider for a systematic review (SR) is not representative of the population of completed studies, with a likely over-representation of published works reporting statistically significant outcomes. The consequence is an inaccurate, and likely exaggerated, estimate of summary effects. Studies with statistically significant results are published more frequently than studies with null or negative outcomes. This occurs for many reasons: such studies are more likely to be published sooner, published in high-impact-factor journals, published in English, and published at all, compared with studies that report non-significant results (Sterne 2001). However, the use of more sophisticated search methods can minimize the degree of PB (Rothstein 2005; Hopewell 2007). In the largest PB study to date, an analysis of 1,106 meta-analyses from the Cochrane Database of Systematic Reviews found evidence of PB despite the high quality of these reviews; PB had decreased in recent years, suggesting greater effectiveness of the measures used to reduce bias (Kicinski 2015). Ryder et al. examined the uptake of methods to reduce PB and how these approaches had improved over time. For example, use of MEDLINE, the Cochrane Library, and CINAHL, as well as reference checking, all increased from 1996 to 2006. Over the same period, searches for non-English-language articles increased by 20% and searches for unpublished studies by 26%.

Many approaches have been developed to assess PB. The most commonly used methods include funnel plots, Egger’s test, Begg’s test, the trim-and-fill method, fail-safe N, and modeling (Parekh-Bhurke 2011). These methods take varying approaches. For instance, evaluating the asymmetry of a funnel plot provides a clear visual representation of bias, with effect estimates placed on the horizontal axis and a measure of study size (or precision) on the vertical axis. Egger’s test quantifies the asymmetry of the funnel plot. The trim-and-fill method corrects for funnel plot asymmetry by imputing the studies needed to achieve a symmetrical plot (Duval 2000). The fail-safe N method calculates the number of additional “negative” studies that would reduce the summary effect to a null value, but it is not recommended because the estimate of fail-safe N is highly dependent on the mean intervention effect assumed for the unpublished studies (Iyengar 1988).
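To make the mechanics of one of these tests concrete, the following is a minimal sketch of Egger’s regression test in Python. It is not the implementation used in this study; the function name and the example data are hypothetical. The standard formulation regresses the standardized effect (effect divided by its standard error) on precision (the inverse of the standard error), and an intercept significantly different from zero suggests funnel-plot asymmetry.

```python
import numpy as np
from scipy import stats

def eggers_test(effects, std_errs):
    """Egger's regression test for funnel-plot asymmetry (illustrative sketch).

    Regresses the standardized effect (effect / SE) on precision (1 / SE);
    an intercept far from zero suggests small-study effects such as PB.
    Returns the intercept and its two-sided p-value.
    """
    effects = np.asarray(effects, dtype=float)
    std_errs = np.asarray(std_errs, dtype=float)
    y = effects / std_errs          # standardized effects
    x = 1.0 / std_errs              # precision
    n = len(effects)

    slope, intercept, _, _, _ = stats.linregress(x, y)

    # Standard error of the intercept from the OLS residuals
    resid = y - (intercept + slope * x)
    s2 = np.sum(resid ** 2) / (n - 2)
    x_mean = x.mean()
    se_intercept = np.sqrt(s2 * (1.0 / n + x_mean ** 2 / np.sum((x - x_mean) ** 2)))

    t_stat = intercept / se_intercept
    p_value = 2.0 * stats.t.sf(abs(t_stat), df=n - 2)
    return intercept, p_value

# Hypothetical per-study effect estimates and standard errors
intercept, p = eggers_test([0.50, 0.40, 0.60, 0.45, 0.55],
                           [0.10, 0.20, 0.15, 0.12, 0.18])
```

In practice, reviewers would typically rely on an established meta-analysis package rather than hand-rolled code, and Egger’s test is known to have low power when a meta-analysis includes few studies.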

Publication bias is a potentially serious threat to the validity of SRs (Rothstein 2007). A PB assessment is an integral part of the SR process and is required by commonly adopted reporting guidelines, such as the Quality of Reporting of Meta-analyses (QUOROM) (Moher 1999), Meta-analysis of Observational Studies in Epidemiology (MOOSE) (Stroup 2000), and Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) (Liberati 2009). However, despite the acceptance of PB assessment as a vital component of the SR process, it is significantly underreported. A 2007 study of 300 SRs revealed that only 23.1% reported an evaluation of PB (Moher 2007). Ryder et al. found that although the proportion of Cochrane reviews assessing PB with a funnel plot had increased from 6% in 1996 to 26% in 2006, there was still vast room for improvement. More recently, Onishi and Furukawa examined SRs in the ten highest-impact-factor journals and found that about 69% did, in fact, examine PB (Onishi 2014). However, that study was limited to articles from high-impact-factor journals published in 2011 and 2012, and questions remain about the frequency of PB assessment in clinical specialties, such as oncology, which may have different methodological standards.

For this study, we examined oncology because it is a discipline in which PB has not been sufficiently studied. We evaluated how often PB was assessed in clinical oncology SRs, what methods were used, how often these assessments detected PB, whether authors adhered to reporting guidelines requiring a PB assessment, and whether search methods were used to limit the likelihood of PB. In addition, we evaluated PB for the SRs that did not perform an assessment themselves, to determine how often PB went unreported.