Methodological Quality and Risk of Bias Inconsistently Evaluated and Addressed in Meta-Analyses and Systematic Reviews in Major Oncology Journals

Abstract and Key Words


This study aimed to evaluate the reporting and utilization of methodological quality measures in addressing low quality and risk of bias in major oncology journals.


We performed a search of systematic reviews from high impact factor journals in oncology from 2007 to 2015 through PubMed. Covidence was used to screen articles based on the title and abstract. The methodological quality and reporting of risk of bias were evaluated by three rounds of coding from two independent reviewers using the same checklist. Differences in assessment were resolved through group consensus.


Quality assessment was examined in 182 articles after exclusion. Quality or risk of bias was assessed in 48% of articles. The most common were tools adapted from authors’ custom sources (23%), others (14%), and the Cochrane Risk of Bias Tool (13%). Low quality or high risk of bias studies were detected in 40 articles. Subgroup analysis was conducted in 14%, meta-regression in 10%, and sensitivity analysis in 21%. Low quality or risk of bias was not reported in 32 studies. Quality measures were articulated in narrative format (44%), not at all (44%), or in a combination of tables and figures (12%).


Quality and risk of bias were assessed in only half of systematic reviews; moreover, when addressed, the methods of assessment were more commonly determined by the authors rather than following recommended guidelines. This analysis provides further evidence for inconsistent quality measure reporting for clinical findings in oncology manuscripts. Differences between bias assessment and quality reporting could misrepresent intervention results in oncology journals.


meta-analysis; oncology; quality; risk of bias; systematic review


The use of systematic reviews and meta-analyses has become increasingly important in evidence-based medicine as clinicians seek reliable information on treatments and care guidelines in their medical practice (Ebell 2004). Since systematic reviews synthesize evidence from multiple studies, clinicians are able to better understand the individual trials comprising the review as well as the efficacy of the therapy summarized across all available, relevant evidence. One essential feature that lends confidence to the findings of a review is an appraisal of the methodology of studies comprising the review. In cases where systematic reviewers have concluded that primary studies are of high methodological quality or have low potential for biased outcomes, clinicians can have more confidence in the study findings. For example, Yang et al. evaluated the toxicity and efficacy of chemotherapy plus cetuximab compared with chemotherapy alone in patients with advanced non-small cell lung cancer. The systematic review comprised four trials. A risk of bias assessment of these trials was conducted, and the authors concluded that risk of bias was low for overall survival and one-year survival rates but high for all other outcomes due to a lack of blinding. Hence, the reviewers concluded that chemotherapy plus cetuximab was better than chemotherapy alone for improving overall survival; the risk of bias assessment played an important role in the interpretation of the summary effect.

Many scales have been designed in response to concerns regarding methodological quality among primary studies; however, evidence indicates that scales may not be the best way to appraise studies (Jüni 2001; Jüni 1999). Rather, certain design features should be reviewed to provide a clearer picture of bias in trials (Lohr 1999). The Cochrane Handbook for Systematic Reviews of Interventions is continually updated to improve the assessment of methodological quality in clinical studies and advocates for appraising the risk of bias of all primary studies included for review (Higgins 2011). Major reporting guidelines for systematic reviews have been published and suggest some form of quality appraisal. The first guideline, published in 1996, was referred to as the Quality of Reporting of Meta-Analyses (QUOROM) and advocated use of a methodological quality assessment tool for appraisal. More recently, however, QUOROM’s successor, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), called for evaluating the risk of bias of primary studies. This recommendation, consistent with the Cochrane Collaboration, accounts for criticisms of the quality scales, including that certain components of these tools often have no known role in contributing to the validity of findings, such as whether investigators reported oversight by an institutional review board. Inclusion of such items can artificially inflate the overall quality score of a particular study.

Despite a clear move toward progress in this area, there are still significant differences in quality assessment practices between systematic reviews (Higgins 2008). In fact, little is known about the application of methodological quality or risk of bias measures in clinical specialties like oncology. To address this issue, we conducted a study of the oncology literature to assess how often quality and risk of bias assessments were used in oncology systematic reviews, determine the prevalence of approaches reported by the authors, and examine the ways that such evaluations are incorporated into the reviews.


Search criteria and eligibility

Using the h5-Index from Google Scholar Metrics, we selected the six oncology journals with the highest index scores. We searched PubMed using the following search string: ((((((“Journal of clinical oncology : official journal of the American Society of Clinical Oncology”[Journal] OR “Nature reviews. Cancer”[Journal]) OR “Cancer research”[Journal]) OR “The Lancet. Oncology”[Journal]) OR “Clinical cancer research : an official journal of the American Association for Cancer Research”[Journal]) OR “Cancer cell”[Journal]) AND (“2007/01/01”[PDAT] : “2015/12/31”[PDAT]) AND “humans”[MeSH Terms]) AND (((meta-analysis[Title/Abstract] OR meta-analysis[Publication Type]) OR systematic review[Title/Abstract]) AND (“2007/01/01”[PDAT] : “2015/12/31”[PDAT]) AND “humans”[MeSH Terms]) AND ((“2007/01/01”[PDAT] : “2015/12/31”[PDAT]) AND “humans”[MeSH Terms]). This search strategy was adapted from a previously established method that is sensitive to identifying systematic reviews and meta-analyses (Montori 2005). Searches were conducted on May 18 and May 26, 2015.
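The search string above repeats its date and species filters several times. As an illustration only, the following Python sketch assembles a shorter query that is intended to be logically equivalent (each filter applied once). The function name `build_pubmed_query` and its parameters are hypothetical, and the resulting string has not been checked against PubMed to confirm it returns the same records.

```python
# Hypothetical helper: assembles a simplified form of the search string.
# It only builds text; it does not contact PubMed.

JOURNALS = [
    "Journal of clinical oncology : official journal of the American Society of Clinical Oncology",
    "Nature reviews. Cancer",
    "Cancer research",
    "The Lancet. Oncology",
    "Clinical cancer research : an official journal of the American Association for Cancer Research",
    "Cancer cell",
]


def build_pubmed_query(journals, start, end):
    """Combine journal, date, species, and review-type filters with AND."""
    journal_clause = " OR ".join(f'"{name}"[Journal]' for name in journals)
    date_clause = f'("{start}"[PDAT] : "{end}"[PDAT])'
    type_clause = (
        "meta-analysis[Title/Abstract] OR meta-analysis[Publication Type] "
        "OR systematic review[Title/Abstract]"
    )
    return (
        f"({journal_clause}) AND {date_clause} "
        f'AND "humans"[MeSH Terms] AND ({type_clause})'
    )


query = build_pubmed_query(JOURNALS, "2007/01/01", "2015/12/31")
print(query)
```

Building the query from a list makes it easier to see that six journals are searched and that the date window and human-studies filter apply to every record.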

Screening and data extraction

We used Covidence to initially screen articles based on title and abstract. To qualify as a systematic review, articles had to summarize evidence across multiple studies and provide information on the search strategy, such as search terms, databases, or inclusion/exclusion criteria (Babineau 2014). Meta-analyses were classified as quantitative syntheses of results across multiple studies (Onishi 2014). Two screeners independently reviewed the titles and abstracts of each citation and made a decision regarding its suitability for inclusion based on the definitions previously described. Next, the screeners held a meeting to revisit the citations in conflict and arrive at a final consensus. Following the screening process, full-text versions of included articles were obtained via EndNote.
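The dual-screening step can be pictured as a comparison of two independent decision lists, with disagreements set aside for the consensus meeting. The sketch below is a minimal illustration under that assumption; the citation identifiers, decision labels, and the function `find_conflicts` are hypothetical and do not represent Covidence's interface or the study's actual data.

```python
# Hypothetical illustration of flagging citations for consensus discussion.

def find_conflicts(decisions_a, decisions_b):
    """Return citation IDs on which the two screeners disagree."""
    shared = decisions_a.keys() & decisions_b.keys()
    return sorted(cid for cid in shared if decisions_a[cid] != decisions_b[cid])


# Toy decisions from two screeners (not data from this study).
screener_a = {"c001": "include", "c002": "exclude", "c003": "include"}
screener_b = {"c001": "include", "c002": "include", "c003": "include"}

conflicts = find_conflicts(screener_a, screener_b)
print(conflicts)  # ['c002']
```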

To standardize the coding process, an abstraction manual was developed and pilot tested. After completing this process, a training session was conducted to familiarize coders with abstracting the data elements. A subset of studies was jointly coded. After the training exercise, each coder was provided with three new articles to code independently. Each coder was next assigned an equal subset of articles for data abstraction. We coded the following elements: a) whether methodological quality or risk of bias was assessed, and if so the tool used; b) whether authors developed a customized measure; c) whether methodological quality was scored, and if so, what scale was used; d) whether authors identified high risk of bias or low-quality studies; e) whether high risk of bias or low-quality studies were included in the estimation of summary effects; f) how risk of bias or quality appraisal information was presented in the article; and g) whether follow-up analyses were conducted to explore the effects of bias on study outcomes (such as subgroup analysis, sensitivity analysis, or meta-regression).
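One way to picture the coded elements (a) through (g) is as a single record per article. The following sketch is an assumption about how such a record could be laid out; the class name, field names, and example values are hypothetical and are not taken from the abstraction manual.

```python
# Hypothetical record layout for the coded elements (a)-(g).
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AbstractionRecord:
    article_id: str
    quality_assessed: bool                          # (a) was quality/risk of bias assessed?
    tool_used: Optional[str] = None                 # (a) tool named by the authors, if any
    custom_measure: bool = False                    # (b) author-developed measure?
    scored: bool = False                            # (c) was quality scored?
    scale: Optional[str] = None                     # (c) scale used, if scored
    low_quality_identified: Optional[bool] = None   # (d) high risk/low quality found?
    low_quality_included: Optional[bool] = None     # (e) included in summary effects?
    presentation: Optional[str] = None              # (f) narrative, table, figure, ...
    followup_analyses: List[str] = field(default_factory=list)  # (g)


# Toy example (not an article from this study).
example = AbstractionRecord(
    article_id="demo-001",
    quality_assessed=True,
    tool_used="Cochrane Risk of Bias Tool",
    scored=True,
    scale="high/low/unclear",
    low_quality_identified=True,
    low_quality_included=True,
    presentation="narrative",
    followup_analyses=["sensitivity analysis"],
)
print(example.tool_used, example.followup_analyses)
```

Optional fields default to `None` so that items which only apply when quality was assessed can be left blank for the remaining articles.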

Data Analysis

We performed a descriptive analysis of the frequency and percent use of quality assessment tools used, types of tools, types of scales used, how the quality information was presented, and types of methods used to deal with risk of bias or low quality. In assessing the types of tools used to measure quality, we created some additional categories to account for the variations in approaches. We coded an appraisal as “author’s custom measure” if authors described their own approach to evaluating study quality. In situations where the author used a quality assessment method adapted from another study, we coded this as “adapted criteria.” Some studies indicated (either in the abstract or in the methods section) that methodological quality was assessed, but there was no specific detail beyond this generic statement. These were coded as “unspecified.” Statistical analyses were performed with Stata version 13.1 software (StataCorp, College Station, Texas, USA).
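The descriptive analysis amounts to tabulating the frequency and percentage of each coded category. The analysis itself was run in Stata; the Python sketch below only illustrates the same kind of tabulation, using toy labels rather than the study data. The function name `frequency_table` is hypothetical.

```python
# Illustrative frequency/percent tabulation on toy data (not the study data).
from collections import Counter


def frequency_table(values):
    """Return (category, count, percent) rows sorted by descending count."""
    counts = Counter(values)
    total = sum(counts.values())
    return [
        (category, n, round(100 * n / total, 1))
        for category, n in counts.most_common()
    ]


toy_tools = [
    "author's custom measure",
    "author's custom measure",
    "adapted criteria",
    "unspecified",
]

for category, n, pct in frequency_table(toy_tools):
    print(f"{category}: {n} ({pct}%)")
```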


The PubMed search resulted in 337 articles from four journals. After screening titles and abstracts, 79 were excluded because they were not systematic reviews or meta-analyses. An additional 76 articles were excluded after full text screening. Two articles could not be retrieved after multiple attempts. A total of 182 articles were included in this study (Figure 1).

Methodological quality or risk of bias assessment was conducted in 42% (77/182) of systematic reviews. Of the 77 articles where assessment of methodological quality or risk of bias was identified, 51.95% (40/77) found either low methodological quality or high risk of bias in primary studies comprising systematic reviews. Studies with an unclear risk of bias or unknown methodological quality were reported in 41.56% (32/77) of reviews; five cases (6.49%, 5/77) reported no issues with study quality or risk of bias.
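The percentages in this paragraph follow directly from the reported counts. As a quick arithmetic check, the sketch below recomputes them from the numerators and denominators stated in the text; the helper name `percent` is hypothetical.

```python
# Recompute the reported percentages from the counts given in the text.

def percent(numerator, denominator, digits=2):
    """Return numerator/denominator as a percentage rounded to `digits`."""
    return round(100 * numerator / denominator, digits)


assessed = percent(77, 182)        # reviews that assessed quality or risk of bias
low_or_high_risk = percent(40, 77)  # found low quality or high risk of bias
unclear = percent(32, 77)           # unclear risk or unknown quality
no_issues = percent(5, 77)          # reported no issues

print(assessed, low_or_high_risk, unclear, no_issues)
# The three subgroups account for all 77 assessed reviews.
print(40 + 32 + 5)
```

The three subgroup counts sum to 77, so the categories are mutually exclusive and exhaustive for the reviews that reported an assessment.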

The most common approaches to evaluating risk of bias or methodological quality were those designed by authors (23.4%, 18/77). The Cochrane Risk of Bias Tool was the most commonly reported standardized measure used by systematic reviewers (14.3%, 11/77), followed by the Newcastle–Ottawa scale (10.4%, 8/77), the Jadad scale (10.4%, 8/77), QUADAS-2 (5.19%, 4/77), and QUADAS (3.9%, 3/77). Measures adapted from previous work were reported by 13% (10/77). Other measures used only once are reported in Table 1 and represented 10.4% (8/77) of the approaches used. There were 25 studies with low quality or high risk of bias that were included with (78%, 35/45). From included studies, subgroup analysis was conducted in 13% (11/77). Meta-regression was used to address bias and quality problems in 9% of the 45 articles that assessed quality. Sensitivity analysis was used to address bias and quality reporting issues in 18% of studies analyzed.

We examined the scales by which reviewers scored or categorized studies. This information was reported in 56 systematic reviews. For risk of bias assessments, the high/medium/low format was used most commonly (20%, 11/56) followed by high/low/unclear (14%, 8/56). Methodological quality was most commonly assessed using a 0-5 point scale (16.07%, 9/56) followed by Good/Fair/Poor (7.14%, 4/56) and 1-9 point scale (5.36%, 3/56).

Methodological quality information was articulated largely in narrative format (44%, 34/77) or not at all (44%, 34/77). Additional forms of presentation included combinations of figures and narrative (5%, 4/77), combinations of table and narrative (3%, 2/77), and single formats of presentation, either a table or a figure (3%, 2/77). The combination of tables, figures, and narrative was used in 1% of assessed articles.


This study provides a comprehensive and recent assessment of methodological quality and risk of bias assessment in oncology journals. Our main findings indicate that reporting of quality assessment in systematic reviews and meta-analyses in major oncology journals is moderate to low, with actual assessment of methodological quality being present in only 48% of studies. This is low in comparison with similar studies assessing frequency of risk of bias evaluations. Hopewell et al., for example, found that 80% of non-Cochrane reviews reported methods for evaluating methodological quality or risk of bias.

The inclusion of studies with high risk of bias or low quality in deriving summary effects was also an issue, with 76% of studies in our sample including such studies in calculating results; however, this is comparable with the proportion reported in previous studies. Hopewell et al. reported that 75% of reviews in their study contained one or more trials with a high risk of bias (Hopewell 2013). It should be noted that Hopewell et al. used the Cochrane Database of Systematic Reviews, which is known for its stringent adherence to Cochrane guidelines, of which risk of bias evaluations are a routine part. This may contribute to differences between our findings and theirs.

Despite the presence of high risk of bias or low quality studies, most review authors did not conduct a further analysis to explore the influence of bias on study outcomes. Perhaps more interesting was the number of systematic reviews that reported an assessment of study quality but left the reader to wonder what had become of those evaluations. A significant number included such studies without further mention of the quality of evidence. Narrative styles were the most common means of presenting quality assessment information; however, a table format would give readers easier access to quality or risk of bias information. We advocate for a more structured approach, such as tables published in the review, to display such information.

Future research should continue to investigate these evaluation practices in systematic reviews in other clinical specialties. While the majority of research has been confined largely to Cochrane systematic reviews, as well as those published in high profile medical journals, there is a need to understand whether systematic reviewers in clinical specialties conduct these assessments and, if so, what assessments they use.

Yang ZY, Liu L, Mao C, Wu XY, Huang YF, Hu XF, Tang JL. Chemotherapy with cetuximab versus chemotherapy alone for chemotherapy-naive advanced non-small cell lung cancer.