Risk of Bias Assessments in Ophthalmology Systematic Reviews and Meta-Analyses
For systematic reviews to support accurate inferences about clinical therapy, the primary studies that constitute the review must provide valid results. The Cochrane Handbook for Systematic Reviews states that assessment of validity is an “essential component” of a review that “should influence the analysis, interpretation, and conclusions of the review” (p. 188) (Higgins 2008). The internal validity of a review’s primary studies must be considered to ensure that bias has not compromised the results and led to inaccurate estimates of summary effect sizes.
In ophthalmology, there is a need for closer examination of the validity of the primary studies comprising a review. As an illustrative example, Chakrabarti et al. (2012) discussed emerging ophthalmic treatments for proliferative (PDR) and nonproliferative diabetic retinopathy (NDR), noting that anti-vascular endothelial growth factor (VEGF) agents consistently received recognition as a possible alternative treatment for diabetic retinopathy. Treatment guidelines from the Scottish Intercollegiate Guidelines Network and the American Academy of Ophthalmology consider anti-VEGF treatment useful merely as an adjunct to laser for treatment of PDR; however, the Malaysian guidelines indicate that these same agents are to be considered in combination with intraocular steroids and vitrectomy. Most extensively, the National Health and Medical Research Council guidelines recommend the addition of anti-VEGF to laser therapy prior to vitrectomy (Chakrabarti 2012). The evidence base informing these guidelines consists of trials of questionable quality. Martinez-Zapata et al. (2014) conducted a systematic review of anti-VEGF treatment for diabetic retinopathy, which included 18 randomized controlled trials (RCTs). Of these trials, seven were at high risk of bias while the rest were unclear in one or more domains. The authors concluded, “
Over the years, researchers have conceived many methods in an attempt to evaluate the validity or methodological quality of primary studies. Initially, checklists and scales were developed to evaluate whether particular aspects of experimental design, such as randomization, blinding, or allocation concealment, were incorporated into the study. These approaches have been criticized for falsely elevating quality scores. Many of these scales and checklists include items that have no bearing on the validity of study findings, such as whether investigators used informed consent or whether ethical approval was obtained (Moher 1995). Furthermore, with the proliferation of quality appraisal scales, it was found that the choice of scale could alter the results of systematic reviews because scales weight their components differently (Jüni 1999). Two such scales, the Jadad scale, also called the Oxford Scoring System (Jadad 1996), and the Downs and Black checklist (Downs 1998), were among the popular alternatives. Quality of Reporting of Meta-analyses (QUOROM) (Moher 1999), the dominant reporting guideline at that time, called for the evaluation of the methodological quality of the primary studies in systematic reviews. This recommendation was short-lived, as the Cochrane Collaboration began to advocate for a new approach to assessing the validity of primary studies. This new method assessed the risk of bias of six particular design features of primary studies, with each domain receiving a rating of low, unclear, or high risk of bias (Higgins 2008). Following suit, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement, the updated reporting guideline, now calls for the evaluation of bias in all systematic reviews (Moher 2009).
A previous review examining primary studies from multiple fields of medicine revealed that the failure to incorporate an assessment of methodological quality can result in the implementation of interventions founded on misleading evidence (Kjaergard 2001). Yet, questions remain regarding the assessment of quality and risk of bias in clinical specialties. Therefore, we examined ophthalmology systematic reviews to determine the degree to which methodological quality and risk of bias assessments were conducted. We also evaluated the particular method used in the evaluation, the quality components comprising these assessments, and how systematic reviewers integrated primary studies with low quality or high risk of bias into their results.
We conducted a PubMed search of MEDLINE for systematic reviews and/or meta-analyses published in the American Journal of Ophthalmology, British Journal of Ophthalmology, Investigative Ophthalmology and Visual Science, JAMA Ophthalmology (formerly Archives of Ophthalmology), The Ocular Surface, Ophthalmology, and Progress in Retinal and Eye Research from 2005 to 2015. We used the following search string: (((((((((“Progress in retinal and eye research”[Journal])) OR “Archives of ophthalmology”[Journal]) OR “Ophthalmology”[Journal]) OR “The ocular surface”[Journal]) OR “American journal of ophthalmology”[Journal])) OR “Investigative ophthalmology & visual science”[Journal])) OR “The British journal of ophthalmology”[Journal] AND ((((meta-analysis[Title/Abstract]) OR meta-analysis[MeSH Terms]) OR systematic review[Title/Abstract]) OR systematic review[MeSH Terms]) OR meta-analysis[Publication Type]. This search strategy was a modification of that of Montori et al. (2005), which has been shown to be sensitive in identifying systematic reviews and meta-analyses. The search was conducted on January 30, 2015. Prior to screening and data abstraction, an abstraction manual was developed to standardize coding practices. This manual was pilot tested using a subset of 25 systematic reviews, and revisions were made as necessary. Following the pilot test, we held a training session for coders based on the manual using a subset of 5 systematic reviews. Results were discussed between coders, and any discrepancies were resolved by consensus. Rater agreement was also calculated on a randomly selected subset of 10 systematic reviews and found to be 99.09%. After training, all full-text articles were retrieved and screened during the coding process. The types of excluded articles are detailed in Figure 1.
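The rater agreement reported above can be illustrated with a simple percent agreement calculation. The following is a minimal sketch in Python, assuming agreement was computed as the share of coding decisions on which two coders matched; the text does not specify the statistic, and the function name `percent_agreement` and the example data are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Return the percentage of items on which two raters gave the same code.

    Assumption: simple percent agreement, not a chance-corrected
    statistic such as Cohen's kappa.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must code the same number of items.")
    if not rater_a:
        raise ValueError("At least one coded item is required.")
    # Count the positions where the two raters recorded identical codes.
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical data: 110 coding decisions with a single disagreement.
coder_1 = ["yes"] * 110
coder_2 = ["yes"] * 109 + ["no"]
agreement = round(percent_agreement(coder_1, coder_2), 2)
```

Under this assumption, a value of 99.09% would be consistent with, for example, 109 matching decisions out of 110, although the underlying counts are not reported.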
We coded the following elements: (a) name of first author; (b) year of publication; (c) name of journal; (d) whether the authors addressed quality/risk of bias; (e) what tool was used for quality/risk of bias assessment; (f) whether the authors used custom measures for quality/risk of bias assessment; (g) whether primary articles in the review were graded; (h) what scale was used for grading; (i) whether quality/risk of bias was found; (j) whether quality/risk of bias was included in the review; (k) whether a follow-up analysis was conducted (subgroup, meta-regression, sensitivity analysis).
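The coded elements listed above could be captured in a structured record so that every review is abstracted in the same way. The following is a minimal sketch in Python; the class name, field names, and validation rules are hypothetical illustrations and are not taken from the abstraction manual.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Follow-up analysis types named in the coding scheme (labels are assumptions).
ALLOWED_FOLLOW_UPS = {"subgroup", "meta-regression", "sensitivity"}


@dataclass
class CodedReview:
    """One abstracted systematic review (hypothetical field names)."""
    first_author: str
    year: int
    journal: str
    addressed_quality_or_rob: bool
    assessment_tool: Optional[str]      # None if no tool was reported
    used_custom_measures: bool
    primary_articles_graded: bool
    grading_scale: Optional[str]        # None if articles were not graded
    quality_or_rob_found: bool
    quality_or_rob_included: bool
    follow_up_analyses: Tuple[str, ...] = ()


def validation_problems(record: CodedReview) -> List[str]:
    """Return a list of problems found in a record; empty if none."""
    problems = []
    # The search covered publications from 2005 to 2015.
    if not 2005 <= record.year <= 2015:
        problems.append("year outside the 2005-2015 search window")
    for analysis in record.follow_up_analyses:
        if analysis not in ALLOWED_FOLLOW_UPS:
            problems.append("unknown follow-up analysis: " + analysis)
    return problems


# Hypothetical example record, not drawn from the study data.
example = CodedReview(
    first_author="Example", year=2010, journal="Ophthalmology",
    addressed_quality_or_rob=True, assessment_tool="Jadad scale",
    used_custom_measures=False, primary_articles_graded=True,
    grading_scale="Jadad scale", quality_or_rob_found=True,
    quality_or_rob_included=True, follow_up_analyses=("sensitivity",),
)
```

A fixed record type of this kind makes missing or out-of-range entries easy to detect before analysis, which is the purpose an abstraction manual serves for human coders.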