Automated Citation Searching in Systematic Review Production: A Simulation Study Protocol and Framework
  • Darren Rajit,
  • Lan Du,
  • Helena Teede,
  • Emily Callander,
  • Joanne Enticott
Darren Rajit
Monash Centre for Health Research and Implementation
Lan Du
Monash University Faculty of Information Technology
Helena Teede
Monash Centre for Health Research and Implementation
Emily Callander
University of Technology Sydney Faculty of Health
Joanne Enticott
Monash Centre for Health Research and Implementation

Corresponding Author: [email protected]


Abstract

Citation searching, also known as citation mining or snowball searching, has been recommended as a supplementary search method for evidence retrieval in systematic review production. However, manual citation searching is costly and time-consuming, empirical evidence for its utility is limited, and there is little guidance on how best to incorporate it into systematic review production. Programmatic access to bibliographic databases now makes it possible to explore automated citation searching as a potentially scalable and replicable approach. This study therefore aims to simulate and evaluate exclusively automated citation searching for evidence retrieval against reference standard Boolean search strategies, and to explore the factors that influence its performance.

Methods: A total of 30 systematic reviews will be retrieved from the Cochrane Database of Systematic Reviews, Campbell Systematic Reviews and the Collaboration for Environmental Evidence (CEE). Baseline characteristics will be extracted for each sample review, including the performance of the reference standard Boolean search strategy in terms of recall, precision and F-score (F1 to F3). Seed articles will then be extracted from the background and methods sections of each sample review, together with their baseline characteristics, and automated citation searching will be conducted for different combinations of seed article and database (Semantic Scholar, OpenAlex). Each candidate seed article will be ranked by recall, and the top 10 seed articles will be evaluated in all possible combinations. The final performance of automated citation searching will then be compared against the reference standard Boolean strategy for each sample review.
The association between performance and factors related to (i) automated citation search parameters, (ii) characteristics of the review question, and (iii) characteristics of the initial set of seed articles will be evaluated. Empirical guidance on the use of automated citation searching will then be generated.
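
The evaluation steps described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the records returned by citation searching from each seed article are already available as sets of identifiers (no call to Semantic Scholar or OpenAlex is made), results from several seeds are pooled by set union, and the best combination is chosen by F-beta score. The protocol does not state the pooling rule or the selection criterion, and the names `evaluate`, `best_seed_combination`, `seed_results` and `relevant` are hypothetical.

```python
from itertools import combinations


def evaluate(retrieved, relevant, beta=1.0):
    """Return (recall, precision, F-beta) for a set of retrieved record IDs."""
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    denom = beta ** 2 * precision + recall
    f_beta = (1 + beta ** 2) * precision * recall / denom if denom else 0.0
    return recall, precision, f_beta


def best_seed_combination(seed_results, relevant, top_k=10, beta=3.0):
    """Rank seeds by individual recall, then score every non-empty
    combination of the top_k seeds (assumption: pooled by set union,
    best combination chosen by F-beta)."""
    ranked = sorted(
        seed_results,
        key=lambda seed: evaluate(seed_results[seed], relevant)[0],
        reverse=True,
    )[:top_k]
    best = None
    for size in range(1, len(ranked) + 1):
        for combo in combinations(ranked, size):
            pooled = set().union(*(seed_results[seed] for seed in combo))
            scores = evaluate(pooled, relevant, beta)
            if best is None or scores[2] > best[1][2]:
                best = (combo, scores)
    return best


# Toy data (invented for illustration): four included studies and the
# records hypothetically returned by citation searching from three seeds.
relevant = {"r1", "r2", "r3", "r4"}
seed_results = {
    "seedA": {"r1", "r2", "x1"},
    "seedB": {"r3", "x2", "x3", "x4"},
    "seedC": {"r1", "r4"},
}

combo, (recall, precision, f3) = best_seed_combination(seed_results, relevant)
print(combo, recall, precision, round(f3, 3))
```

With 10 seed articles the exhaustive loop scores 2^10 - 1 = 1023 combinations per review and database, which remains cheap once the per-seed result sets are cached. Setting `beta` to 3 weights recall more heavily than precision, so a different `beta` can select a different combination on the same data.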