
A systematic literature review of the public health informatics predictive models that utilize data from EHR systems as a data source
Jim Clavin
University of Maryland, Baltimore County
Miles Norton
University of Maryland, Baltimore County
Anup Haridas
University of Maryland, Baltimore County
Linda Le
University of Maryland, Baltimore County
Jack Shan
University of Maryland, Baltimore County
Raad Mustafa
University of Maryland, Baltimore County

Abstract

Background: A systematic literature review was conducted to identify data sources used in place of, or in conjunction with, electronic health record (EHR) data in predictive models for influenza-like illness (ILI) outbreaks.
Objectives: To determine how predictive models for ILI outbreaks use EHR data and how often EHR data is used in ILI surveillance and forecasting.
Methods: Articles were sourced from PubMed and the Journal of Medical Internet Research (JMIR). Results from these online sources were filtered down to a corpus of 48 studies. From these studies, 10 dummy and 10 categorical variables were identified and recorded in a Google Sheets spreadsheet; data visualizations were built from the spreadsheet using Tableau Public, and descriptive analytics were reviewed.
Results: From the articles, 84 data sources were identified, of which 14 (17%) were data from EHRs. EHR data was used in 5% of the studies that also leveraged either governmental or syndromic surveillance data. Likewise, EHR data was used in 5% of the studies that incorporated Google search and trend data. The most frequently used modeling approach was autoregression (15%), followed by machine learning algorithms (13%). EHR data was used only in studies from the United States (9 studies) and Europe (4 studies).
Conclusion: Predictive models can use EHR data, either in isolation or in tandem with other data sets in an ensemble approach, to signal alert levels earlier than existing government-provided models in regions where such data is available; however, adoption of EHR data remains limited.
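
The tallying described in Methods and the share reported in Results can be illustrated with a minimal sketch. It assumes a coding sheet with one row per study and one 0/1 dummy column per data-source type; the column names and toy rows below are hypothetical and are not the review's data. Only the counts 14 and 84 come from the abstract.

```python
from collections import Counter

# Hypothetical toy coding sheet: one row per study, one 0/1 dummy column
# per data-source type. Names and values are illustrative assumptions only.
studies = [
    {"study": "A", "ehr": 1, "government": 1, "search_trends": 0},
    {"study": "B", "ehr": 0, "government": 1, "search_trends": 1},
    {"study": "C", "ehr": 0, "government": 0, "search_trends": 1},
    {"study": "D", "ehr": 1, "government": 0, "search_trends": 0},
]


def source_counts(rows):
    """Sum each dummy column to count how often a data source appears."""
    counts = Counter()
    for row in rows:
        for key, value in row.items():
            if key != "study":
                counts[key] += value
    return counts


def share(part, total):
    """Return part/total as a percentage rounded to the nearest integer."""
    return round(100 * part / total)


counts = source_counts(studies)
total_sources = sum(counts.values())
ehr_share_toy = share(counts["ehr"], total_sources)

# Counts stated in the abstract: 14 EHR data sources out of 84 identified.
ehr_share_reported = share(14, 84)

print(f"Toy sheet: EHR is {ehr_share_toy}% of {total_sources} data sources")
print(f"Abstract counts: EHR is {ehr_share_reported}% of 84 data sources")
```

With the abstract's counts, the rounded share matches the 17% stated in Results.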