Modelling incident case numbers
Extending standard models with GTD improved model quality (AIC without
GTD: 6041.446; with GTD: 6038.929). To validate these results, the same
models were fitted to data describing mostly the decreasing phase of the
epidemic. The findings were consistent with the previous modelling:
model quality was again slightly better with the addition of GTD
(AIC without GTD: 13101.59; with GTD: 13100.27). Detailed results
are available as Supplementary material (S1).
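As a reading aid, the reported AIC values can be turned into AIC differences and relative likelihoods, a common way of judging how much support a lower AIC lends one model over another. The sketch below is illustrative only: the relative-likelihood formula exp((AIC_best - AIC_other) / 2) is a standard convention and is not stated to be part of the original analysis, and the labels "first fit" and "validation fit" are assumed names for the two comparisons.

```python
from math import exp

# AIC values as reported in the text: (without GTD, with GTD).
# The dictionary keys are illustrative labels, not the authors' terms.
reported_aic = {
    "first fit": (6041.446, 6038.929),
    "validation fit": (13101.59, 13100.27),
}


def relative_likelihood(aic_other, aic_best):
    """Relative likelihood of the higher-AIC model versus the best one."""
    return exp((aic_best - aic_other) / 2)


summary = {}
for label, (aic_without, aic_with) in reported_aic.items():
    delta = aic_without - aic_with  # positive means GTD lowered the AIC
    summary[label] = (delta, relative_likelihood(aic_without, aic_with))

for label, (delta, rel) in summary.items():
    print(f"{label}: delta AIC = {delta:.3f}, relative likelihood = {rel:.3f}")
```

Both differences are positive, so the GTD-extended model has the lower AIC in each comparison. The size of the differences is modest, in line with the "slightly better" wording above.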
Examination of social media use and its relationship to disease
incidence is now commonplace across multiple countries.
Cross-correlations showed a clear relationship between GTD and reported
case incidence across a number of European countries. The quality of
time series modelling, as indicated by AIC values, was also enhanced by
the addition of GTD. This suggests that such data could be of real
utility in disease modelling, and possibly forecasting, across country
boundaries. They could be particularly useful where traditional disease
surveillance is challenging.
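The cross-correlation idea can be sketched as a Pearson correlation between two series at a range of lags. This is a minimal illustration in plain Python, not the code used in the study. The function names and the two short series are hypothetical, and the made-up data are built so that "cases" follows "searches" by exactly two time steps, purely to show the function recovering a known offset.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def cross_correlation(x, y, max_lag):
    """Correlation between x[t] and y[t + lag] for each lag in [-max_lag, max_lag]."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:len(y) + lag]
        result[lag] = pearson(a, b)
    return result


# Hypothetical data: search interest, and case counts delayed by two steps.
searches = [1, 3, 7, 12, 20, 26, 22, 15, 9, 5, 2, 1]
cases = [0, 0, 1, 3, 7, 12, 20, 26, 22, 15, 9, 5]

ccf = cross_correlation(searches, cases, max_lag=4)
best_lag = max(ccf, key=ccf.get)
print(best_lag, round(ccf[best_lag], 3))
```

A positive best lag here means the first series leads the second. A strongest correlation at lag 0 would instead indicate a contemporaneous relationship.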
Country-specific factors, possibly differences in testing and case
reporting, probably play a critical role. Reported case numbers may not
truly reflect disease occurrence; they may only reflect how vigorous
testing regimes are. This was mitigated by examining the increasing and
decreasing phases of the epidemic separately; reported case numbers are
likely to be more reliable at the beginning of an epidemic, when the
majority of cases can be identified.
A recent review highlighted the fact that although the number of studies
examining the relationship between Internet searching and disease
occurrence is growing, few such studies go beyond data description and
use such data in disease modelling and forecasting
(Mavragani et al., 2018). This is illustrated by previous studies on
conditions related to COVID-19, for example those examining GTD and the
MERS-CoV outbreak (Fung et al., 2013; Shin et al., 2016). Studies
examining the correlation between GTD and COVID-19 are rapidly appearing
(Effenberger et al., 2020; Husnayain et al., 2020; Walker and Sulyok,
2020). However, none of these studies has used such data in disease
modelling as was performed here.
In conclusion, GTD showed a strong contemporaneous correlation with
incident case numbers across Europe. Adding GTD also improved the
quality of disease models that otherwise used only case numbers, for a
range of European countries. This improvement suggests such techniques
could be used across country boundaries. This is potentially important
as COVID-19 reaches new states, especially ones where testing and
surveillance are not as reliable as in Europe.
Conflict of Interest Statement: We have no conflict of interest
to declare.
Ethics Statement: The authors confirm that the ethical policies
of the journal, as noted on the journal’s author guidelines page, have
been adhered to. No ethical approval was required since
completely anonymized data were obtained from publicly available
sources.
Data Availability Statement: All data and the statistical
analysis code are available at:
https://github.com/msulyok/COVID19GoogleTrendsEurope