Spatiotemporal Associations Between Social Vulnerability,
Environmental Measurements, and COVID-19 in the Conterminous United
States.
Daniel P. Johnson, Ph.D.*1, Niranjan
Ravi2, and Christian V. Braneon,
Ph.D.3,4
*Corresponding Author
1. Indiana University - Purdue University at
Indianapolis, Department of Geography
email: dpjohnso@iu.edu2. Indiana
University - Purdue University at Indianapolis, Department of Electrical
and Computer
Engineering3. Goddard Institute for
Space Studies
4. SciSpace, LLC
Key Points
- Patterns of COVID-19 cases and deaths vary considerably through time
and space.
- COVID-19 cases and deaths concentrated in areas of increased social
vulnerability at different times of the year.
- The relationship between social vulnerability, environmental
measurements, and COVID-19 cases and deaths is spatially and
temporally variable.
Abstract
This study introduces the results from fitting a Bayesian hierarchical
spatiotemporal model to COVID-19 cases and deaths at the county-level in
the United States for the year 2020. Two models were created, one for
cases and one for deaths, utilizing a scaled Besag, York, Mollié model
with Type I spatial-temporal interaction. Each model accounts for 16
social vulnerability variables and 7 environmental measurements as fixed
effects. The spatial structure of COVID-19 infections is heavily focused
in the southern U.S. and the states of Indiana, Iowa, and New Mexico.
The spatial structure of COVID-19 deaths covers less of the same area
but also encompasses a cluster in the Northeast. The spatiotemporal
trend of the pandemic in the U.S. illustrates a shift out of many of the
major metropolitan areas into the U.S. Southeast and Southwest during
the summer months and into the upper Midwest beginning in autumn.
Analysis of the major social vulnerability predictors of COVID-19
infection and death found that counties with higher percentages of those
not having a high school diploma and having non-white status to be
significant. Age 65 and over was a significant factor in deaths but not
in cases. Among the environmental variables, above ground level (AGL)
temperature had the strongest effect on relative risk to both cases and
deaths. Hot and cold spots of COVID-19 cases and deaths derived from the
convolutional spatial effect show that areas with a high probability of
above average relative risk have significantly higher SVI composite
scores. Hot and cold spot analysis utilizing the spatiotemporal
interaction term exemplifies a more complex relationship between social
vulnerability, environmental measurements, and cases/deaths.
Plain Language Summary
COVID-19 affects different locations at different points in time and
understanding its impact on communities is an imperative research
effort. Communities that are considered socially vulnerable – less
resilient to hazards – are disproportionately impacted by pandemics and
other environmental stresses. In this study, we utilize a modelling
approach that accounts for COVID-19 cases and deaths, social
vulnerability, environmental measurements and both space and time
domains at the U.S. county level from March 1 – December 31, 2020.
Throughout much of the time period, cases were clustered in the U.S.
South and the states of Indiana, Iowa and New Mexico. Deaths clustered
(although less in extent) in many of these same areas along with the
addition of some urbanized counties in the U.S. Northeast. Measurements
of social vulnerability were higher in these longer term clusters for
both cases and deaths. Examining short-term clusters on a monthly basis,
COVID-19 cases and deaths focused more heavily in socially vulnerable
areas during the summer and autumn months respectively. The individual
social vulnerability variable of not having a high school diploma and
non-white status were the most significant contributors to relative risk
to both cases and deaths. Age 65 and over contributed significantly to
deaths but not to cases. Temperature, with an inverse relationship, had
the strongest effect on risk among the environmental measurements. The
remaining variables had differing levels of importance in the models.
Social vulnerability measures were higher in areas where there was an
increased risk of COVID-19 infection and death during the summer and
autumn respectively.
Data
Data utilized for the conclusions in this
study are available on the Indiana University – Purdue University at
Indianapolis Data Repository. https://doi.org/10.7912/D2/23
(Johnson & Ravi, 2021). These data are in CSV format and readily
importable into the R statistical package or other
platforms.
Index Terms and Keywords
Index Terms: 0240 Public health
0230 Impacts of climate change: human health
0299 General or miscellaneous
Keywords: Spatial epidemiology
Social vulnerability
COVID-19 pandemic
Bayesian spatiotemporal disease mapping
Environmental determinants of COVID-19
Remote sensing and COVID-19
Introduction
The Coronavirus Disease 2019 (COVID-19, ICD-10-CM, U07.1, 2019-nCoV
acute respiratory disease) pandemic is currently affecting much of the
world. As of January 30, 2021, 11 months (325 days) into the pandemic
and one year since the WHO declared COVID-19 a Public Health Emergency
of International Concern (PHEIC), there are over 100 million confirmed
cases of the disease and over 2 million deaths within 223 countries,
areas, or territories (WHO 2021). In the United States, as of the same
time, there are over 25 million confirmed cases and close to 500,000
deaths; 25.25% of cases and 19.62% of deaths worldwide (U.S. CDC
2021). The U.S. only accounts for 4.23% of the global population, so it
is disproportionately affected (U.S. Census Bureau , 2021)
Pandemics, as well as other natural and man-made hazards,
disproportionately impact socially vulnerable individuals and
communities (Freitas & Cidade, 2020; Gaynor & Wilson, 2020; Seddighi,
2020; Usher et al., 2020). The past decade has witnessed an increasing
trend in research activity focusing on social and environmental
vulnerability as it relates to geophysical and man-made hazards. More
recently, there has been vigorous interest in social vulnerability as it
relates to the ongoing COVID-19 pandemic (Bilal et al., 2020; Coelho et
al., 2020; Dasgupta et al., 2020; Gaynor & Wilson, 2020; Khazanchi et
al., 2020; Kim & Bostwick, 2020; Lancet, 2020; Mishra et al., 2020;
Mohanty, 2020; Neelon et al., 2020; Snyder & Parks, 2020).
Additionally, researchers have attempted to construct COVID-19-specific
vulnerability indices, examine spatial relationships or integrate both
social and environmental determinants into a complete model,
illustrating areas more prone to adverse impacts (Khazanchi et al. 2020;
Snyder and Parks 2020; Karaye and Horney 2020).
However, there is a paucity of studies focusing on the spatiotemporal
nature of the pandemic and the relationships between social and
environmental determinants of COVID-19 vulnerability. This study focuses
on a spatiotemporal analysis of the COVID-19 pandemic in the
conterminous United States during the year 2020. This investigation not
only adds to the growing literature on vulnerability and COVID-19, it
also illuminates some of the spatial and temporal underpinnings of the
pandemic in the U.S. In order to achieve this, the presented research
has three specific aims:
- Highlight the spatiotemporal associations between social
vulnerability, environmental measurements and both cases of and deaths
from COVID-19 aggregated by U.S. counties.
- Model the spatial-temporal dimensions of the pandemic and determine if
socially vulnerable counties are more or less impacted at certain
times of the year.
- Create two complementary parsimonious spatiotemporal models - one (1)
for COVID-19 cases, and one (1) for COVID-19 deaths - that take into
account social vulnerability, environmental measurements, and spatial
and temporal random effects.
Background
COVID-19 as a U.S. Health
Disparity
The disproportionate impact of COVID-19 on Black, Indigenous and People
of Color (BIPOC) in the U.S. is ongoing as of early 2021 (LaVeist 2005;
Shi and Stevens 2021). A number of disparities associated with nonwhite
status have been examined in the literature. This list includes among
others cardiovascular disease, chronic respiratory conditions,
hepatitis, and cancer (LaVeist 2005). The fact that COVID-19 is
disproportionately represented in U.S. communities of color is not
surprising (Singu et al. 2020). The reason(s) for the disparity in
representation of COVID-19 cases and deaths within the U.S. population
is likely multifaceted encompassing a variety of cultural, social,
environmental and economic contributors. While Persad et al. have noted
that “racial identity is not an inherent risk factor”, “COVID-19
disparities reflect the health, environmental, and occupational effects
of structural racism” (Persad et al., 2020). The popular press and
media in the U.S. has highlighted the underfunding of preventative
public health infrastructure, an inefficient health-care system,
inadequate governmental response, and systemically racist policies that
have exacerbated the pandemic’s effects. However, one highly probable
contributor is the number and extent of socially vulnerable communities
within the U.S. that demonstrate reduced resiliency in the face of a
hazard (Shi and Stevens 2021; Karaye and Horney 2020; Khazanchi et al.
2020).
Social Vulnerability and
COVID-19
Social vulnerability (SV) as a concept refers to a society or
communities’ impaired ability to respond to an external stressor. These
external pressures can be a single incident or the compounding
consequences of multiple events leading to deleterious effects on the
society or community. Studies highlighting the negative impact COVID-19
has on those considered socially vulnerable have grown exponentially
since the beginning of the pandemic. Here we highlight research that
focuses on social vulnerability as a covariate in geographic ecological
regression studies. Many of these efforts come to similar conclusions;
albeit with different variables being more or less related to COVID-19’s
effects. The studies highlighted utilize the U.S. Centers for Disease
Control and Prevention’s (U.S. CDC) Social Vulnerability Index (SVI),
which we apply in this study (CDC’s Social Vulnerability Index
(SVI) , 2021; Flanagan et al., 2011).
Khazanchi et al., using a quasi-Poisson regression approach, discovered
that counties considered vulnerable had a 1.63-fold greater risk for
COVID-19 diagnosis and a 1.73-fold greater risk for COVID-19-related
death (Khazanchi et al., 2020). When considering only the language and
non-white status domain of vulnerability they found a 4.94 and 4.74-fold
increase in diagnosis and death respectively. Examining counties broken
into the most vulnerable by socioeconomic status, housing and
transportation deficiencies resulted in a higher relative risk (i.e.
were at a greater risk of COVID-19 infection and death). Further effort
by Nayak et al., examining 433 U.S. counties (counties with
>= 50 COVID-19 cases as of April 4, 2020), using a
generalized linear mixed-effect model, found that higher SV was
associated with an increased COVID-19 case fatality rate (CFR) (Aditi
Nayak et al. 2020). The relative risk further increased after adjusting
for age 65 and over. However, the relationship between the overall SVI
score and COVID-19 incidence was not statistically significant. In a
study by Neelon et al (2020) utilizing COVID-19 cases and deaths within
a Bayesian hierarchical negative binomial model between March 1 and
August 31, 2020, counties were classified based on SVI composite
percentiles (Neelon et al., 2020). Cases and deaths were examined daily
for all U.S. counties after adjusting for percentage rural, percentage
poor or in fair health, percentage of adult smokers, county average
daily PM2.5 and primary care physicians per 100,000. By March 30, 2020
relative risk became significantly greater than 1.00 in the most
vulnerable counties. Upper SVI quartile counties had higher death rates
on average beginning on March 30, 2020. By late August the lower
quartiles for SVI began to exhibit increasing levels of cases and
deaths. Dasgupta et al. (2020), examined COVID-19 cases from June 1 –
July 25, 2020 relating them to the CDC’s SVI. Areas with a higher
proportion of individuals with nonwhite status, housing density, and
crowded housing units were more likely to become COVID-19 hot spots;
defined as areas where there is a >60% change in the most
recent 3-7 day COVID-19 incidence rate (Dasgupta et al., 2020). Among
the hot spot counties, those with greater SVI composite scores had
higher COVID-19 incidence rates. Karaye and Horney (2020) examined local
relationships between COVID-19 case counts and SVI utilizing
geographically weighted regression (GWR) (Karaye & Horney, 2020). The
study examined data from January 21 through May 12, 2020 and found,
(after adjusting for population size, population density, number of
persons tested, average daily sunlight, precipitation, air temperature,
heat index, and PM2.5) that non-white status, limited English, household
composition, transportation, housing and disability effectively
predicted case counts in the U.S. Snyder and Parks (2020), in another
spatial analysis (utilizing GWR), which did not utilize the CDC SVI,
found that socio-ecological vulnerability to COVID-19 varied across the
contiguous U.S., with higher levels of vulnerability in the Southeast
and low vulnerability in the Upper Midwest, Great Plains and Mountain
West (Snyder & Parks, 2020).
Environmental Determinants of
COVID-19
Even though we are only 11 months into the pandemic, there is growing
evidence regarding environmental determinants of COVID-19 infection and
mortality. Initially, it was hypothesized that COVID-19 may behave like
many other respiratory infections and cases would subside in the summer
months in the Northern Hemisphere due to increases in temperature
(Hassan et al., 2020; Jamil et al., 2020; Prata et al., 2020).
Therefore, many researchers have concentrated on temperature and less on
other meteorological measurements as a factor in spread of SARS-CoV-2. A
study conducted in Bangladesh, found that high temperature and high
humidity significantly reduce the transmission of COVID-19 when analyzed
using ordinary least squares regression (Haque & Rahman, 2020). Another
analysis utilizing log-linear generalized additive models across 166
countries, revealed that temperature and relative humidity were also
associated with a decrease in COVID-19 cases (Y. Wu et al., 2020). A 1°
C increase in temperature was associated with a 3.08% reduction in
daily new cases and 1.19% reduction in daily deaths across the studied
countries. Relative humidity had a similar effect on cases and deaths.
However, this is not surprising given the calculation of relative
humidity employs a function of temperature. Rouen et al. , utilizing
micro-correlation analysis using a 10-day moving window, found a
negative correlation between temperature and outbreak progression (Rouen
et al., 2020). Their research was conducted across 4 continents in both
hemispheres. Sarkodie and Owusu (2020), using panel estimation
techniques, focused their research on the top 20 countries with COVID-19
cases between January 22 and April 27, 2020 and found that high
temperature and high relative humidity reduced the transmission of
COVID-19 (Sarkodie & Owusu, 2020). However, low temperature, wind
speed, dew/frost point, precipitation and surface pressure increased the
infectivity of the virus.
At a much finer scale, research in China at 31 different provincial
levels revealed a “biphasic” relationship with temperature, using
distributed lag nonlinear models (P. Shi et al., 2020). Epidemic
intensity was slightly reduced on days following higher temperatures and
was associated with a decrease in relative risk. An investigation into
temperature and precipitation’s relationship with COVID-19 in Oslo,
Norway found maximum temperature and average temperature to be
positively associate with COVID-19 transmission and precipitation to
have a negative relationship, using non-parametric correlation estimaton
(Menebo, 2020). Research in the sub-tropical cities of Brazil uncovered
a negative relationship between temperature and COVID-19, using
generalized additive models and polynomial linear regression (Prata et
al., 2020). Bashir et al. (2020) in New York City, New York, USA,
between March and April of 2020, found a significant positive
correlation between average temperature and minimum temperature on total
cases, using Kendall and Spearman rank correlation tests (Bashir et al.,
2020). Average temperature was significantly positively related to
COVID-19 mortality and the minimum temperature was associated with new
cases. Another study, utilizing multi-variate regression focusing on all
U.S. counties from the beginning of the pandemic to April 14, 2020 found
that higher temperatures were associated with a decrease in cases but
not deaths (Li et al., 2020). This sample of studies demonstrates –
particularly at finer spatial and temporal scales and depending on how
temperature is sampled – the relationship between environmental
variables and COVID-19 are complex and variable.
Methods
Study Area and
Timeframe
This study focuses on the counties (sub-state administrative districts)
located in the conterminous United States. We selected this subset due
to Alaska and Hawai’i being non-contiguous and the error and complexity
these “islands” would introduce into the spatial weighting matrices
necessary for the spatiotemporal analysis. Furthermore, we focus on the
timeframe from March 1 to December 31, 2020 (10 full months or 307
days); the first calendar year of the pandemic in the United States.
Data Collection
The data described below in steps 3.2.1 – 3.2.4, was used to create the
dataset for the analysis (Johnson & Ravi, 2021).
COVID-19 Cases and Deaths
COVID-19 cases and deaths were collected from USAFACTS (US
Coronavirus Daily Cases by County , 2021; US Coronavirus Daily
Deaths by County , 2021). These data were retrieved in comma separated
values (csv) format and were grouped into monthly cases and deaths for
all counties of the contiguous U.S. (n = 3106); two counties were
missing data. The expected number of cases and deaths, E , per
county \(\text{area}_{i}\), was determined by calculating the number of
cases and deaths per month and computing the standardized infection rate
(SIR) and standardized mortality rate (SMR) for each area for each
month; the denominator in the rates =Eit ,. This value is later used as
an offset; expected number of cases/deaths at area i duringtime t .
Social Vulnerability
In this analysis we utilize the U.S. CDC’s Social Vulnerability Index
(SVI) (CDC’s Social Vulnerability Index (SVI) , 2021; Flanagan et
al., 2011). The SVI is composed of 18 variables that are related to
social vulnerability and the local socio-ecology at the county or census
tract-level for the entire U.S. We chose the SVI because it is highly
cited in the literature and while there are inherent limitations in all
vulnerability indices it demonstrates greater accuracy and relevancy in
many studies (Bakkensen et al., 2017; Rufat et al., 2019; Spielman et
al., 2020; Tate, 2012). The SVI variables are listed below in Table 1.
<<<Insert Table
1>>>
We utilize the percentage ((variable/total population) * 100) of each
variable (except for per capita income where we used U.S. Dollars $)
for all the selected counties standardized by their respective z-scores.
The SVI also includes 4 themes, based on vulnerability domains, and a
composite score of vulnerability. We utilize the composite SVI score for
comparison of counties after modeling.
Land Surface Temperature
Daily land surface temperature (LST) measurements were collected from
the Moderate Resolution Imaging Spectroradiometer (MODIS) TERRA
satellite system; MOD11a1.006 (Thome, 2020; Wan, Zhengming et
al., 2015). MODIS data has a low spatial resolution (1 km) but a high
(daily) temporal resolution. This remotely sensed data set, an
emissivity corrected land surface temperature image for both daytime and
nighttime, was collected from Google Earth Engine™ using geemap
(Gorelick et al., 2017; Q. Wu, 2020). After collection, the daily values
were averaged per month for each county in the conterminous U.S.
resulting in a monthly average for daytime and nighttime LST. Areas
where cloud cover interfered with image acquisition were assigned a NA
value. These monthly averaged data were standardized by z-score.
Meteorological
Measurements
For additional environmental variables, we utilized the North American
Land Data Assimilation System (NLDAS). NLDAS contains land surface model
datasets available hourly at 1km spatial resolution and is also
accessible in Google Earth Engine™ (Cosgrove et al., 2003; Gorelick et
al., 2017). All but one of the environmental variables listed in Table 2
were averaged by day and then by month for each county and standardized
by their respective z-scores. However, for precipitation, we calculated
a daily sum and then a monthly average before standardizing. After
adjusting for multi-collinearity, the measurements for specific
humidity, longwave radiation, shortwave radiation and potential
evaporation were removed. Specific humidity is the ratio of the mass of
H20(v) per total mass of the air parcel
(kg/kg). This measurement is not a function of temperature and water
content like relative humidity, but we still found a greater than 80%
correlation across all counties throughout the time period with 2m AGL
temperature. The other extraneous environmental variables had
correlation coefficients above .80 with 2m AGL temperature.
Modelling and
Specification
Bayesian Spatial-Temporal
Framework
This study utilizes a Bayesian hierarchical spatiotemporal modeling
approach initialized within the freely available R Statistical platform
and the R-INLA package (R: The R Project for Statistical
Computing , 2021; Rue et al., 2009). Furthermore, all models developed
in this study were executed within Indiana University’s High Performance
Computing (HPC) environment (Research and High Performance
Computing , 2020). Bayesian hierarchical modeling provides a flexible
and robust framework where space-time components can be modeled in a
straightforward manner. There are numerous introductions to Bayesian
disease mapping to which we direct the novice (Best et al., 2005;
Blangiardo et al., 2013; A. B. Lawson, 2013; A. Lawson & Lee, 2017;
Moraga, 2020).
The Bayesian hierarchical methodology offers many benefits. For example,
when creating disease models and relating counts to covariates, it is
unreasonable to assume that one can collect all the necessary variables
that account for a given response. The approach utilized here, allows
for the inclusion of these “unknown” covariates as random effects
within the model (Bernardinelli et al., 1995; Best et al., 2005;
Congdon, 2019). These effects, in the spatial-temporal domain, account
not only for spatial structure (spatial autocorrelation) and noise
(overdispersion), but for temporal correlation and interaction between
space and time (Besag et al., 1991; N. A. Samat & Pei Zhen, 2017; N.A
Samat & Mey, 2017; Ugarte et al., 2014). Also, it is appropriate to
utilize the standardized incidence rate (SIR), number of cases/expected
number of cases, and standardized mortality rate (SMR), number of
deaths/expected number of deaths, for country-level disease modelling.
However, when examining disease measurements at a finer level (i.e.
county-level or smaller), the SIR/SMR, a surrogate for relative risk,
can be unstable and suffer large fluctuations due to some areas
possessing a small population relative to the incidence of disease. The
Bayesian modelling approach utilized “smooths” values of relative risk
through space and time, by “borrowing” information both locally and
globally, thereby reducing the impact of these instabilities
(Bernardinelli et al., 1995; Besag et al., 1991).
In order to model relative risk, observed counts - in this study,
COVID-19 cases of infection and deaths - Yi are
modelled using a Poisson distribution with meanEi θi ; E = expected
counts, θi is the relative risk (RR) of areai. The logarithm of RRi is the sum of an
intercept α and random effects accounting for extra-Poisson variability.
\begin{equation}
Y_{i}\ \sim\ Poisson\left(E_{i}\theta_{i}\right),\ i=1,\ldots,3106\nonumber \\
\end{equation}\begin{equation}
\log\left(\theta_{i}\right)=\alpha+S_{i}+U_{i}\nonumber \\
\end{equation}α is the overall risk in the study area and S and U are
spatial random effects for area i modelling the spatial
dependency structure (S ) and the unstructured uncorrelated noise
(U ). Along with the inclusion of covariates, that determine risk
(i.e. social vulnerability, environmental measurements), and/or other
random effects the overall spatial model can be represented as
\begin{equation}
\log\left(\theta_{i}\right)=\ d_{i}\beta+S_{i}+U_{i}\nonumber \\
\end{equation}di is a vector consisting of the
intercept (α) and β is a coefficient vector; the fixed effects of the
model. The parameters of the 27 fixed covariates included in this study
are each assigned \(\beta_{1:27}\ \sim\ Normal(\mu,\sigma)\) prior
distributions.
A widely cited specification for the random spatial effects S andU , the Besag, York, Mollié (BYM) model, is regularly utilized in
disease mapping studies (Besag et al., 1991). In the BYM model, the
spatial random effect S is assigned a conditional autoregressive
(CAR) distribution; smoothing the associated data based on a specified
neighborhood structure, where neighbors are defined as areas sharing a
common border.
\begin{equation}
S_{i}|S_{-i}\sim N\left({\overset{\overline{}}{S}}_{\delta_{i},\ \ }\ \frac{\sigma_{S}^{2}}{n_{\delta_{i}}}\right)\nonumber \\
\end{equation}\begin{equation}
{\overset{\overline{}}{S}}_{\delta_{i}=\ n_{\delta_{i}}^{-1}\sum_{j\in\delta_{i}}S_{j}}\nonumber \\
\end{equation}\begin{equation}
\delta_{i}=set\ of\ neighbors\nonumber \\
\end{equation}\begin{equation}
n_{\delta_{i}}=number\ of\ neighbors\ of\ area\ i\nonumber \\
\end{equation}The unstructured component U is modeled as independent and
identically distributed (IID) with mean of zero and variance =\(\sigma_{U}^{2}\). Therefore, data is shared both locally through theS component and globally through the U component.
In this study we follow the parameterization of the Besag, York, Mollié
(BYM) model proposed by Simpson at al. (2015) that enables assigning
penalized complexity (PC) priors (Simpson et al., 2015). This so-called
“BYM2” model, incorporates a scaled spatially structured and
unstructured component (S* and U*) and
is defined as:
\begin{equation}
\log\left(\theta_{i}\right)=\ d_{i}\beta+\frac{1}{\sqrt{\tau}}(\sqrt{1-\ \varphi}\ S_{*}+\ \sqrt{\varphi}U_{*})\nonumber \\
\end{equation}The mixing - between S * andU * - parameter φ(\(0\leq\varphi\leq 1\)) measures the proportion of variance
explained by S *. This scales the BYM2 model
making it equal to the spatial model when φ = 1 and equal to only
unstructured spatial noise when φ = 0 (Riebler et al., 2016). We
set priors for these parameters following suggestions by Simpson et al.,
2015. PC priors as their name suggest penalize model complexity. In this
case they penalize based on the degree with which a given model deviates
from a foundational assumption of no spatial dependency (φ = 0).
Conjoining the random spatial effects for each area
(S* + U *) is termed the
convolutional spatial component. The exponential factor for the
convolutional spatial effect \(e^{\left(S_{*}+U_{*}\right)}\)provides one with RR contribution of the random spatial effects
additively. Repeating this procedure for either S* or
U* will provide the relative contribution of each and
allow the determination of the comparative contribution to variance
(spatial fraction). This scaling parameterization makes the BYM2
representation more interpretable between models than the unscaled BYM
model.
The above specification for log(θi) can be extended into
the spatial-temporal domain by the addition of further random effects.
\begin{equation}
\log\left(\theta_{\text{it}}\right)=\ d_{i}\beta+\frac{1}{\sqrt{\tau}}(\sqrt{1-\ \varphi}\ S_{*}+\ \sqrt{\varphi}U_{*})+\ \gamma_{t}+\omega_{t}+\delta_{\text{it}}\nonumber \\
\end{equation}Here, \(\gamma_{t}\ \&\ \omega_{t}\), correspondingly represent the
temporally structured and temporally unstructured random effect.
Typically, \(\gamma_{t}\), is modeled as a conditional autoregressive
random walk of either order one or two (RW1, RW2), but there can be
additional specifications (i.e. seasonal). In the present study, we
model the temporally structured effect as RW1, and specify\(\omega_{t}\) as a Gaussian exchangeable IID
(\(\omega_{t}\ \sim\ Normal(0,\frac{1}{\tau_{\omega}}))\). The
space-time interaction component \(\delta_{\text{it}}\), represents a
parameter vector that varies jointly through space and time. This vector
allows for deviations from the space and time structure that expresses
both dynamic spatial changes from one time frame to another and active
temporal patterns from one area to another (Knorr-Held, 2000).
Therefore, mapping \(\delta_{\text{it}},\) characterizes short-term
clusters of disease activity that deviate from the space-time average
over the study area at time t .
Ecological Regression
The covariates representing SVI and environmental measurements (after
correcting for multi-collinearity) were included in the models as fixed
effects, to examine which measurements are intricately linked to the
spatiotemporal processes of the pandemic in the U.S. This study opted to
include all variables that logically fit into the framework of
vulnerability regardless of their statistical significance provided
there is limited issues with multi-collinearity. In addition to
accounting for the variables, much of this decision is based on the
potentially poor inference generated by utilizing a stepwise framework
and the Deviance Information Criteria (DIC) not significantly decreasing
when the variables were removed (Greenland et al., 2016; Huberty, 1989).
Even though a particular variable might not be statistically significant
it is nonetheless important to see its effect in the model and to
compare between COVID-19 cases and deaths when an alternative approach,
aimed at reducing variables based on their significance, might result in
less comparable models. Furthermore, since the response is logarithmic
we calculate the exponential of the mean of the β coefficients and
subtract from 1.00 to determine each variables effect on relative risk.
Model Selection Criteria
Deviance Information Criteria (DIC), was employed to select the most
parsimonious model (Spiegelhalter et al., 2014). During exploratory data
analysis, we examined two different prior specifications on the
covariates:\(\ \)
\begin{equation}
\beta_{1:27}\sim\ Uniform\left(-\infty,+\infty\right)\nonumber \\
\end{equation}\begin{equation}
\beta_{1:27}\sim Normal\left(\mu,\sigma\right)\nonumber \\
\end{equation}These resulted in minimal changes to the models and we selected the
specification for the covariates which produced the lowest DIC score; in
this case the normal prior which produced a DIC score of~10 less than the uniform specification
(DIC Cases: Normal 243958.04 vs. Uniform 243969.28 / DIC Deaths: Normal
90824.00 vs. Uniform 90835.76); a somewhat significant reduction
(Spiegelhalter et al., 2014). Furthermore, we utilized all 4 types of
spatial-temporal interactions suggested by Knorr-Held (2000) and found
that Type I best fit our data, temporal stratification, and modeled
process. Therefore, in our modeling approach we imposed no restrictions
on when or where a space-time anomaly could occur (Type I
spatial-temporal interaction).
Disease Mapping
Another key benefit of Bayesian inference is the creation of the
posterior distribution where one can generate the probability of
exceeding a certain threshold; the so-called exceedance probability. For
this purpose, we mapped the convolutional spatial effect,\(e^{\left(S_{*}+U_{*}\right)}\), and the spatially structured
effect, \(e^{\left(S_{*}\right)}\), at the county level. Also, we
plotted the modeled space time interaction, \(e^{\delta_{\text{it}}}\),
which represents short-term clusters of activity relative to the
study-area average at time t . In these maps we followed the
classification rule followed by Richardson (2004); areas where\(\Pr\left(\theta_{\text{it}}\geq 1\right)\ is\ \geq\ .8\), is
considered a hot spot,\(\Pr\left(\theta_{\text{it}}\geq 1\right)\ is\ .8\ \geq\ .2\), is
considered areas statistically similar to the national average, and\(\Pr\left(\theta_{\text{it}}\geq 1\right)\ is\ \leq\ .2\),
signifies a cold spot; areas that represent infection/mortality rates
below the national average (Richardson et al., 2004). Counties that are
considered hot spots through the convolutional spatial effect, the
spatially structured effect, and the space time interaction component
are compared to the SVI composite score of the combined average and cold
spot areas; areas where\(\Pr\left(\theta_{\text{it}}\geq 1\right)\ is\ <\ .8\) (non-hot
spot areas). This assessment, utilizing notched box-plots, is employed
to examine differences in the SVI composite score between the affected
areas and in the case of \(e^{\delta_{\text{it}}}\), different times.
Limitations and Caveats
A potential limitation of this study - likely not a significant
constraint - is the lack of greater temporal resolution with regard to
the social vulnerability index. In the case of the CDC’s SVI, the index
is calculated either yearly or every other year. This study focuses on
COVID-19 on a monthly basis and there is no available capability of
measuring changes in social vulnerability at that temporal resolution.
That considered, it is likely that many of the individual variables that
are used to define social vulnerablity do not change dramatically from
one month to the next (Flanagan et al., 2011; Neelon et al., 2020; L.
Shi & Stevens, 2021). Also, the internal limitations present in the SVI
are extend to our analysis (Bakkensen et al., 2017; Rufat et al., 2019;
Tate, 2012)
Another potential limitation is the number of zeros the dataset for
COVID-19 fatalities contains; at least initially. We could have opted
for a Zero-Inflated Poisson model for deaths but decided to keep the
hierarchical specifications and structure as consistent as possible
between COVID-19 cases and deaths. By doing so, we eliminate a level of
complexity within the model and maintain their comparability.
Finally, we decided not to place temporal lags on the environmental
variables in relation to COVID-19 cases and deaths. Estimates for a
latency in exposure to COVID-19 and onset of symptoms ranges from 2-24
days (CDC, 2020, p. 19; Grant et al., 2020). The WHO estimates that
there is a temporal lag of 2-8 weeks from onset of symptoms to death in
the most severe cases (Baud et al., 2020; Woolf et al., 2021). Given
these large ranges of temporal associations and the aggregation of the
data by month, we opted to compare SVI and environmental measurements on
the date where a case or death was reported. Therefore, the coefficients
should be interpreted in the proper context and with this consideration
in mind.
Results
Temporal Trends in COVID-19 Cases and
Deaths
COVID-19 cases by month per 100,000 people is presented in Figure 1A.
Cases steadily increased until October, with an exponential increase
through October until the end of 2020. Deaths from COVID-19, presented
in Figure 1B, increased exponentially between March and April, then
decreased and remained fairly stable through October, with another
exponential increase in November and December.
<<<Insert Figure
1>>>
Spatiotemporal Ecological
Regression
The coefficients resulting from the spatiotemporal ecological regression
model for COVID-19 infections are presented in Table 2. The variables
grayed out are considered not statistically significant since 0 falls
within the 95% credibility interval (Wang et al., 2018). The variables
unemployed, age 65 and up, disabled, crowded living, no vehicle, limited
English, LST nighttime, AGL temperature, and precipitation have a
negative relationship with COVID-19 infection risk. The strongest effect
on cases is from the variable “no high school diploma” with a 20.60%
increase in risk from a one standard deviation (6.34%) increase. AGL
temperature is the second strongest with a 20.70% decrease in risk
resulting from an 8.58k (1 standard deviation) increase. The remaining
effects of each variable are presented in Table 2.
<<<Insert Table
2>>>
Table 3 contains the coefficient results from the posterior
distributions of the ecological regression model for COVID-19
fatalities. The variables with the strongest impact (percent increase)
on risk to COVID-19 deaths are non-white status (+40.19%), no high
school diploma (35.21%), AGL temperature (-26.68%), age 65 and over
(23.93%), and age 17 and under (19.42%) after a standard deviation
increase in each variable respectively. Unemployed, single parent,
mobile home, group quarters, poverty, uninsured, LST daytime were not
statistically significant.
<<<Insert Table
3>>>
Modelled Temporal Trend
Figure 2 exhibits the temporally structured γtand unstructured ωt effects for both cases and
deaths. The left panel shows the modeled structured temporal effect for
cases with all covariates, following the random walk order-1 and IID
specification for unstructured temporal effects. The structured
component shows an increase in relative risk until August with a
decrease for the remainder of the year. The unstructured effect tends to
fluctuate between being slightly above 1.00 to slightly below 1.00; with
its 95% credibility envelope easily encompassing 1.00. In the right
panel for deaths, γt, increases until June, drops
in July, increases until September, and drops until the end of the study
period. Similar to the cases panel, ωt ,
fluctuates between being slightly above 1.00 to slightly below 1.00.
<<<Insert Figure
2>>>
Spatial Effects
The convolutional spatial effect and the spatially structured effect for
COVID-19 cases are mapped in Figure 3A. These figures show the
probability that the relative risk exceeds 1.00; the national average.
There is a strong clustering of high probability for the convolutional
spatial effect in Florida, Alabama, Mississippi, Louisiana, Arkansas,
Tennessee, Iowa, and Arizona. There is sporadic clustering of high
probability most prominent of which is in Indiana, Kansas, and Colorado.
There is significant clustering of low probability areas in the
Northeast, Pacific Northwest, Upper Atlantic Coast, Upper Midwest,
Michigan, and West Virginia. The spatially structured effect for cases
follow a similar pattern to the convolutional effect as it explains
82.7% of the variance in the overall spatial effects. Key differences
include some higher probabilities in Connecticut and Iowa. Probability
is lower in Southern California, Michigan, and New Mexico.
<<<Insert Figure
3>>>
Figure 3B shows exceedance probabilities for the convolutional spatial
effect and the spatially structured effect for deaths. There is a strong
clustering of high probabilities in New Mexico, Indiana, Louisiana,
Eastern Pennsylvania, and the Northeast megalopolis. Higher
probabilities are scattered throughout the Southeast and portions of the
Midwest into Montana, Idaho and Eastern Oregon. The spatially structured
effect accounts for 60.9% of the variance for
(S*+U*) so similarities are expected,
although not as high of a degree as in the spatial effect for cases. The
most notable difference is the increase in clustering in the Southeast,
the increase in Indiana, and less sporadic dispersion of counties in the
Midwest into the Mountain West.
Spatiotemporal
Interaction
Cases
Exceedance probabilities using 1.00 as a threshold for the
spatiotemporal interaction term are presented in Figure 4, for cases,
and Figure 7, for deaths. Initial clustering of high probabilities in
March are noted in the Northeast, especially New York, Florida,
Louisiana and counties containing some of the major metropolitan areas
around the country (i.e., Atlanta, Denver, Detroit, Chicago,
Indianapolis, San Diego, Los Angeles, San Francisco, Portland, and
Seattle). Lower probabilities are less stable than in April and this is
likely due to the average risk being so low at this point in time. In
April, the areas noted previously in March have expanded in what appears
to be a diffusion pattern; in many cases doubling in extent. Lower
probability areas are more prominent and are focused in the Upper
Midwest south through the Great Plains into Texas. There is another
notable area of low probabilities in Ohio south through West Virginia,
west into Kentucky and further south into Tennessee. In May, areas
previously noted at high probability have remained fairly stable, with a
notable decrease in probability in upper New York, the Upper Northeast
and Michigan. There is a crescent of lower probability extending from
Western Pennsylvania, through West Virginia and west to Kansas and
Oklahoma. There is a notable increase in cases in Minnesota and upper
Iowa.
By June there are some significant changes to areas of high probability.
The pandemic seems to have settled much more into the southern U.S.
focusing again in Florida, Alabama, Mississippi, Louisiana, eastern
Texas, South and North Carolina. Probabilities have decreased in the
Northeast and throughout much of the Midwest apart from much of Iowa and
southern Minnesota. High probabilities remain in Arizona and have
expanded into Utah, Nevada, and much of California. July shows a further
solidification of the pandemic in the southern U.S. extending from the
Atlantic to the Pacific coast. Arizona northward into Utah and Idaho has
joined this high probability area. Metropolitan areas in Minnesota,
Wisconsin, Ohio, and Pennsylvania are showing renewed higher
probabilities. The Northeast and the middle Midwest are the lowest
probability areas, with Illinois and Indiana continuing a trend of
decreasing activity. By August, the pandemic is shifting out of the
southern U.S. and into the Midwest with increases from Tennessee into
Minnesota, North Dakota, and South Dakota. The Mountain West is
exhibiting an increase in activity as is much of California. Arizona and
New Mexico are showing a decrease in probability and the Northeast
remains firmly in the lower category.
September, witnesses the pandemic lessening in the southern U.S., but
the increases previously noted in the Upper Midwest have become even
more pronounced, with Missouri, Illinois, Iowa, Wisconsin, Minnesota,
Kansas, Nebraska, and North and South Dakota heavily burdened. The
pandemic continues to lessen in Arizona and California. By October the
pandemic continues to rage in the Upper Midwest affecting much of the
counties in the states from Wisconsin to Idaho. There is a notable
lessening in southern Minnesota and Iowa. The pandemic continues to
subside in the southern U.S. and remains stable in the Northeast.
November, expresses a further strengthening in the upper Midwest with
areas previously showing a lessening pattern overrun by cases. Much of
the U.S. is affected apart from the southern U.S., California, Arizona,
Washington, and the Northeast. Through December, the pandemic has
diminished in the Upper Midwest and shifted with higher probabilities
into the Northeast and Texas and has further reasserted itself on the
Pacific Coast. The upper Midwest and extreme South continue a waning
effect apart from Florida which displays a resurgence.
<<<Insert Figure
4>>>
4.5.2. Deaths
The spatiotemporal interaction term probability exceedances for deaths
are shown in Figure 5. As expected there is not much activity in March
apart from a deaths in some major cities. By April, deaths begin to show
in many of the major metropolitan areas of the U.S. with a clustering of
high probabilities in the New York City area, Chicago, Detroit,
Indianapolis, Atlanta, San Diego, Los Angeles, San Francisco, Portland,
and Seattle. Through May, many of these areas of high probability have
expanded in a similar apparent diffusion pattern to cases a month or so
earlier. Much of the Northeast megalopolis is affecting along with
Detroit, Cleveland, Pittsburg, Chicago, Indianapolis, Nashville,
Birmingham, New Orleans, and counties in New Mexico, Arizona, and
Southern California. In June, many of the counties around these same
cities have become even more heavily burdened, with notable clusters in
the Northeast, Ohio, eastern Michigan, northern Illinois, central
Indiana, central Mississippi, Arizona and southern California. Through
July many of these areas are showing a decrease in deaths apart from the
northeast megalopolis, Chicago, central Indiana, Arizona and southern
California.
Through August, probabilities of high deaths have shifted to the
southern U.S., while lessening in the Northeast and Midwest. The
probabilities have strengthened in Arizona and much of California.
September witnesses a further strengthening of probabilities in the
southern U.S. affecting much of South Carolina, southern Georgia and
much of Florida. The pandemic continues to strengthen in Arizona and
California. The waning trend has continued throughout much of the upper
Northeast and the Ohio Valley. By October, these shifts continue with
sporadic higher probabilities throughout the southern U.S. The lessening
continues in the Ohio Valley and upper Northeast as Arizona begins a
trend of diminishing deaths. November witnesses a shift in the
probability of deaths into the upper Midwest, as suspected considering
cases witnessed a similar trend a few months prior. The pandemic begins
to lessen in the southern U.S.; but contains some sporadic high
probabilities. However, the shift to the north is apparent. Lessening
continues in Arizona and California with the same effect stable in the
upper Northeast. Through December, the shift in to the upper Midwest is
even more evident with the Mountain West now included. December further
witnesses a resurgent trend in the upper Northeast, especially northern
New York, Vermont, New Hampshire and Maine. The continued lessening in
the southern U.S., Arizona and California is noteworthy.
<<<Insert Figure
5>>>
Comparisons of Composite Vulnerability in Hot and Cold
Spots
Figure 6A displays the boxplots comparing hot and cold spots for the
convolutional spatial effect (\(e^{\left(S_{*}+U_{*}\right)}\)) for
COVID-19 infections. The light gray distribution is for areas having a
probability less than .80 of exceeding 1.00 (cold spots) and the red
distribution for areas where the probability is greater than or equal to
.80 of exceeding 1.00 (hot spots). Hot spot areas have a significantly
higher SVI composite score compared to the low probability areas. Figure
6B illustrates the boxplot comparing areas delineated in the same way to
the composite SVI score for COVID-19 fatalities. Likewise, area with a
higher probability of COVID-19 deaths have a statistically significant
higher SVI composite score. The composite SVI score for the spatially
structured effect is similarly higher in areas of higher probability,
which is expected based on the percent of variance explained in the
convolutional effect by that component.
<<<Insert Figure
6>>>
Taken in the spatiotemporal context, the relationship between higher SVI
composite scores and the hot and cold spots are not as straightforward.
Figure 7A and 7B displays boxplots comparing the SVI composite score for
the areas by month that have a high probability of exceeding 1.00;
within the spatiotemporal interaction component\(\left(e^{\delta_{\text{it}}}\right)\). The individual plots are
delineated in the same way as above. Comparing the distributions for SVI
composite scores to cases (9A) from April through August the score is
higher and statistically significant in hot spot counties. Similarly,
for deaths (9B) the SVI score is higher for the counties involved for
the months July – October. The mean SVI score for the hot spots during
these month nears or exceeds the 3rd quartile value in
the cold spot counties. November also witnesses a higher but not
statistically significant SVI score interpreted via the notches in the
boxplots.
<<<Insert Figure
7>>>
Discussion
Relationships Between Social
Vulnerability Variables and COVID-19 Infections and
Deaths
Dividing the SVI index into its separate variables and including them in
the hierarchical model offered some noteworthy results. Counties that
have a high percentage of those without a high school diploma and
non-white status have the strongest positive effect on risk; raising
risk for COVID-19 infections by 21.44% and 20.60% respectively.
Counties with greater percentages of those aged 17 and under are at
greater risk; increasing risk by 11.78% for each standard deviation
increase (3.43%). Living in a multi-unit dwelling increased risk by
11.18% which is expected due to the density of living spaces. However,
living in a home with more than 10 people (crowded living) lowered risk
by 5.13%. Additionally, percent unemployed, disabled, no vehicle,
limited English, and uninsured negated risk to varying degrees (see
Table 2). These findings tend to support other studies that show
COVID-19 having a disparate effect in non-white communities (LaVeist
2005; Shi and Stevens 2021; Singu et al. 2020; Karaye and Horney 2020;
Khazanchi et al. 2020). They especially support Karaye and Horney (2020)
with conclusions related to non-white status, but not as significantly
as their findings related to limited English, household composition,
transportation, housing and disability (Karaye & Horney, 2020). This is
likely due to their study focusing on cases through May 12, 2020, so the
comparison in the amount of data and the timeframe of investigation is
different. Dasgupta et al. (2020) found that non-white status and
crowded housing were more likely to become COVID-19 hot spots during
June and July 2020. Our study supports these findings for non-white
status, multi-unit dwelling and group quarters throughout the year 2020.
The SVI index variables and the relationship with death from COVID-19
possesses some dissimilarities compared to cases. Non-white status had
the strongest impact on risk of death; increasing relative risk by
40.19% when raised by one standard deviation (19.84%). No high school
diploma is again ranked second with a 35.31% increase. Furthermore, age
65 and over demonstrates an increase in relative risk of death by
23.93%; this variable was statistically insignificant in the model for
cases. Age 65 and above is a highly recognized individual risk factor
for COVID-19 severe disease and mortality (Woolf et al., 2021). Age 17
and under is again relatively high in its impact on risk with a 23.93%
increase. Limited English lowered risk by 13.35% (compared to 5.10%
for cases) and Multi-unit dwelling increased risk by 11.29%. The result
for limited English potentially implies that decreased social
connections due to the perceived language barrier may have a
preventative effect. The remaining variables are either not
statistically significant or offer minimal (less than 10%) impact on
deaths. Although there are not as many previous studies that examined
deaths as opposed to cases, these findings especially support Khazanchi
et al. (2020) where non-white status resulted in a 4.74-fold increase in
death.
Relationships Between Environmental
Variables and COVID-19 Infections and
Deaths
Regarding environmental variables in relation to COVID-19 cases, above
ground level (AGL) temperature had the strongest relationship. A one
standard deviation increase (8.58K) in temperature results in a 20.70%
reduction in relative risk for cases when examined within the context of
the entire study. Nighttime LST results in a 11.82% decrease in risk
when increased by a single standard deviation (14.61K). Atmospheric
pressure produces a 12.34% increase in relative risk for cases when
increased by 5465.36 Pa. Precipitation increases lower risk by 4.07%
and wind speed (2.84%) and direction (11.79%) have a positive effect
on risk of infection. These findings support research which points to
increases in temperature lowering the risk of COVID-19 infections (Haque
& Rahman, 2020; Prata et al., 2020; Rouen et al., 2020; Sarkodie &
Owusu, 2020; P. Shi et al., 2020). Our precipitation finding contradicts
Sarkodie and Owusu (2020) where they found a positive association
between temperature and COVID-19 cases. However, it does support Menebo
(2020) where both variables had a negative association. In relationship
to winds, prior research has found a negative association with wind
speed and COVID-19 incidence (Islam et al., 2020; Şahin, 2020). However,
in our study average monthly wind direction, when the azimuth is
increased by 35 degrees, increased risk by 11.79% and average monthly
wind speed, when increased by .69m/s, raises risk by 2.84%. While this
relationship is difficult to explain it is worth noting and we opted to
keep wind data in the analysis to account for its potential effects.
AGL temperature has the strongest effect on risk from death of the
environmental variables with a 26.68% decrease in risk for every 8.58K
increase. LST nighttime also lowers risk by 11.14% when increased by
14.61K. As atmospheric pressure increases by 5465.36 Pa, it raises the
relative risk of death by 14.66%. Precipitation increases by a standard
deviation lower risk by 5.62%. Wind direction is significant in the
model for deaths. As the azimuth of wind increases by 35° the relative
risk lowers by 6.99%. Wind speed is not statistically significant.
Again, these findings are for wind 10m above ground level and support
the notion that wind may have some impact on COVID-19, although it’s
difficult to infer. More research is needed on wind’s relationship,
especially at a finer temporal stratification (i.e. daily) where it
should be easier to infer the relationship (Islam et al., 2020; Şahin,
2020).
Temporal Structure of the Pandemic in the United
States
Cases and deaths have clearly increased throughout the year 2020 as
evidenced by Figure 1. However, the modeled relative risks have
fluctuated throughout the time period (Figure 2). These relative risks
as modeled through random walk order – 1 show rapid increases in cases
and deaths from March through much of the summer of 2020. Decreases are
then evident for the remainder of the year. Temporal relative risk from
death shows more fluctuation than cases but also presents a steady
decline in average relative risk for the last few months of 2020. On the
surface, these results (Figure 1 vs Figure 2) may seem contradictory.
Apart from one chart showing per capita COVID-19 cases/deaths and the
other modeled relative risk, closer examination of the maps of the
spatiotemporal interactions (Figures 4 and 5) show there are more
counties affected in the latter months of 2020. This observation
suggests the pandemic has broadened in spatial extent, especially in
regard to cases during October and November, but has become less intense
overall as measured by relative risk. This finding also implies the
pandemic may be starting to decline in average intensity – in the U.S.
– as we head into 2021. The average non-modelled standardized infection
rate (SIR) and standardized mortality rate (SMR) across all counties,
support this and demonstrate a similar relative risk trend and
corresponding decrease for the latter months of 2020.
Spatial Structure of the Pandemic in the United
States
After adjusting for the fixed effects of the covariates and the
temporally structured and unstructured random effects, the convolutional
spatial effect risk map and the spatially structured effect risk map
identified counties at increased risk of COVID-19 infection and death
throughout the study period. The most prominent spatial aspect of
relative risk to COVID-19 infection were the clusters most heavily
focused in the southern U.S., and the states of Indiana, Iowa, and New
Mexico. Somewhat supported by Snyder and Parks (2020), with their
geographically weighted regression (GWR) approach, this result was not
completely unanticipated (Snyder & Parks, 2020). There were additional
sporadic areas of increased risk scattered throughout the Midwest, U.S.
South and Great Plains. Strong degrees of spatial autocorrelation, which
supports the clustering, as modelled with the spatially structured
effect, were present in many of those same areas.
Convolutional risk from COVID-19 deaths were less focused than cases but
were still prevalent in the U.S. South, especially Louisiana and
Tennessee. Indiana and the megalopolis region of the Northeast were also
part of this cluster. Interestingly, the latter region did not present
as an increased risk area relative to cases throughout the study period.
This is a potentially alarming finding suggesting this region has a
higher rate of death relative to the number of cases throughout the
year. Further west, nearly every county in New Mexico was at an elevated
risk. A large degree of spatial autocorrelation (as the spatially
structured effect accounts for nearly 61% of the variance between the
two components) is also present in many of these identical areas based
on the spatially structured risk from death.
Spatiotemporal Structure of the Pandemic in the United
States
The spatiotemporal interaction term is a random effect and can be
interpreted as the modelled residual risk after accounting for the fixed
effects, spatially structured and unstructured and temporally structured
and unstructured effects. This represents short-term (month long in our
study) sporadic clusters of COVID-19 cases and deaths. During the time
period of the study, it is notable that areas impacted by the pandemic
shift drastically throughout the country. Cases shift from major
metropolitan areas and the Northeast, into the Southern and Southwest
U.S. in the summer months. This trend supports evidence found by Snyder
and Parks (2020) that the pandemic was focusing in the highly vulnerable
U.S. South by mid-summer. By late summer and early autumn, elevated
cases have moved into the Upper Midwest with a second shift into the
Southwest and a re-emergence in the Northeast by the end of 2020.
The short-term patterns in deaths follow a similar trend but are delayed
by approximately a month to month-and-a-half and are not as large in
extent or as contiguous. This implies there is a temporal delay from
cases to deaths that falls within the WHO suggestion of 2-8 weeks;
tending toward the higher end (Baud et al., 2020). Elevated deaths shift
into the Southwest by early summer and into the Southern U.S. by late
summer. Increased probabilities of above average deaths have moved from
the Northeast megalopolis by August as they refocus in the South. By
November, deaths have moved into the Upper Midwest and are beginning to
shift out of the U.S. South (apart from Florida); into less socially
vulnerable areas of the country. December shows the U.S. South to be
below average risk along with the Southwest. A further broadening is
evident in the Upper Midwest and Mountain West.
Composite Social Vulnerability in Hot and Cold
Spots
The convolutional spatial effect for COVID-19 cases was classified into
hot and cold spots based on the criteria of Richardson et al. (2004).
For COVID-19 cases of infection, areas that were considered hot spots
throughout the course of the study period had higher SVI composite
scores (.57) than those that were considered cold spots (.48). In
relation to deaths, there was a similar trend with higher SVI composite
scores for hot spot areas (.55) versus cold spot areas (.49). These
differences were statistically significant at .05 confidence interval.
These relationships provide further evidence to studies finding
locations that have higher vulnerability scores to be more at-risk of
COVID-19 infection and death (Dasgupta et al., 2020; Karaye & Horney,
2020; Khazanchi et al., 2020).
When comparing SVI composite scores to hot and cold spots for cases and
deaths based on the spatiotemporal interaction term, the relationship is
not as straightforward. In respect to cases, our study supports the
findings of Nayak et al., where the relationship between composite SVI
score and COVID-19 cases is not statistically significant in the first
month of the pandemic in the U.S.(Nayak et al. 2020). The SVI composite
score is higher from April to August and again in December in hot spot
counties. This observation supports Neelon et al., where counties with
higher SVI scores contained higher COVID-19 cases through August.
Furthermore, this result verifies Dasgupta et al., where higher SVI
scores in June and July 2020 supported the probability of becoming a
COVID-19 hotspot. Neelon et al. also discovered that in August, counties
with lower composite SVI scores were beginning to be affected. This is
also supported by our study, but the composite index is lower in hot
spot areas extending into the months after August. For deaths, the index
is higher from July through October, with the remaining months either
lower or not statistically significant. In the case of August, September
and October, the SVI composite score is much higher in hot spots than in
cold spots. The mean SVI score in hot spot areas is approaching the
3rd quartile value in cold spot areas.
Much of this observed shift in the relationship between COVID-19 and
vulnerability is due to the spatial-temporal nature of the pandemic in
the U.S. Cases shift into the southern and Southwest U.S. in the summer
months. These areas are known to have significantly larger numbers and
percentages of vulnerable than most other areas of the country (L. Shi
& Stevens, 2021). Likewise, in the instance of deaths in August,
September, and October, they are likewise focused in some of the more
vulnerable locations of the American Southeast and Southwest. These
results support a more complex or nuanced relationship between
temperature and COVID-19 cases and deaths; especially when examined in a
smaller spatial and temporal context (Bashir et al., 2020; P. Shi et
al., 2020). The overall time period relationships are not stable across
all counties for all time periods. By utilizing the spatiotemporal
modelling approach used in this study we are able to uncover this more
elusive space-time relationship. Alternatively, in areas where the SVI
composite score in hot spots is lower than in cold spots, the pandemic
has shifted into less socially vulnerable areas of the country; the
Upper Midwest and Northeast. However, a key finding is that during the
summer months (for cases) and autumn (for deaths) the pandemic seemed to
shift into warmer and more socially vulnerable areas of the country.
Future spatiotemporal analyses, after the behavior of the pandemic in
2021, are needed to determine if this is likely to be a trend the virus
follows or if this is simply how the pandemic initially diffused in the
U.S.
Conclusion
Bayesian hierarchical modeling provides a flexible and robust framework
in which to model complex spatiotemporal systems. This study presents
the findings of fitting a Bayesian hierarchical spatiotemporal model to
COVID-19 cases and deaths in the United States for the year 2020. The
data collection and modeling framework are explained in detail for
straightforward replication. This hopefully will foster continued effort
in modeling the spatiotemporal nature of the pandemic in the U.S. and
abroad.
A key finding of this study is the spatiotemporal character of the
pandemic in the U.S. after accounting for social vulnerability,
environmental measurements and spatial and temporal random effects. In
terms of cases, the pandemic shifted into the U.S. South and Southwest
during the hotter months of the year, which are the most vulnerable
regions of the U.S. Deaths followed the same pattern only with a
1-2-month temporal lag. This demonstrates a potentially alarming trend
if this pattern or a similar one is repeated in other years. These
highly vulnerable areas are already under significant stress at this
time of the year and the introduction of another stressor on a regular
basis could be catastrophic in some areas. Further alarming, is the
cluster of deaths in the Northeast Megalopolis region.
Studies such as the one presented here provide insight into the
complicated mix of social and environmental factors relating to
vulnerability. We demonstrate that relationships between COVID-19
cases/deaths, social vulnerability, and environmental measurements are
spatially and temporally variable. Even though many of the findings
presented here are supportive of other studies, more work is needed in
the spatial-temporal domain of the pandemic. One primary effort should
be modelling the spatiotemporal structure at a finer temporal scale
(i.e. weekly). This could elucidate other relationships with the
examined variables and allow for more elaborate spatial temporal
interaction specifications (Type IV spatial temporal interaction). Such
examinations, perhaps at finer spatial scales (i.e. census tract), could
also allow us to focus on some of this study’s more alarming results.
Acknowledgments
This research was funded by an internal Indiana University grant
sponsored by the Indiana University Vice President of Research, IU
COVID-19 Rapid Research OVPR and the Indiana Space Grant Consortium.
References
Bakkensen, L. A., Fox‐Lent, C., Read, L. K., & Linkov, I. (2017).
Validating Resilience and Vulnerability Indices in the Context of
Natural Disasters. Risk Analysis , 37 (5), 982–1004.
https://doi.org/10.1111/risa.12677
Bashir, M. F., Ma, B., Bilal, Komal, B., Bashir, M. A., Tan, D., &
Bashir, M. (2020). Correlation between climate indicators and COVID-19
pandemic in New York, USA. Science of The Total Environment ,728 , 138835. https://doi.org/10.1016/j.scitotenv.2020.138835
Baud, D., Qi, X., Nielsen-Saines, K., Musso, D., Pomar, L., & Favre, G.
(2020). Real estimates of mortality following COVID-19 infection.The Lancet Infectious Diseases , 20 (7), 773.
https://doi.org/10.1016/S1473-3099(20)30195-X
Bernardinelli, L., Clayton, D., Pascutto, C., Montomoli, C., Ghislandi,
M., & Songini, M. (1995). Bayesian analysis of space—Time variation
in disease risk. Statistics in Medicine , 14 (21–22),
2433–2443. https://doi.org/10.1002/sim.4780142112
Besag, J., York, J., & Mollié, A. (1991). Bayesian image restoration,
with two applications in spatial statistics. Annals of the
Institute of Statistical Mathematics , 43 (1), 1–20.
Best, N., Richardson, S., & Thomson, A. (2005). A comparison of
Bayesian spatial models for disease mapping. Statistical Methods
in Medical Research , 14 (1), 35–59.
Bilal, U., Barber, S., Tabb, L., & Diez-Roux, A. V. (2020). Spatial
Inequities in COVID-19 Testing, Positivity, Incidence and Mortality in 3
US Cities: A Longitudinal Ecological Study. MedRxiv .
Blangiardo, M., Cameletti, M., Baio, G., & Rue, H. (2013). Spatial and
spatio-temporal models with R-INLA. Spatial and Spatio-Temporal
Epidemiology , 7 , 39–55.
https://doi.org/10.1016/j.sste.2013.07.003
CDC. (2020, February 11). Coronavirus Disease 2019 (COVID-19) .
Centers for Disease Control and Prevention.
https://www.cdc.gov/coronavirus/2019-ncov/index.html
CDC’s Social Vulnerability Index (SVI) . (2021, January 19).
https://www.atsdr.cdc.gov/placeandhealth/svi/index.html
Coelho, F. C., Lana, R. M., Cruz, O. G., Villela, D. A., Bastos, L. S.,
Pastore y Piontti, A., Davis, J. T., Vespignani, A., Codeço, C. T., &
Gomes, M. F. (2020). Assessing the spread of COVID-19 in Brazil:
Mobility, morbidity and social vulnerability. PloS One ,15 (9), e0238214.
Congdon, P. (2019). Spatial heterogeneity in Bayesian disease mapping.GeoJournal , 84 (5), 1303–1316.
Cosgrove, B. A., Lohmann, D., Mitchell, K. E., Houser, P. R., Wood, E.
F., Schaake, J. C., Robock, A., Marshall, C., Sheffield, J., Duan, Q.,
Luo, L., Higgins, R. W., Pinker, R. T., Tarpley, J. D., & Meng, J.
(2003). Real‐time and retrospective forcing in the North American Land
Data Assimilation System (NLDAS) project. Journal of Geophysical
Research: Atmospheres , 108 (D22), 2002JD003118.
https://doi.org/10.1029/2002JD003118
Dasgupta, S., Bowen, V. B., Leidner, A., Fletcher, K., Rose, C., Cha,
A., Kang, G., Dirlikov, E., Pevzner, E., & Rose, D. (2020). Association
Between Social Vulnerability and a County’s Risk for Becoming a COVID-19
Hotspot—United States, June 1–July 25, 2020. Morbidity and
Mortality Weekly Report , 69 (42), 1535.
DeCaprio, D., Gartner, J., Burgess, T., Kothari, S., & Sayed, S.
(2020). Building a COVID-19 vulnerability index. ArXiv Preprint
ArXiv:2003.07347 .
Flanagan, B. E., Gregory, E. W., Hallisey, E. J., Heitgerd, J. L., &
Lewis, B. (2011). A Social Vulnerability Index for Disaster Management.Journal of Homeland Security and Emergency Management ,8 (1). https://doi.org/10.2202/1547-7355.1792
Freitas, C. M. de, & Cidade, N. da C. (2020). COVID-19 AS A GLOBAL
DISASTER: Challenges to risk governance and social vulnerability in
Brazil. Ambiente & Sociedade , 23 .
Gaynor, T. S., & Wilson, M. E. (2020). Social Vulnerability and Equity:
The Disproportionate Impact of COVID-19. Public Administration
Review , 80 (5), 832–838.
Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., &
Moore, R. (2017). Google Earth Engine: Planetary-scale geospatial
analysis for everyone. Remote Sensing of Environment , 202 ,
18–27. https://doi.org/10.1016/j.rse.2017.06.031
Grant, M. C., Geoghegan, L., Arbyn, M., Mohammed, Z., McGuinness, L.,
Clarke, E. L., & Wade, R. G. (2020). The prevalence of symptoms in
24,410 adults infected by the novel coronavirus (SARS-CoV-2; COVID-19):
A systematic review and meta-analysis of 148 studies from 9 countries.PLoS ONE , 15 (6).
https://doi.org/10.1371/journal.pone.0234765
Greenland, S., Daniel, R., & Pearce, N. (2016). Outcome modelling
strategies in epidemiology: Traditional methods and basic alternatives.International Journal of Epidemiology , 45 (2), 565–575.
https://doi.org/10.1093/ije/dyw040
Haque, S. E., & Rahman, M. (2020). Association between temperature,
humidity, and COVID-19 outbreaks in Bangladesh. Environmental
Science & Policy , 114 , 253–255.
https://doi.org/10.1016/j.envsci.2020.08.012
Hassan, M. M., Zowalaty, M. E. E., Khan, S. A., Islam, A., Nayem, M. R.
K., & Järhult, J. D. (2020). Role of Environmental Temperature on the
Attack rate and Case fatality rate of Coronavirus Disease 2019
(COVID-19) Pandemic. Infection Ecology & Epidemiology ,10 (1), 1792620. https://doi.org/10.1080/20008686.2020.1792620
Huberty, C. (1989). Problems with Stepwise Methods—Better
Alternatives. Advances in Social Science Methodology , 1 ,
43–70.
Islam, N., Shabnam, S., & Erzurumluoglu, A. M. (2020). Temperature,
humidity, and wind speed are associated with lower Covid-19 incidence.MedRxiv , 2020.03.27.20045658.
https://doi.org/10.1101/2020.03.27.20045658
Jamil, T., Alam, I., Gojobori, T., & Duarte, C. M. (2020). No Evidence
for Temperature-Dependence of the COVID-19 Epidemic. Frontiers in
Public Health , 8 . https://doi.org/10.3389/fpubh.2020.00436
Johnson, D. P., & Ravi, N. (2021). Spatiotemporal Associations
Between Social Vulnerability, Environmental Measurements, and COVID-19
in the Conterminous United States [Data set]. IUPUI University
Library. https://doi.org/10.7912/D2/23
Karaye, I. M., & Horney, J. A. (2020). The impact of social
vulnerability on COVID-19 in the US: An analysis of spatially varying
relationships. American Journal of Preventive Medicine ,59 (3), 317–325.
Khazanchi, R., Beiter, E. R., Gondi, S., Beckman, A. L., Bilinski, A.,
& Ganguli, I. (2020). County-level association of social vulnerability
with COVID-19 cases and deaths in the USA. Journal of General
Internal Medicine , 35 (9), 2784–2787.
Kim, S. J., & Bostwick, W. (2020). Social Vulnerability and Racial
Inequality in COVID-19 Deaths in Chicago. Health Education &
Behavior , 1090198120929677.
Knorr-Held, L. (2000). Bayesian modelling of inseparable space-time
variation in disease risk. Statistics in Medicine ,19 (17–18), 2555–2567.
Lancet, T. (2020). Redefining vulnerability in the era of COVID-19.Lancet (London, England) , 395 (10230), 1089.
LaVeist, T. A. (2005). Minority Populations and Health: An
Introduction to Health Disparities in the United States . John Wiley &
Sons.
Lawson, A. B. (2013). Bayesian disease mapping: Hierarchical
modeling in spatial epidemiology . CRC press.
Lawson, A., & Lee, D. (2017). Bayesian disease mapping for public
health. In Handbook of statistics (Vol. 36, pp. 443–481).
Elsevier.
Li, A. Y., Hannah, T. C., Durbin, J. R., Dreher, N., McAuley, F. M.,
Marayati, N. F., Spiera, Z., Ali, M., Gometz, A., Kostman, J., &
Choudhri, T. F. (2020). Multivariate Analysis of Black Race and
Environmental Temperature on COVID-19 in the US. The American
Journal of the Medical Sciences , 360 (4), 348–356.
https://doi.org/10.1016/j.amjms.2020.06.015
Menebo, M. M. (2020). Temperature and precipitation associate with
Covid-19 new daily cases: A correlation study between weather and
Covid-19 pandemic in Oslo, Norway. Science of The Total
Environment , 737 , 139659.
https://doi.org/10.1016/j.scitotenv.2020.139659
Mishra, S. V., Gayen, A., & Haque, S. M. (2020). COVID-19 and urban
vulnerability in India. Habitat International , 103 ,
102230.
Mohanty, S. K. (2020). Contextualising geographical vulnerability to
COVID-19 in India. The Lancet Global Health , 8 (9),
e1104–e1105.
Moraga, P. (2020). Geospatial Health Data: Modeling and
Visualization with R-INLA and Shiny . CRC Press.
https://www.paulamoraga.com/book-geospatial/
Nayak, A. (2020). Impact of Social Vulnerability on COVID-19
Incidence and Outcomes in the United States | medRxiv .
https://www.medrxiv.org/content/10.1101/2020.04.10.20060962v2
Neelon, B., Mutiso, F., Mueller, N. T., Pearce, J. L., &
Benjamin-Neelon, S. E. (2020). Spatial and temporal trends in social
vulnerability and COVID-19 incidence and death rates in the United
States. MedRxiv .
Persad, G., Peek, M. E., & Emanuel, E. J. (2020). Fairly Prioritizing
Groups for Access to COVID-19 Vaccines. JAMA , 324 (16),
1601. https://doi.org/10.1001/jama.2020.18513
Prata, D. N., Rodrigues, W., & Bermejo, P. H. (2020). Temperature
significantly changes COVID-19 transmission in (sub)tropical cities of
Brazil. Science of The Total Environment , 729 , 138862.
https://doi.org/10.1016/j.scitotenv.2020.138862
R: The R Project for Statistical Computing . (2021).
https://www.r-project.org/
Research and high performance computing . (2020).
https://kb.iu.edu/d/apes
Richardson, S., Thomson, A., Best, N., & Elliott, P. (2004).
Interpreting posterior relative risk estimates in disease-mapping
studies. Environmental Health Perspectives , 112 (9),
1016–1025.
Riebler, A., Sørbye, S. H., Simpson, D., & Rue, H. (2016). An intuitive
Bayesian spatial model for disease mapping that accounts for scaling.Statistical Methods in Medical Research , 25 (4),
1145–1165. https://doi.org/10.1177/0962280216660421
Rouen, A., Adda, J., Roy, O., Rogers, E., & Lévy, P. (2020). COVID-19:
Relationship between atmospheric temperature and daily new cases growth
rate. Epidemiology & Infection , 148 .
https://doi.org/10.1017/S0950268820001831
Rue, H., Martino, S., & Chopin, N. (2009). Approximate Bayesian
inference for latent Gaussian models by using integrated nested Laplace
approximations. Journal of the Royal Statistical Society: Series B
(Statistical Methodology) , 71 (2), 319–392.
https://doi.org/10.1111/j.1467-9868.2008.00700.x
Rufat, S., Tate, E., Emrich, C. T., & Antolini, F. (2019). How Valid
Are Social Vulnerability Models? Annals of the American
Association of Geographers , 109 (4), 1131–1153.
https://doi.org/10.1080/24694452.2018.1535887
Şahin, M. (2020). Impact of weather on COVID-19 pandemic in Turkey.Science of The Total Environment , 728 , 138810.
https://doi.org/10.1016/j.scitotenv.2020.138810
Samat, N. A., & Pei Zhen, W. (2017). Relative Risk Estimation for
Dengue Disease Mapping in Malaysia based on Besag, York and Mollié
Model. PERTANIKA JOURNAL OF SCIENCE AND TECHNOLOGY , 25 (3),
759–765.
Samat, N.A, & Mey, L. W. (2017). Malaria disease mapping in Malaysia
based on Besag-York-Mollie (BYM) Model. Journal of Physics:
Conference Series , 890 (1), 012167.
Sarkodie, S. A., & Owusu, P. A. (2020). Impact of meteorological
factors on COVID-19 pandemic: Evidence from top 20 countries with
confirmed cases. Environmental Research , 191 , 110101.
https://doi.org/10.1016/j.envres.2020.110101
Seddighi, H. (2020). COVID-19 as a natural disaster: Focusing on
exposure and vulnerability for response. Disaster Medicine and
Public Health Preparedness , 1–2.
Shi, L., & Stevens, G. (2021). Vulnerable Populations in the
United States . John Wiley & Sons.
Shi, P., Dong, Y., Yan, H., Zhao, C., Li, X., Liu, W., He, M., Tang, S.,
& Xi, S. (2020). Impact of temperature on the dynamics of the COVID-19
outbreak in China. Science of The Total Environment , 728 ,
138890. https://doi.org/10.1016/j.scitotenv.2020.138890
Simpson, D. P., Rue, H., Martins, T. G., Riebler, A., & Sørbye, S. H.
(2015). Penalising model component complexity: A principled, practical
approach to constructing priors. ArXiv:1403.4630 [Stat] .
http://arxiv.org/abs/1403.4630
Singu, S., Acharya, A., Challagundla, K., & Byrareddy, S. N. (2020).
Impact of Social Determinants of Health on the Emerging COVID-19
Pandemic in the United States. Frontiers in Public Health ,8 . https://doi.org/10.3389/fpubh.2020.00406
Snyder, B. F., & Parks, V. (2020). Spatial variation in
socio-ecological vulnerability to COVID-19 in the contiguous United
States. Health & Place , 66 , 102471.
Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & van der Linde, A.
(2014). The deviance information criterion: 12 years on. Journal
of the Royal Statistical Society. Series B (Statistical Methodology) ,76 (3), 485–493.
Spielman, S. E., Tuccillo, J., Folch, D. C., Schweikert, A., Davies, R.,
Wood, N., & Tate, E. (2020). Evaluating social vulnerability
indicators: Criteria and their application to the Social Vulnerability
Index. Natural Hazards , 100 (1), 417–436.
https://doi.org/10.1007/s11069-019-03820-z
Tate, E. (2012). Social vulnerability indices: A comparative assessment
using uncertainty and sensitivity analysis. Natural Hazards ,63 (2), 325–347. https://doi.org/10.1007/s11069-012-0152-2
Thome, K. (2020). MODIS | Terra . 610 WebDev.
https://terra.nasa.gov/about/terra-instruments/modis
Ugarte, M. D., Adin, A., Goicoa, T., & Militino, A. F. (2014). On
fitting spatio-temporal disease mapping models using approximate
Bayesian inference. Statistical Methods in Medical Research ,23 (6), 507–530. https://doi.org/10.1177/0962280214527528
U.S. Census Bureau . (2021). https://www.census.gov/popclock/
US Coronavirus Daily Cases by County . (2021, March 3).
USAFacts.Org.
https://usafactsstatic.blob.core.windows.net/public/data/covid-19/covid_confirmed_usafacts.csv
US Coronavirus Daily Deaths by County . (2021, March 3).
USAFacts.Org.
https://usafactsstatic.blob.core.windows.net/public/data/covid-19/covid_deaths_usafacts.csv
Usher, K., Bhullar, N., Durkin, J., Gyamfi, N., & Jackson, D. (2020).
Family violence and COVID-19: Increased vulnerability and reduced
options for support. International Journal of Mental Health
Nursing .
Wan, Zhengming, Hook, Simon, & Hulley, Glynn. (2015). MOD11A1
MODIS/Terra Land Surface Temperature/Emissivity Daily L3 Global 1km SIN
Grid V006 [Data set]. NASA EOSDIS Land Processes DAAC.
https://doi.org/10.5067/MODIS/MOD11A1.006
Wang, X., Yue, R. Y., & Faraway, J. J. (2018). Bayesian
Regression Modeling with INLA (1st ed.). CRC press.
https://www.routledge.com/Bayesian-Regression-Modeling-with-INLA/Wang-Ryan-Faraway/p/book/9780367572266
Woolf, S. H., Chapman, D. A., & Lee, J. H. (2021). COVID-19 as the
Leading Cause of Death in the United States. JAMA , 325 (2),
123–124. https://doi.org/10.1001/jama.2020.24865
Wu, Q. (2020). geemap: A Python package for interactive mapping with
Google Earth Engine. Journal of Open Source Software ,5 (51), 2305. https://doi.org/10.21105/joss.02305
Wu, Y., Jing, W., Liu, J., Ma, Q., Yuan, J., Wang, Y., Du, M., & Liu,
M. (2020). Effects of temperature and humidity on the daily new cases
and new deaths of COVID-19 in 166 countries. Science of The Total
Environment , 729 , 139051.
https://doi.org/10.1016/j.scitotenv.2020.139051