2.4. Statistical analysis
We used mixed-effects negative binomial regression models to estimate incidence rate ratios (IRRs) and 95% confidence intervals for each predictor. We modeled case and death outcomes separately for each Phase, and we fit each model both including and excluding cases/deaths identified as institutional (yielding eight models in total). A random effect for town (351 towns in MA) was included to address spatial autocorrelation of residuals among nearby tracts within the same town. The outcome variable was the count of cases or deaths in each census tract, with census tract population included as an offset term so that the models estimated rates relative to population rather than raw counts. Predictors that significantly affected model estimates or that showed changes between the Phases were retained in the models, as were predictors of a priori interest for health disparities or specific COVID-19 risk factors, regardless of statistical significance (e.g., housing unit density and proportion of AIAN residents). All statistical analyses were conducted in R (version 4.0.3) using the “glmmTMB” function from the glmmTMB package (version 1.0.2.9).
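As a sketch, the model structure described above can be written out explicitly. The notation is introduced here for illustration only: $Y_{ij}$ is the count of cases or deaths in census tract $i$ of town $j$, $N_{ij}$ is the tract population, $\mathbf{x}_{ij}$ is the vector of predictors, and $u_j$ is the town-level random intercept. The log link, the entry of the offset as log population, and the normal distribution of the random intercept are assumptions based on the usual form of such models; the dispersion parameterization is not stated in the text and is written generically as $\theta$.

```latex
\begin{align}
Y_{ij} \mid u_j &\sim \operatorname{NegBin}(\mu_{ij}, \theta), \\
\log \mu_{ij} &= \log N_{ij} + \beta_0 + \mathbf{x}_{ij}^{\top} \boldsymbol{\beta} + u_j, \\
u_j &\sim \mathcal{N}(0, \sigma_u^2), \\
\mathrm{IRR}_k &= \exp(\beta_k).
\end{align}
```

Under this form, moving the offset to the left-hand side gives $\log(\mu_{ij} / N_{ij})$, so each $\exp(\beta_k)$ is the multiplicative change in the expected rate per resident for a one-unit increase in predictor $k$, holding the other predictors and the town effect fixed.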
3. RESULTS
Total cases, deaths, and community characteristics differed between Phase 1 and Phase 2 of the COVID-19 pandemic in Massachusetts (Table 1). Phase 1 had substantially fewer cases than Phase 2 (99,051 vs. 407,525), but more deaths (7,285 vs. 6,207). Compared with Phase 1, non-institutional outcomes in Phase 2 accounted for greater shares of total cases (96.6% vs. 80.1%) and deaths (57.0% vs. 37.0%). Geocoding was highly successful at matching individuals to their census tracts of residence, with each outcome group having a match rate of at least 99.7%; in total, 1,360 cases (0.27%) were excluded from the models because they could not be geocoded to a census tract.