AGE Modeling Methods

AGE Rate Models

Inpatient AGE Model

AGE rates for the inpatient setting were modeled using a Poisson distribution for the weighted counts of AGE cases:

\[w_{a,t} \sim \text{Poisson}(\lambda_{a,t} n_{a,t})\]

where \(\lambda\) is the rate, \(n\) is the subgroup population size and \(w\) is the weighted count of AGE patients. Subscripts \(a\) and \(t\) identify age and year subgroups, respectively. To account for imperfect observation of the AGE inpatient population, the observed cases \(y\) were related to the weighted count through a binomial sampling model:

\[y_{a,t} \sim \text{Binomial}(w_{a,t}, \pi_t)\]

where \(\pi_t\) is the weighting factor for year \(t\), the product of two quantities: (1) the monitoring rate \(m\) of the inpatient hospital setting, expressed as the fraction of days per week under surveillance; (2) the VUMC market share \(s_t\) of Davidson County during year \(t\):

\[\pi_t = m \times s_t\]

The market shares were assumed to be 0.94, 0.90 and 0.83 for years 2012, 2013 and 2014, respectively, and the monitoring rate was \(m=1\) (i.e. 7 days out of 7).

Thus, this model accounts for two sources of stochastic uncertainty, one related to the appearance of cases in the sample, via the binomial sampling model, and another related to the disease process itself, via the Poisson count model.
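One consequence of this two-stage structure can be made explicit. Binomial thinning of a Poisson count yields another Poisson count, so marginalizing over the latent weighted count \(w_{a,t}\) gives the distribution of the observed cases directly:

\[y_{a,t} \sim \text{Poisson}(\lambda_{a,t} n_{a,t} \pi_t)\]

This is a standard property of the Poisson distribution, stated here as an aid to interpretation rather than as part of the fitted model: the weighting factor \(\pi_t\) acts as a multiplicative scaling of the expected number of observed cases.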

Outpatient AGE Model

The AGE model for the outpatient setting is similar, except the population from which we are sampling is the total number of outpatients seen, rather than the population served by VUMC. Since monitoring intensity was 4 days per week for outpatients, the outpatient sample is correspondingly weighted by \(m=0.57\) (i.e. 4/7).

Hence, the sample is modeled directly as a Poisson random variable:

\[y_{a,t} \sim \text{Poisson}(\lambda_{a,t} n_{a,t})\]

ED AGE Model

The AGE model for the ED setting is identical to that of the inpatient setting, save for an additional factor in the weighting term. This factor accounts for the expected number of patients enrolled under 8-hour monitoring relative to full 24-hour monitoring. Using available information on ED visits, we estimated that the 8-hour surveillance period captures 55% of the number of patients expected to be enrolled under 24-hour surveillance. This additional scaling factor \(\delta=0.55\) was included in the weighting for the observed sample:

\[\pi_t = m \times s_t \times \delta\]

The ED monitoring rate was \(m=0.57\) and the market shares for the three years were \(s_1=0.60\), \(s_2=0.59\) and \(s_3=0.62\).
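As a quick arithmetic check, the weighting factors for the inpatient and ED settings can be computed from the values quoted above. This is only a sketch: the helper name `weighting_factor` is hypothetical, and the mapping of \(s_1, s_2, s_3\) to the years 2012 to 2014 is an assumption based on the years given for the inpatient setting.

```python
# Weighting factor pi_t = m * s_t * delta, using values quoted in the text.
# `weighting_factor` is a hypothetical helper name; delta defaults to 1 for
# settings without the 8-hour surveillance adjustment.

def weighting_factor(m, s_t, delta=1.0):
    """Probability that a case in the population appears in the sample."""
    return m * s_t * delta

YEARS = (2012, 2013, 2014)  # assumed mapping of s_1, s_2, s_3 to years

# Inpatient: monitored 7 days out of 7 (m = 1).
inpatient = {year: weighting_factor(1.0, s)
             for year, s in zip(YEARS, (0.94, 0.90, 0.83))}

# ED: monitored 4 days out of 7 (m = 0.57), with delta = 0.55.
ed = {year: weighting_factor(0.57, s, delta=0.55)
      for year, s in zip(YEARS, (0.60, 0.59, 0.62))}

for year in YEARS:
    print(year, round(inpatient[year], 3), round(ed[year], 3))
```

The ED factors come out near 0.19, so under these inputs roughly one in five ED cases in the population would be expected to appear in the sample, compared with 83% to 94% for inpatients.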

Virus Rate Models

Rate models for individual viruses extend the model for overall AGE. Primarily, this involves accounting for the stool sampling process, since stools are required for virus detection and not all enrolled patients consent to provide a stool sample. The stool collection probability \(\psi\) was given a beta prior and estimated from the stool collection counts \(x^{(e)}\) and the number of enrolled patients \(n^{(e)}\) in the inpatient setting with a binomial likelihood:

\[\psi \sim \text{Beta}(10, 1)\]

\[x^{(e)} \sim \text{Binomial}(n^{(e)}, \psi)\]
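Taken in isolation, a beta prior with a binomial likelihood has a closed-form posterior, which gives a feel for how strongly the prior pulls toward high collection probabilities. The sketch below uses the Beta(10, 1) prior from the text; the counts are made-up illustrative values, not study data, and the function names are hypothetical. In the full model \(\psi\) also enters the virus-level likelihood, so its actual posterior does not reduce to this form.

```python
# Conjugate Beta-binomial update for the stool collection probability psi,
# considering only the enrollment-level likelihood.

def beta_posterior(alpha, beta, collected, enrolled):
    """Return posterior (alpha, beta) after `collected` stools from `enrolled` patients."""
    if not 0 <= collected <= enrolled:
        raise ValueError("collected must lie between 0 and enrolled")
    return alpha + collected, beta + (enrolled - collected)

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

prior = (10, 1)  # Beta(10, 1), as specified in the text
post = beta_posterior(*prior, collected=80, enrolled=100)  # hypothetical counts

print("prior mean:", round(beta_mean(*prior), 3))
print("posterior parameters:", post)
print("posterior mean:", round(beta_mean(*post), 3))
```

With these illustrative counts the prior mean of about 0.91 is pulled toward the observed collection fraction of 0.80.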

The stool collection probability was then used to estimate the number of enrolled cases of a particular virus in a given setting, by correcting the observed number of cases for stool collection bias. This, too, was estimated using a binomial likelihood:

\[x^{(v)}_{a,t} \sim \text{Binomial}(n^{(v)}_{a,t}, \psi)\]

where \(x^{(v)}_{a,t}\) is the observed number of cases for virus \(v\) and \(n^{(v)}_{a,t}\) is the total number (observed and unobserved) of enrolled cases for subgroup \(a,t\) at VUMC. The unknown true number of cases is bounded below by the number of observed cases and above by the number of enrollments \(n^{(e)}_{a,t}\), so a discrete uniform prior was specified for this random variable:

\[n^{(v)}_{a,t} \sim \text{Uniform}(x^{(v)}_{a,t}, n^{(e)}_{a,t})\]
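To illustrate how the discrete uniform prior combines with the binomial likelihood, the posterior over the true number of enrolled cases can be enumerated exactly when \(\psi\) is held fixed. This is a simplified sketch, not the estimation procedure used for the model (where \(\psi\) is itself uncertain); the function name and the input values are hypothetical.

```python
from math import comb

def posterior_true_cases(x_obs, n_enrolled, psi):
    """Posterior pmf of the true case count n over x_obs..n_enrolled.

    Assumes a discrete uniform prior on that range and
    x_obs ~ Binomial(n, psi) with psi treated as known.
    """
    support = range(x_obs, n_enrolled + 1)
    # The uniform prior is constant, so the posterior is the normalized likelihood.
    weights = [comb(n, x_obs) * psi**x_obs * (1 - psi)**(n - x_obs)
               for n in support]
    total = sum(weights)
    return {n: w / total for n, w in zip(support, weights)}

# Hypothetical inputs: 9 observed cases, 20 enrollments, psi = 0.8.
post = posterior_true_cases(x_obs=9, n_enrolled=20, psi=0.8)
mode = max(post, key=post.get)
mean = sum(n * p for n, p in post.items())
print("posterior mode:", mode)
print("posterior mean:", round(mean, 2))
```

The posterior has no mass below the observed count or above the number of enrollments, which is exactly the bounding described in the text.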

The remainder of the virus rate model is identical to the AGE model: a binomial sampling model was specified for the number of enrolled cases, with the setting-specific weighting factor (as above) as its sampling probability:

\[n^{(v)}_{a,t} \sim \text{Binomial}(w^{(v)}_{a,t}, \pi_t)\]

The total number of cases in the population \(w^{(v)}_{a,t}\) is bounded below by the number of enrolled cases and above by the subgroup population in Davidson County:

\[w^{(v)}_{a,t} \sim \text{Uniform}(n^{(v)}_{a,t}, n_{a,t})\]

The total population count was in turn modeled as a Poisson random variable, governed by the age-, year- and virus-specific rate parameter \(\lambda^{(v)}_{a,t}\):

\[w^{(v)}_{a,t} \sim \text{Poisson}(\lambda^{(v)}_{a,t} n_{a,t})\]
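The chain of likelihoods above can be read as a data-generating process: cases arise in the population, a fraction is enrolled at VUMC, and a fraction of those yields a stool sample. The sketch below simulates that forward chain for a single subgroup using only the Python standard library. It is meant to illustrate the structure, not to reproduce the inference; all numeric inputs and function names are hypothetical.

```python
import math
import random

def poisson_draw(mu, rng):
    """Draw from Poisson(mu) by Knuth's multiplication method (fine for modest mu)."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def binomial_draw(n, p, rng):
    """Draw from Binomial(n, p) as a sum of n Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))

def simulate_virus_counts(rate, population, pi_t, psi, seed=0):
    """Simulate (w, n_v, x_v) for one age-year subgroup and one virus."""
    rng = random.Random(seed)
    w = poisson_draw(rate * population, rng)  # cases in the population
    n_v = binomial_draw(w, pi_t, rng)         # cases enrolled at VUMC
    x_v = binomial_draw(n_v, psi, rng)        # enrolled cases with a stool sample
    return w, n_v, x_v

# Hypothetical inputs: rate of 2 per 1,000, subgroup of 40,000 people,
# an ED-like weighting factor and a stool collection probability of 0.8.
w, n_v, x_v = simulate_virus_counts(rate=0.002, population=40_000,
                                    pi_t=0.188, psi=0.8)
print("population cases:", w, "enrolled:", n_v, "observed:", x_v)
```

Each stage can only shrink the count, so the simulated values always satisfy \(x^{(v)} \le n^{(v)} \le w^{(v)}\), mirroring the lower bounds placed on the two uniform priors.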