
Glaziou edited subsection_Disaggregation_of_incidence_subsubsection__.tex over 8 years ago

Commit id: 2204c642f44ad144a45982af1854fa3cf1236ad9


\subsection{Disaggregation of incidence}
\subsubsection{HIV-positive TB incidence}
In this report, TB incidence is disaggregated by HIV-infection status at country level. The disaggregation by HIV and CD4 status was done using the Spectrum software\cite{Stover2012}. WHO estimates of TB incidence were used as inputs to the Spectrum HIV model. The model was fitted to WHO estimates of TB incidence, and then used to produce estimates of TB incidence among people living with HIV disaggregated by CD4 category\cite{Pretorius2014}. A regression method was used to estimate the relative risk (RR) for TB incidence according to the CD4 categories used by Spectrum for national HIV projections\cite{J2010}. Spectrum data were based on the national projections prepared for the UNAIDS Report on the global AIDS epidemic 2013. The model can also be used to estimate TB mortality among HIV-positive people, the resource requirements associated with recently updated guidance on ART and the impact of ART expansion.

A flexible and relatively simple way of modelling TB incidence (or any time-dependent function) is to represent it as a linear combination of $k$ time-dependent cubic-spline basis functions of order $m$:

\begin{align*}
I(x) = \sum_{i=1}^{k} \beta_i B^m_i(x)
\end{align*}
where $\beta_i$ is the $i$-th spline coefficient and $B^m_i(x)$ is the $i$-th basis function evaluated at time (year) $x$. The order of each basis function is $m$, and cubic splines are used, i.e. $m=3$. The equation states that any time-dependent function, such as incidence, can be represented as a linear combination of cubic-spline basis functions. The values of the cubic-spline coefficients $\beta$ were determined by an optimization routine that minimizes the least squares error between the incidence data ($I_{obs}$) and the estimated incidence curve $I(x)$, together with a smoothness penalty:
\begin{align*}
\hat{\beta} = \arg\min_{\beta} \left[ \sum_{x=1990}^{2012} | I(x) - I_{obs}(x) |^2 + \lambda \beta^T S \beta \right]
\end{align*}
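
As a sketch of how this minimization can be carried out, the objective can be written in matrix form. The matrix notation is an assumption introduced here for illustration and is not taken from the Spectrum documentation: let $B$ be the matrix with entries $B_{xi} = B^m_i(x)$ for the years $x = 1990, \dots, 2012$, so that the fitted curve is $I = B\beta$, and assume that $S$ is symmetric. Because the objective is then quadratic in $\beta$, its minimizer for a fixed $\lambda$ solves a linear system:
\begin{align*}
\| I_{obs} - B\beta \|^2 + \lambda \beta^T S \beta
\quad \text{is minimized when} \quad
(B^T B + \lambda S)\, \hat{\beta} = B^T I_{obs}.
\end{align*}
For $\lambda = 0$ this reduces to the ordinary least-squares normal equations, and larger values of $\lambda$ give more weight to the penalty term $\beta^T S \beta$.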

Here $|I - I_{obs}|^2$ is the sum of squared errors in estimated incidence and $S$ is a difference penalty matrix applied directly to the parameters $\beta$ to control the level of variation between adjacent coefficients of the cubic spline, and thus control (through a choice of $\lambda$) the smoothness of the time-dependent case incidence curve. The penalty matrix $S$ also regularizes the otherwise ill-conditioned inverse problem (more unknown parameters than the data can resolve) by creating smoothness dependencies between adjacent parameters; without it, the fit would tend to overfit the data.

The cubic-spline method was then used to fit an indicator (incidence or notification) to a set of bootstrapped data, obtained by sampling from the normal error distribution resulting from the fit, with zero mean and a standard deviation equal to that of the residuals of the spline regression. This bootstrap method produced a sample of projected cubic-spline curves that inherits the temporal biases and systematic errors of the data. Confidence intervals based on the bootstrapped data, namely the $2.5^{th}$ and $97.5^{th}$ percentiles of the projected cubic splines, were typically narrow in the years where the model had data to utilize, and `spread out' after that, according to a Gaussian process with a linearly increasing variance.

The disaggregation of TB incidence by CD4 category among people living with HIV was based on the idea that an increase in the relative risk for TB incidence is a function of CD4 decline.
Williams et al. captured this idea in a model for the relationship between the RR for TB and CD4 decline.\cite{20974976} They suggested a 42\% (+/- 17\%) increase in RR for TB for each decline of 100 CD4 cells/$\mu$L. The Spectrum-TB model's disaggregation method is based on the Williams et al. model. The model first estimates incidence among people living with HIV, and then calculates the `risk of TB' $F=I^- / P^-$, where $I^-$ is TB incidence among people living with HIV and $P^-$ is the number of people living with HIV who are susceptible to TB. An assumption is made that the risk of TB infection among people living with HIV with CD4 count $>$ 500 cells/$\mu$L is proportional to $F$ (it was assumed that it was higher by a factor of 2.5).\cite{15609223} For each decline of 100 CD4 cells/$\mu$L in the remaining categories (350--499, 250--349, 200--249, 100--199, 50--99 CD4 cells/$\mu$L, and CD4 count less than 50 cells/$\mu$L), the risk of infection is represented as:
\begin{align*}
F(c<500) = F(c>500) \cdot p(1) \cdot p(2)^{dc}
\end{align*}
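
As a rough worked illustration of the Williams et al. figure quoted above, assume that the 42\% increase compounds multiplicatively with each successive decline of 100 CD4 cells/$\mu$L, and read the +/- 17\% as percentage points (both are assumptions made for this example only). A person whose CD4 count has fallen from 500 to 200 cells/$\mu$L has declined by three such units, giving
\begin{align*}
RR = 1.42^{3} \approx 2.86 .
\end{align*}
Applying the same calculation at the ends of the quoted range (increases of 25\% and 59\% per unit) gives $1.25^{3} \approx 1.95$ and $1.59^{3} \approx 4.02$, which shows how quickly the uncertainty in the per-unit increase widens with larger CD4 declines.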

The RR model approach to estimation of TB incidence was also used for people on ART. Although an estimate of TB incidence among people on ART could be obtained from surveillance data reported to WHO (such that it is arguably not necessary to use the RR model), limitations of the ART data (in particular that some countries appear to report cumulative totals of people on ART) meant that the RR approach needed to be used. A hazard ratio (HR) of 0.35 was assumed for all categories of CD4 count at ART initiation. Suthar et al. have reported HRs of 0.16, 0.35 and 0.43 for those on ART with CD4 count $<$ 200, 200--350 and $>$ 350,\cite{22911011} and these values could in principle be used. However, Spectrum tracks only CD4 at initiation, thus limiting the use of CD4-specific HRs for people on ART. It was further assumed that the HR of 0.35 applies only to patients on ART for more than six months. Spectrum's ART-mortality estimates, derived mostly from ART cohorts in Sub-Saharan Africa, suggest that mortality remains very high in the first six months of ART. Since TB is a leading contributor to mortality among HIV-positive people, it was judged that the HR for patients on ART for 0--6 months is likely to remain high; therefore, a reduction factor due to ART was not applied for this subset of patients.
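
The treatment of ART described above can be summarized compactly. The notation is introduced here for illustration only and is not taken from the Spectrum documentation: $F(c)$ denotes the TB risk for CD4 category $c$ (CD4 count at ART initiation), $\tau$ is the time on ART, and $F_{ART}$ is the resulting risk for a person on ART.
\begin{align*}
F_{ART}(c, \tau) =
\begin{cases}
F(c) & \text{if } \tau \le 6 \text{ months}, \\
0.35 \, F(c) & \text{if } \tau > 6 \text{ months}.
\end{cases}
\end{align*}
Under this reading, the same factor of 0.35 applies whatever the CD4 category at initiation, and no reduction is applied during the first six months.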

Model testing showed that using two replicates of the HIV survey data (i.e. duplicating the survey data) and two replicates of the routine testing data with coverage greater than 90\% was the best approach to disaggregating TB incidence: the fit passed close to the survey or high-coverage routine testing data points that were available. Data from a) HIV sentinel surveillance and b) routine testing with coverage between 50\% and 90\% were not used. As with the main indicators, confidence intervals are based on a bootstrap method. For this method, all data sources are sampled from assumed underlying distributions: total incidence is sampled from the bootstrap result set for total incidence, survey and sentinel HIV data are sampled from beta distributions, and routine HIV testing data among reported cases are sampled from a normal distribution, using the variance of the number who tested HIV-positive in the sample.

For countries with no data, a range for $p(2)$ was estimated from countries with survey or testing data, which suggest that $p(2) = 1.96$ [1.8--2.1]. The RR-model was then fitted to total TB incidence only. There is no satisfactory way to verify results for TB incidence among people living with HIV when no HIV-testing data are available. However, comparison of the global estimate for TB incidence among people living with HIV produced by Spectrum with estimates based on a different method (using HIV prevalence instead of CD4 distributions and using HIV-test data in a different way) suggests that the RR-model works reasonably well.
The alternative comparative method to disaggregate TB incidence by HIV status is derived as follows, where $I$ and $N$ denote incident cases and the total population, respectively, superscripts $+$ and $-$ denote HIV status, $t$ is the prevalence of HIV among new TB cases, $h$ is the prevalence of HIV in the general population and $\rho$ is the incidence rate ratio (HIV-positive over HIV-negative). \begin{equation*} \begin{align*}