Statistical Analysis
Our basic framework is a cluster-randomized evaluation, with clusters
specified at the level of the hamlet. A key issue for inference from
such a design is to account for non-independence among observations
from the same cluster when estimating the impact of the intervention
\cite{Hayes:2009it}. We focus here on the general plan for analysis
of survey results, including descriptive tables and basic model specifications.
Descriptive tables
We will generate a descriptive table of measures of central tendency
and dispersion for the main outcome variables and the primary covariates
specified above, and provide both cluster-level summaries and individual-level
summaries \cite{Campbell:2004fj}. Continuous variables will be presented
as means and standard deviations, and categorical variables as proportions
and standard deviations. We will also provide a breakdown by our main
stratification variable, block. We will refrain from conducting statistical
tests of balance across treated and control groups \cite{Senn:1994qv}.
Sample attrition and missing data
Given the relatively high initial response rate for our baseline survey
(~90%), relatively short follow-up, and generally
limited migration in the study area, we anticipate little study attrition.
Even for those observed at follow-up it is also possible that some
primary or secondary outcome data could be missing, although we anticipate
relatively little item non-response, based on the baseline survey
results. Regardless, sample attrition and missing data are common
in virtually all surveys and we describe briefly our plan for dealing
with these issues. If the proportion of missing outcome data is relatively
small (e.g., ¡5%) we will proceed with a complete case analysis.
Regardless of the amount of missing data, we will describe the reason
why subjects were lost to follow-up and why outcomes are missing \cite{NAP:2010qc,Groenwold:2014bq}.
In addition to reporting baseline characteristics we will also include
characteristics of those who were lost to follow-up or were missing
outcome or covariate data. For amounts of missing data that may be
more severe we will perform sensitivity analysis to quantify the effect
of missing outcome data on our effects of interest (e.g. multiple
imputation or inverse probability weighting, or estimating bounds
for our treatment effects) \cite{Manski:1990fr,Lee:2009aa,White:2011aa}.
Main model specifications
The primary specification will be an intent-to-treat (ITT) model including
adjustment for stratification by block. For continuous outcomes we
will use linear model of the form:
\begin{equation}
y_{ij}=\alpha+\beta Z_{j}+\sum_{k=1}^{K}\gamma_{k}K_{jk}+\varepsilon_{ij}\\
\end{equation}
where \(y_{ij}\) is the outcome of interest, \(Z_{j}\) is the main treatment
assignment variable (1 for intervention, 0 for control), \(\gamma_{k}\)
are coefficients for each of the \(K\) blocks on which we stratified
randomized treatment assignment, and \(\varepsilon_{ij}\) is an error
term. To account for non-independence of errors among observations
in the same hamlet we will use cluster-robust variance estimators
in all models \cite{Hayes:2009it,Cameron:2015aa}. The coefficient
\(\beta\) and its 95% confidence interval will be our primary estimate
of interest. Equation 1 above represents our unadjusted model, but
in order to increase precision we may include additional baseline
covariates that are strong predictors of the outcome. It is often
the case that one of the strongest predictors of post-treatment outcome
is pre-treatment outcome measured at baseline, and we will include
this as well:
\begin{equation}
y_{ij}=\alpha+\beta Z_{j}+\sum_{k=1}^{K}\gamma_{k}K_{jk}+\sum_{m=1}^{M}\delta_{m}\mathbf{C}_{m}+\varepsilon_{ij}\\
\end{equation}
where now \(\mathbf{C}\) is a vector of pre-treatment covariates, including
a measure of the outcome \(y_{ij}\) at baseline. In particular, including
measures of the outcome at baseline can lead to meaningful increases
in power, especially for outcomes that are weakly correlated over
time \cite{Frison:1992rt,McKenzie:2012yu}. For continuous outcomes
(e.g., empowerment scale) we will use linear regression. For binary
outcomes (e.g., maternal employment) we will use logistic regression
and report marginal effects on the absolute risk scale. For count
outcomes (e.g., reports of symptoms of distress) we will use either
Poisson or negative binomial regression and report marginal estimates
of the expected number of events. For all models we will use cluster-robust
standard errors, clustered at the hamlet level \cite{Cameron:2015aa}.
Heterogeneous treatment effects
Prior literature has found some evidence that access to center-based
child care had stronger impacts on mothers who may have been more
disadvantaged \cite{Barros:2011qr,Angeles:2014bh}. Therefore, we
will assess whether there are any differences in the program impact
according to the following maternal characteristics: 1) Current work
status (whether or not women were working at baseline); 2) Education
(less than primary vs. primary or greater); 3) Household structure
(e.g., whether there is another adult caregiver in house); and 4)
Measures of women’s empowerment at baseline (continuous
latent scale from confirmatory factor analysis). Additionally, we
will test whether the program has any differential impact by our main
geographic stratification variable (block). In order to test whether
the program effects may differ by other individual characteristics,
we will extend the generalized linear model specified with covariates
(including block) above (Equation 2) to allow for heterogeneity in
the impact of the treatment:
\begin{equation}
y_{ij}=\alpha+\beta Z_{j}+\sum_{k=1}^{K}\gamma_{k}K_{jk}+\sum_{m=1}^{M}\delta_{m}\mathbf{C}_{m}+\theta\left(Z_{j}\times C_{ij}\right)+\varepsilon_{ij}\\
\end{equation}
where the coefficient \(\theta\) on the product term between treatment
assignment (\(Z_{j}\)) and a specific covariate \(C\) provides a test
of whether the program’s impact is homogenous across
levels of \(C\). The other terms in Equation 3 are defined analogously
to Equation 2, though the inclusion of the product term affects their
interpretation. Based on prior research we anticipate stronger treatment
effects for women who were not working at baseline, with low education,
without another adult caregiver in the household, and with higher
baseline levels of empowerment.
Multiple comparisons
Our primary analytic goal for this study is to estimate treatment
effects and measures of precision (i.e., confidence intervals) for
the intervention, and we will generally avoid null hypothesis significance
testing (with the exception of tests for heterogeneity as described
above). However, because daycare is a complex intervention that may
plausibly affect a number of outcomes for both mothers and children,
in order maintain an overall Type 1 error rate we will make some adjustments
for testing the impact of the intervention across many outcomes \cite{Schochet:2008db,Anderson:2012nx,Glennerster:qf}.
To test whether the intervention has general impacts on a given domain
with multiple indicators (e.g., empowerment, employment, income/savings,
time use) we will create a summary index for each domain. For related
outcomes within a given domain we will take the approach of creating
standardized effect estimates for each of the subdomains, averaging
them, and calculating a summary estimate that accounts for testing
multiple outcomes \cite{OBrien:1984xe,Kling:2007fv}. The use of an
index is beneficial in this case since probability of a false rejection
does not increase as additional outcomes are added to a summary index.
Moreover, an indexed outcome can provide a statistical test for whether
a program has a “general effect”
on a set of outcomes.
Using an index, however, will not provide us with inference on specific
sub-components within a given domain (e.g., a women’s decision-making
ability regarding her own health care in the empowerment domain).
Within a given domain we will report both unadjusted \(p\)-values for
individual outcomes and “family-wise”
\(p\)-values adjusted to account for the multiple outcomes examined
within a domain \cite{Kling:2007fv,Finkelstein:2012ye}. To avoid
the losses of power associated with simple Bonferroni-type adjustments
we will use the free step-down method of Westfall and Young \cite{Westfall:1993qr}.
Finally, because many of our tests for heterogenous effects are exploratory,
we will use the less conservative strategy of Benjamini and Hochberg
\cite{Benjamini:1995le} account for multiple testing across the standardized
domain outcomes.
Estimating the impact of use of daycare on maternal outcomes
All of the prior analyses will estimate the impact of random assignment
to daycare. However, we are also interested in estimating the impact
of actual receipt of daycare on maternal and child outcomes. For these
analyses the basic identifying assumption is that random assignment
to daycare only affects outcomes through its impact on the actual
take-up and use of daycare. Equation 1 above may be considered as
the ITT or reduced form equation for the impact of treatment assignment,
and we will use two-stage least squares (2SLS) instrumental variables
analysis to estimate the impact of daycare on maternal outcomes. We
will proceed by estimating the first stage equation, which gives the
impact of randomized treatment assignment on daycare:
\begin{equation}
D_{ij}=\alpha+\beta Z_{j}+\sum_{k=1}^{K}\gamma_{k}K_{jk}+\sum_{m=1}^{M}\delta_{m}\mathbf{C}_{m}+\varepsilon_{ij}\\
\end{equation}
where \(D_{ij}\) now represents whether or not mother reports enrolling
her child in the balwadi program and the other parameters are defined
similarly as in Equation 1 above. We use the coefficients from Equation
3 to predict, for each observation, the probability \(\hat{D}_{ij}\)
of enrolling in the balwadi program, which is then used in the second
stage equation:
\begin{equation}
y_{ij}=\zeta+\eta\hat{D}_{ij}+\sum_{k=1}^{K}\theta_{k}K_{jk}+\sum_{m=1}^{M}\lambda_{m}\mathbf{C}_{m}+\varepsilon_{ij}\\
\end{equation}
where \(y_{ij}\) is the outcome of interest, \(\hat{D}_{ij}\) is the
predicted probability of daycare use derived from Equation 4 and the
coefficient. We will use software to correct standard errors for the
fact that \(\hat{D}_{ij}\) is predicted in the first stage \cite{Angrist:1996rm,Angrist:2008vk}.