
Yingyi edited bootstrap.tex over 9 years ago

Commit id: a4bb5815fdd46e43413ad208e4ff97f1cd493b3e


The GMM introduced in Section~\ref{methods-gmm} constrains the best-fit parameters for a given number of modes, but these parameters do not by themselves indicate whether a bimodal distribution describes the data better than a unimodal one. We therefore use the bootstrap method (Efron 1979) to test the bimodality hypothesis. The basic idea is to generate simulated samples from the original data and repeat the estimation on each of them. Comparing the re-estimated results with the original ones then yields statistics such as the probability of a given number of modes and the uncertainties of the parameters. In this project, we employ two kinds of bootstrap: the parametric and the non-parametric bootstrap. We use the parametric bootstrap to estimate the probability that the data are drawn from a unimodal distribution, i.e. the $p$-values of $-2 \ln \lambda$, $D$ and the kurtosis. In this case, each test sample is drawn from the unimodal Gaussian distribution $N(\mu, \sigma^2)$ fitted to the original data by GMM. Keeping the sample size equal to that of the original data, we repeat the bootstrap a large number of times (1000 by default, but 100 when the data size is larger than 500). We then count the number of realizations for which \begin{eqnarray} \label{eq:pboot} {(-2 \ln \lambda)}_{i, \rm bootstrap} &\leq& -2 \ln \lambda, \\ {D}_{i, \rm bootstrap} &\leq& D, \\