Markov chain Monte Carlo:
The Markov chain Monte Carlo (MCMC) method is based on a random walk through
parameter space in which new steps are accepted or rejected according to how
well the proposed model fits our dataset (here, ESPAT profiles). An MCMC
realization includes a burn-in period during which successive iterations move
toward a subdomain of the parameter space close to the most likely solution.
After the burn-in period, the algorithm randomly samples the parameter
distributions around the best solution.
While there is no definitive way to determine the convergence of an MCMC
model, diagnostics that may suggest convergence include trace plots together
with post-burn-in marginal posterior distributions and covariance plots.
Plotting the sampled values of a given parameter against step number yields
distinctive trace or “caterpillar” plots that show the chain converging on the
solution space. The trace plots and the post-burn-in marginal posterior
distributions and covariance plots for each of the DMDs investigated are
provided in Figures S3–S4.
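As an illustration of these diagnostics, post-burn-in summaries can be computed from a stored chain along the following lines. This is a minimal Python sketch under our own assumptions; the function name `summarize_chain` and the 25% burn-in fraction are illustrative choices, not part of the original analysis:

```python
import numpy as np

def summarize_chain(chain, burn_in_frac=0.25):
    """Discard the burn-in portion of a (n_steps, n_params) chain and
    return the post-burn-in marginal means and standard deviations,
    plus the parameter covariance matrix used for covariance plots."""
    chain = np.asarray(chain, dtype=float)
    post = chain[int(burn_in_frac * len(chain)):]  # keep post-burn-in samples
    return post.mean(axis=0), post.std(axis=0), np.cov(post.T)
```

Trace ("caterpillar") plots are then simply each column of the retained samples plotted against step number.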
The misfit between the observed ESPAT, f, and that of the model, p, is
calculated as:
\(L_{k}=\sum(\mathbf{f}-\mathbf{p}_{k})^{2}\) (s6)
where k is the kth step in the Monte Carlo chain. The calculation of p is
based on m′, a perturbation of an initial proposal of model parameters written
in vector form as m (Eqs. (1) and (16) in the main text). The new candidate
vector, m′, is accepted in the chain and saved as m for the next step if the
criterion below is met:
\(\zeta>\log(\varepsilon)\) (s7)
where \(\varepsilon\) is a number drawn at random from a uniform distribution
between 0 and 1 and \(\zeta\) is defined as:
\(\zeta=\gamma\varphi\) (s8)
where \(\gamma\) is an arbitrary pre-factor and \(\varphi\) is defined
as:
\(\varphi=\ L_{k-1}-\ L_{k}\) (s9)
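Taken together, Eqs. (s6)–(s9) describe a Metropolis-style accept/reject rule, which could be sketched in Python as below. Note the sign convention adopted here: \(\varphi\) is implemented as the decrease in misfit, \(L_{k-1}-L_{k}\), so that a candidate that lowers the error is always accepted, matching the acceptance logic described in the text. The function names and the seeded random generator are our own illustrative choices, not the authors' implementation:

```python
import numpy as np

_rng = np.random.default_rng(0)

def misfit(f, p):
    """Sum-of-squares error L_k of Eq. (s6) between the observed
    ESPAT profile f and the model prediction p."""
    f, p = np.asarray(f, dtype=float), np.asarray(p, dtype=float)
    return float(np.sum((f - p) ** 2))

def metropolis_accept(L_prev, L_new, gamma=1.0, rng=_rng):
    """Accept/reject rule of Eqs. (s7)-(s9): a candidate that lowers
    the misfit is always accepted, while a worse candidate is still
    accepted with probability exp(gamma * phi)."""
    phi = L_prev - L_new            # improvement in the misfit (s9)
    zeta = gamma * phi              # (s8)
    eps = rng.uniform(0.0, 1.0)     # (s7)
    return bool(zeta > np.log(eps))
```

Because \(\log(\varepsilon)\) is always negative, any \(\zeta\geq 0\) (error lowered) passes the test automatically, while negative \(\zeta\) passes only with probability \(e^{\zeta}\).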
In this way, if the error is lowered between successive steps, the new
candidate vector is accepted as the next step in the chain. If the error
is not lowered, there is still a probability that the candidate vector
will be accepted in order to escape local extrema. The acceptance rate,
defined as the percentage of candidate vectors m′ accepted over the duration
of the Markov chain, optimizes the MCMC inversion at values near ~20% for
higher-dimensional problems [Roberts et al., 1997]; this can be achieved by
adjusting the pre-factor \(\gamma\) until a suitable acceptance rate is
reached. A similar method was also used to perform MCMC inversions when
fitting the data of Saal et al. [2008] in order to constrain the evaporation
and cooling rates [Roberts et al., 1997].
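The effect of \(\gamma\) on the acceptance rate can be explored with a toy random-walk chain such as the sketch below. This is an illustrative Python sketch only: the step size, the quadratic test misfit in the usage note, and all names are our own assumptions, not the authors' implementation:

```python
import numpy as np

def run_chain(misfit, m0, n_steps=5000, gamma=1.0, step=0.5, seed=0):
    """Toy random-walk MCMC using the acceptance rule of Eqs. (s7)-(s9).

    Returns the chain and the realized acceptance rate; re-running with
    a larger gamma lowers the acceptance rate, which is how the
    pre-factor can be tuned toward the ~20% target."""
    rng = np.random.default_rng(seed)
    m = np.atleast_1d(np.asarray(m0, dtype=float))
    L = misfit(m)
    chain, accepted = [m.copy()], 0
    for _ in range(n_steps):
        cand = m + step * rng.standard_normal(m.size)  # candidate vector m'
        L_new = misfit(cand)
        # accept when zeta = gamma*(L - L_new) exceeds log(eps), eps ~ U(0,1)
        if gamma * (L - L_new) > np.log(rng.uniform()):
            m, L = cand, L_new
            accepted += 1
        chain.append(m.copy())
    return np.array(chain), accepted / n_steps
```

For example, on a quadratic test misfit such as \(\sum m^{2}\), increasing \(\gamma\) from 0.1 to 10 markedly lowers the fraction of accepted steps.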