AbstractLorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enimad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Miscibility gap detection is a crucial feature in thermodynamic calculation software to accurately calculate the energy of phases containing regions of compositional instability, i.e., spinodals, and is commonly handled through global minimization (GM) of the Gibbs energy.

Inside the spinodal region, all points on the energy surface have negative curvature and will be thermodynamically driven towards demixing.

The cause of miscibility gaps in non-ideal solutions is the presence of unfavorable interactions between components that overwhelm the entropically-driven ideal mixing contribution to the Gibbs energy.

From a computational perspective miscibility gaps pose a challenge because they mean that the same phase may appear on multiple points on the equilibrium tie hyperplane but at different compositions.

In these cases the software must increase the total degrees of freedom by creating multiple composition sets of the same phase, potentially up to the limit specified by the Gibbs phase rule.

For multi-component systems the topology of the energy surfaces can become quite complex, since high-dimensional tangent hyperplanes can interact with the energy surface.

Moreover, when handling phases with internal degrees of freedom, i.e., sublattices, it is possible for points on the global energy surface (the composition) to be close together while being far apart in their internal coordinates (the constitution).

This is referred to as an internal miscibility gap and is the cause of order-disorder phase transitions.

While few authors have discussed fully generalized GM schemes, there has been work in efficient sampling in low dimensions (Emelianenko 2006) and tie hyperplane calculation for the multi-component case (Perevoshchikova 2012).

However, there has not previously been detailed discussion of methods for solving the multi-component case with multiple sublattices.

Reliable convergence to the global energy minimum also requires a good choice of starting points, i.e., phase compositions, for the minimization procedure.

Determinability is a desirable property from a reproducibility perspective because it ensures that, in principle, users with the same version of the software

will always converge to exactly the same solution.

Quasi-random sampling is a technique that attempts to combine the statistical advantages of pseudo-random sampling with a fully deterministic algorithm.

A quasi-random sequence is a deterministic sequence of "random-like" values that can be generated by an algorithm.

Unlike psuedo-random number generators, quasi-random sequences are explicitly designed for

What makes quasi-random sequences superior to pseudo-random numbers for sampling is that one can construct quasi-random sequences which are biased toward "spreading out," i.e., covering more of the domain.

The Halton sequence is a quasi-random sequence of numbers generated over \((0,1)\) using a prime number as a base (Halton 1964).

For \(N \lt 5\), where \(N\) is the number of degrees of freedom, the \(N\)-dimensional Halton sequence is low-discrepancy, meaning it is very similar to the uniform distribution in terms of domain coverage over \((0,1)^N\).

For larger \(N\), a scheme for permuting the coefficients should be used to prevent linear correlations between dimensions (Hess 2003, Chi 2005). Because values in the Halton sequence can be computed out of order, mesh refinement routines can continually draw from the sequence by keeping track of the index of the last computed member.

The Halton sequence alone does not satisfy our purposes for sampling composition

space because it samples the \(N\)-hypercube, implying that all \(N\) dimensions are independent, but each sublattice's degrees of freedom must sum to unity in order to obey the site fraction balance constraint. Instead the \((N-1)\)-simplex must be sampled within each sublattice.

The most straightforward approach is to normalize each generated point so that each sublattice's coordinates sum to unity, but this will tend to oversample the middle of the simplices (Figure \ref{fig:normalized}). Woronow (Woronow 1993) showed that uniformly distributed variates on the \((N-1)\)-simplex follow a \(N\)-dimensional symmetric Dirichlet distribution, which for the case of the full composition range simplifies to \(N\) samples from a standard exponential distribution divided by their sum.

Because the Halton sequence has low discrepancy, it can be substituted for the uniform distribution. Exponentially distributed variates can be generated with the inverse transformation \(-\ln(U)\), where \(U\) is a uniform random or quasi-random variate.

For phases with multiple sublattices, all degrees of freedom are sampled independently and then each sublattice's degrees of freedom grouped and normalized to unity.

This is the approach used in the present work and is summarized as follows.

- Draw the first \(K\) points from a \(N\)-dimensional scrambled Halton sequence using the first \(N\) primes. Store the result in a \(K \times N\) matrix, \(M\).
- Compute \(-\ln(M)\) elementwise and store as the new \(M\).
- Group together columns corresponding to each sublattice. Compute the rowwise sum of each group and divide each element by its corresponding group's sum.

All end-members for each phase are also appended to the result of the sampling algorithm, with the exception of pure-vacancy end-members.

Another approach called rejection sampling, which involves sampling the \(N\)-hypercube and throwing out all points that fall outside the \((N-1)\)-simplex, is not considered in this work.

The problem is that the fraction of rejected points quickly increases with \(N\), making it very inefficient for multi-component systems.