Computing this within-class variance for both classes at every possible threshold involves a lot of computation, but there is an easier way.

If you subtract the within-class variance from the total variance of the combined distribution, you get something called the between-class variance:

\[\begin{aligned} \sigma^2_{\text{Between}}(T) &= \sigma^2 - \sigma^2_{\text{Within}}(T) \\ & = n_B(T)[\mu_B(T)-\mu]^2+n_O(T)[\mu_O(T)-\mu]^2\end{aligned}\]

where \(\sigma^2\) is the combined variance and \(\mu\) is the combined mean. Notice that the between-class variance is simply the weighted variance of the class means themselves around the overall mean. Substituting \(\mu = n_B(T)\mu_B(T)+n_O(T)\mu_O(T)\), using the fact that the class weights sum to one, \(n_B(T)+n_O(T)=1\), and simplifying, we get \[\sigma^2_{\text{Between}}(T) = n_B(T)n_O(T)[\mu_B(T)-\mu_O(T)]^2\]
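
The identities above can be checked numerically. The following is a minimal sketch in plain Python, not a reference implementation. It assumes the image is summarised as a histogram `hist` of pixel counts per grey level, that levels below `T` form the background class and levels at or above `T` form the object class, and that \(n_B(T)\) and \(n_O(T)\) are fractions of the total pixel count. The function names `moments`, `variances`, and `best_threshold` and the example histogram are illustrative choices, not taken from the text.

```python
def moments(hist, levels):
    """Pixel count, mean and variance over the given grey levels."""
    count = sum(hist[i] for i in levels)
    if count == 0:
        # Empty class: its weight is zero, so mean and variance are irrelevant.
        return 0, 0.0, 0.0
    mean = sum(i * hist[i] for i in levels) / count
    var = sum(hist[i] * (i - mean) ** 2 for i in levels) / count
    return count, mean, var


def variances(hist, t):
    """Return (within, between_by_definition, between_shortcut) for threshold t.

    Assumption: levels < t are background (B), levels >= t are object (O).
    """
    total, mu, _ = moments(hist, range(len(hist)))
    count_b, mu_b, var_b = moments(hist, range(t))
    count_o, mu_o, var_o = moments(hist, range(t, len(hist)))
    n_b, n_o = count_b / total, count_o / total  # class weights, sum to 1

    within = n_b * var_b + n_o * var_o
    between_def = n_b * (mu_b - mu) ** 2 + n_o * (mu_o - mu) ** 2
    between_short = n_b * n_o * (mu_b - mu_o) ** 2
    return within, between_def, between_short


def best_threshold(hist):
    """Threshold that maximises the between-class variance (shortcut form)."""
    return max(range(1, len(hist)), key=lambda t: variances(hist, t)[2])


# Hypothetical bimodal histogram over grey levels 0..5.
hist = [5, 5, 0, 0, 5, 5]
sigma2 = moments(hist, range(len(hist)))[2]  # combined variance

for t in range(1, len(hist)):
    within, between_def, between_short = variances(hist, t)
    # sigma^2 = within + between, and both between-class forms agree.
    print(t, round(within, 4), round(between_def, 4), round(between_short, 4),
          round(sigma2 - within, 4))

print("best threshold:", best_threshold(hist))
```

For each threshold, the last three printed columns should agree up to floating-point rounding. Since \(\sigma^2\) does not depend on \(T\), maximising the between-class variance is equivalent to minimising the within-class variance, and the shortcut form needs only the class weights and means.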