Daniel Stanley Tan edited Computing_this_within_class_variance__.tex  over 8 years ago

Commit id: 10cb0d5937416c32c1d6e18e8076e4e2f9130012


\begin{align}
\sigma^2_{\text{Between}}(T) &= \sigma^2 - \sigma^2_{\text{Within}}(T) \\
&= n_B(T)[\mu_B(T)-\mu]^2+n_O(T)[\mu_O(T)-\mu]^2
\end{align}
where $\sigma^2$ is the combined variance and $\mu$ is the combined mean. Notice that the between-class variance is simply the weighted variance of the cluster means themselves around the overall mean. Substituting $\mu = n_B(T)\mu_B(T)+n_O(T)\mu_O(T)$ and simplifying, we get
$$\sigma^2_{\text{Between}}(T) = n_B(T)n_O(T)[\mu_B(T)-\mu_O(T)]^2$$
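
The simplification can be checked step by step. The following is a sketch, assuming the weights are class fractions so that $n_B(T)+n_O(T)=1$ (as the expression for $\mu$ as a weighted mean suggests), and dropping the argument $T$ for brevity:
\begin{align*}
\mu_B-\mu &= \mu_B - n_B\mu_B - n_O\mu_O = (1-n_B)\mu_B - n_O\mu_O = n_O(\mu_B-\mu_O) \\
\mu_O-\mu &= \mu_O - n_B\mu_B - n_O\mu_O = (1-n_O)\mu_O - n_B\mu_B = -n_B(\mu_B-\mu_O) \\
n_B[\mu_B-\mu]^2+n_O[\mu_O-\mu]^2 &= n_B n_O^2(\mu_B-\mu_O)^2 + n_O n_B^2(\mu_B-\mu_O)^2 \\
&= n_B n_O (n_O+n_B)(\mu_B-\mu_O)^2 \\
&= n_B n_O(\mu_B-\mu_O)^2
\end{align*}
As a numerical check with hypothetical values $n_B=0.25$, $n_O=0.75$, $\mu_B=2$, $\mu_O=6$: the combined mean is $\mu=5$, the two-term form gives $0.25\cdot 9+0.75\cdot 1=3$, and the simplified form gives $0.25\cdot 0.75\cdot 16=3$.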