ROUGH DRAFT authorea.com/100324
Main Data History
Export
Show Index Toggle 0 comments
  •  Quick Edit
  • MODESTA Machine Project Specifications

    Image segmentation is the process of logically dividing the image into multiple parts. The goal is to locate objects of interest and extract them from the rest of the image. More specifically, it assigns logical labels to each of the pixels such that pixels belonging to the same label share similar characteristics. There are many techniques to perform image segmentation some of which are theshold based, color based, and texture based. Your task is to implement an automatic threshold-based image segmentation algorithm called Otsu’s Method.

    Otsu’s Method

    The simplest property that pixels in a region can share is intensity. Therefore, a natural way to segment such regions is to find a threshold that would separate the light and dark regions. This creates binary images from gray-level ones by turning all pixels below some threshold to zero (background pixels) and all pixels above that threshold to one (foreground pixels).

    A simple way to find a suitable threshold is to find each of the modes (local maxima) and then find the valley (minimum) between them. One way to think of it is to consider the values in the two regions as two clusters. Our goal is to make each cluster as tight as possible, thus minimizing the overlap.

    We can’t change the distribution but we can adjust where we separate them (the threshold). As we adjust the threshold one way, we increase the spread of one and decrease the spread of the other. The goal then is to select the threshold that minimizes the combined spread.

    We can define the within-class variance as the weighted sum of the variances of each cluster:

    \[\sigma^2_{\text{Within}}(T) = n_B(T)\sigma^2_B(T)+n_O(T)\sigma^2_O(T)\]

    where

    \[n_B(T) = \sum_{i=0}^{T-1}p(i)\] \[n_O(T) = \sum_{i=T}^{N-1}p(i)\] \[\sigma^2_B(T) = \text{the variance of the pixels in the background (below the threshold)}\] \[\sigma^2_O(T) = \text{the variance of the pixels in the foreground (above the threshold)}\]

    and \([0,N-1]\) is the range of intensity levels.

    Computing this within-class variance for each of the two classes for each possible threshold involves a lot of computation, but there is an easier way.

    If you subtract the within-class variance of the combined distribution, you get something called the between-class variance:

    \[\begin{aligned} \sigma_{\text{Between}}(T) &= \sigma^2 - \sigma^2_{\text{Within}}(T) \\ & = n_B(T)[\mu_B(T)-\mu]^2+n_O(T)[\mu_O(T)-\mu]^2\end{aligned}\]

    where \(\sigma^2\) is the combined variance and \(\mu\) is the combined mean. Notice that the between-class variance is simply the weighted variance of the cluster means themselves around the overall mean. Substituting \(\mu = n_B(T)\mu_B(T)+n_O(T)\mu_O(T)\) and simplifying, we get \[\sigma^2_{\text{Between}}(T) = n_B(T)n_O(T)[\mu_B(T)-\mu_O(T)]^2\]

    So, for each potential Threshold \(T\) we

    1. Separate the pixels into two clusters according to the threshold.

    2. Find the mean of each cluster.

    3. Square the difference between the mea