IMA 201 multifocus


The goal of this project is to combine several images of the same scene, each taken with a different focus setting, into a single image in which every element of the scene appears at its sharpest. The first part of the project was an implementation of the algorithm described in (Fedorov 2006). The second part used that implementation as a starting point for designing my own algorithm for the multi-focus imaging task.

The original algorithm

The Laplacian pyramid

Reduction and expansion

We first need to define upsampling and downsampling operators for an image \(I\) of size \(m\times n\):

\begin{equation} \mathrm{down}(I)(i,j)=I(2i,2j)\nonumber \\ \end{equation}
\begin{equation} \mathrm{up}(I)(2i,2j)=4I(i,j)\nonumber \\ \end{equation}
\begin{equation} \mathrm{up}(I)(i,j)=0\quad\text{if }i\text{ or }j\text{ is odd}\nonumber \\ \end{equation}
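As an illustration, the two sampling operators can be sketched with NumPy slicing. This is a minimal sketch, not the project's actual code: the optional `shape` argument of `up` is an assumption added here so that images with odd dimensions can be restored to their original size.

```python
import numpy as np

def down(img):
    """Keep every second row and column: down(I)(i, j) = I(2i, 2j)."""
    return img[::2, ::2]

def up(img, shape=None):
    """Place 4 * I(i, j) at position (2i, 2j) and zeros everywhere else.

    `shape` (an assumption of this sketch) is the size of the output grid;
    it defaults to twice the input size in each dimension.
    """
    if shape is None:
        shape = (2 * img.shape[0], 2 * img.shape[1])
    out = np.zeros(shape, dtype=float)
    out[::2, ::2] = 4.0 * img
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
small = down(img)   # 2 x 2 image holding the even-indexed samples
big = up(small)     # 4 x 4 image, non-zero only at even (row, column) pairs
```

The factor 4 compensates for the zeros: only one pixel in four is non-zero after upsampling, so the average intensity is restored once the result is low-pass filtered.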

We can now define the reduction and expansion operators, where \(k\) is the kernel of a low-pass filter:

\begin{equation} \mathrm{reduce}(I)=\mathrm{down}(k\ast I)\nonumber \\ \end{equation} \begin{equation} \mathrm{expand}(I)=k\ast\mathrm{up}(I)\nonumber \\ \end{equation}
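A possible NumPy sketch of these two operators follows. The text does not specify the kernel \(k\), so the separable 5-tap binomial filter used here is an assumption, as are the reflective border handling and the `shape` argument of `expand`. Images are assumed to be at least 3 pixels wide and high.

```python
import numpy as np

# Assumed low-pass kernel (not given in the text): 5-tap binomial, sums to 1.
K = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def blur(img):
    """Separable convolution k * I with reflected borders (same output size)."""
    padded = np.pad(np.asarray(img, dtype=float), 2, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, K, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, K, mode="valid")

def reduce(img):
    """reduce(I) = down(k * I): filter, then keep every second sample."""
    return blur(img)[::2, ::2]

def expand(img, shape=None):
    """expand(I) = k * up(I): insert zeros (with gain 4), then filter."""
    if shape is None:
        shape = (2 * img.shape[0], 2 * img.shape[1])
    upsampled = np.zeros(shape, dtype=float)
    upsampled[::2, ::2] = 4.0 * img
    return blur(upsampled)

flat = np.full((8, 8), 3.0)
coarse = reduce(flat)      # 4 x 4
restored = expand(coarse)  # 8 x 8
```

With this kernel and border mode, a constant image passes through `reduce` and `expand` unchanged, which is a quick sanity check that the gain of 4 in the upsampling step is correct.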

Gaussian pyramid

The Gaussian pyramid \(G\) of an image is a sequence of images \(G_{0},\ldots,G_{N}\) where \(G_{0}\) is the original image and \(G_{l}=\mathrm{reduce}(G_{l-1})\) for \(l\geq 1\). Intuitively, each level of the pyramid eliminates the finest details of the previous level and keeps only the coarse information.

Laplacian pyramid

The Laplacian pyramid \(L_{0},\ldots,L_{N}\) of an image is derived from its Gaussian pyramid \(G\). The top level is defined by \(L_{N}=G_{N}\). The remaining levels are defined by \(L_{l}=G_{l}-\mathrm{expand}(G_{l+1})\) for \(0\leq l<N\). Intuitively, each level keeps only the details that are present in \(G_{l}\) but lost in \(G_{l+1}\), that is, the details whose frequencies first become observable at the corresponding scale.

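By construction, the original image (or any level of the Gaussian pyramid) can be reconstructed from the Laplacian pyramid. Since \(\mathrm{expand}\) is linear, unrolling \(G_{l}=L_{l}+\mathrm{expand}(G_{l+1})\) from the top level down gives, with \(\mathrm{expand}^{l}\) denoting \(l\) successive applications of \(\mathrm{expand}\):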

\begin{equation} I=\sum\limits_{l=0}^{N}\mathrm{expand}^{l}(L_{l})\nonumber \\ \end{equation}
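The two pyramids and the reconstruction property can be checked numerically with the sketch below. The kernel, the border handling and all function names are assumptions of this sketch rather than the project's code. The reconstruction uses the recursive form \(G_{l}=L_{l}+\mathrm{expand}(G_{l+1})\), which is equivalent to the sum above because \(\mathrm{expand}\) is linear.

```python
import numpy as np

K = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # assumed low-pass kernel

def blur(img):
    padded = np.pad(np.asarray(img, dtype=float), 2, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, K, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, K, mode="valid")

def reduce(img):
    return blur(img)[::2, ::2]

def expand(img, shape):
    upsampled = np.zeros(shape, dtype=float)
    upsampled[::2, ::2] = 4.0 * img
    return blur(upsampled)

def gaussian_pyramid(img, levels):
    """Return [G_0, ..., G_N] with G_0 = img and G_l = reduce(G_{l-1})."""
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels):
        pyramid.append(reduce(pyramid[-1]))
    return pyramid

def laplacian_pyramid(img, levels):
    """Return [L_0, ..., L_N] with L_l = G_l - expand(G_{l+1}) and L_N = G_N."""
    g = gaussian_pyramid(img, levels)
    lap = [g[l] - expand(g[l + 1], g[l].shape) for l in range(levels)]
    return lap + [g[-1]]

def reconstruct(lap):
    """Rebuild the image from the top level down: G_l = L_l + expand(G_{l+1})."""
    img = lap[-1]
    for level in reversed(lap[:-1]):
        img = level + expand(img, level.shape)
    return img

rng = np.random.default_rng(0)
image = rng.random((16, 12))
lap = laplacian_pyramid(image, levels=2)
rebuilt = reconstruct(lap)
```

Passing the target shape to `expand` keeps the reconstruction exact (up to floating-point rounding) even when a level has odd dimensions.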

Multi-resolution spline

This technique, described in (Burt 1983), aims at merging two images \(I_{A}\) and \(I_{B}\) seamlessly along a mask \(M\). This is done by using the Laplacian pyramids \(L_{A}\) and \(L_{B}\) of the two images and the Gaussian pyramid \(G_{M}\) of the mask.

Using these elements, we can build a new Laplacian pyramid \(L_{S}\) using the following formulas (\(\odot\) is the element-wise product):

\begin{equation} {L_{S}}_{l}={G_{M}}_{l}\odot{L_{A}}_{l}+\left(1-{G_{M}}_{l}\right)\odot{L_{B}}_{l}\quad\text{for}\quad l=0\ldots N\nonumber \\ \end{equation}

Using the reconstruction property of the Laplacian pyramid, we can build the merged image:

\begin{equation} I_{S}=\sum\limits_{l=0}^{N}\mathrm{expand}^{l}({L_{S}}_{l})\nonumber \\ \end{equation}
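The whole merging procedure can be sketched as follows. As before, this is an illustrative sketch under assumptions: the kernel, the border handling, the number of levels and the name `spline_blend` are not taken from (Burt 1983) or from the project's code. The mask is assumed to take values in \([0,1]\), with 1 selecting \(I_{A}\).

```python
import numpy as np

K = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # assumed low-pass kernel

def blur(img):
    padded = np.pad(np.asarray(img, dtype=float), 2, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, K, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, K, mode="valid")

def expand(img, shape):
    upsampled = np.zeros(shape, dtype=float)
    upsampled[::2, ::2] = 4.0 * img
    return blur(upsampled)

def gaussian_pyramid(img, levels):
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels):
        pyramid.append(blur(pyramid[-1])[::2, ::2])  # reduce
    return pyramid

def laplacian_pyramid(img, levels):
    g = gaussian_pyramid(img, levels)
    return [g[l] - expand(g[l + 1], g[l].shape) for l in range(levels)] + [g[-1]]

def reconstruct(lap):
    img = lap[-1]
    for level in reversed(lap[:-1]):
        img = level + expand(img, level.shape)
    return img

def spline_blend(img_a, img_b, mask, levels=3):
    """Merge img_a and img_b along mask, level by level, then reconstruct."""
    lap_a = laplacian_pyramid(img_a, levels)
    lap_b = laplacian_pyramid(img_b, levels)
    g_m = gaussian_pyramid(mask, levels)
    # L_S at each level: element-wise mix weighted by the blurred mask.
    lap_s = [m * a + (1.0 - m) * b for m, a, b in zip(g_m, lap_a, lap_b)]
    return reconstruct(lap_s)

rng = np.random.default_rng(0)
img_a = rng.random((32, 32))
img_b = rng.random((32, 32))
mask = np.zeros((32, 32))
mask[:, :16] = 1.0  # left half from img_a, right half from img_b
merged = spline_blend(img_a, img_b, mask)
```

Because the mask is smoothed more at coarser levels, low-frequency content is mixed over a wide transition zone while fine details switch over a narrow one, which is what hides the seam.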