Authorea

Volker Strobel added section_Map_evaluation_label_sec__.tex almost 8 years ago

Commit id: fae8f32d9f38954a2286356fff7e5444ca59d52a

\section{Map evaluation}
\label{sec:mapeval}

\subsection{Synthetic Data Generation}
\label{sec:syntheticdatageneration}

In the scope of this thesis, an application was created that simulates different camera positions during flight. It generates synthetic image patches based on perspective transformations of a map image. Examples of generated images are displayed in Figure~\ref{fig:montage}. The application allows for comparing and predicting the performance of different maps.

The software is written in \CC{} using OpenCV~3.0.0. It generates a specified number of image patches using random values for the rotation angles, the translational shifts, and the blur, contrast, and brightness intensities. The values are sampled from different probability distributions; see Table~\ref{tab:distributions} for a summary. An additional graphical user interface (GUI) displays the result of the applied transformations and saves the generated images.

To simulate camera movements in 3D space, the image is first projected from 2D to 3D. Then, separate rotation matrices $R_x$, $R_y$, and $R_z$ are built for the axes $x$, $y$, and $z$, so that the rotations can be performed separately. Next, the rotation matrix $R$ is created by multiplying the separate matrices, i.e., $R = R_x R_y R_z$. The 3D translation matrix is multiplied by the transposed rotation matrix. This step is crucial to rotate the \emph{camera model} and not the image itself. Finally, a projection from 3D space back to 2D is applied to obtain the transformed image.
\begin{table}[h!]
\centering
\begin{tabular}{llllll}
\toprule
\multicolumn{6}{c}{Distribution} \\
\multicolumn{3}{c}{Uniform ($\mathcal{U}$)} & \multicolumn{3}{c}{Normal ($\mathcal{N}$)} \\
\cmidrule(r){1-3}\cmidrule(r){4-6}
Parameter & Min & Max & Parameter & Mean & SD \\
\cmidrule(r){1-3}\cmidrule(r){4-6}
Yaw & $0$ & $360$ & Roll & $90$ & $3$ \\
Translation X & $100$ & $500$ & Pitch & $90$ & $4$ \\
Translation Y & $100$ & $500$ & Brightness & $2$ & $0.1$ \\
Height & $100$ & $700$ & & & \\
Blur & $1$ & $10$ & & & \\
Contrast & $2$ & $3$ & & & \\
\bottomrule
\end{tabular}
\caption[Distributions for the different parameters of the synthetic data augmentation tool.]{Distributions used for the different parameters of the synthetic data augmentation tool.}
\label{tab:distributions}
\end{table}

\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.7\columnwidth]{figures/montage/default_figure}
\caption{{\label{fig:montage}
Sixteen example images generated by the synthetic data generation tool. The black regions either lie beyond the borders of the underlying image (i.e., no pixel of the map image maps to them) or could not be identified during the mosaic-making process.%
}}
\end{center}
\end{figure}

\subsection{Evaluation Scheme}
\label{sec:evaluationscheme}

The performance of a method can depend strongly on the environment in which it is used. Evaluating a map is difficult because the histograms obtained during a real flight depend on many factors: motion blur, the distance to the map, and rotations relative to the map. Therefore, we propose an initial evaluation scheme for given maps. This scheme assigns a global fitness value to a given map, proportional to the expected accuracy when the map is used in the physical world. Additionally, it allows inspecting the map to detect the regions that are responsible for the overall fitness value.
In the first step of the map evaluation procedure, $N$ different patches of a given map are generated using the tool \emph{draug} (Section~\ref{sec:draug}). We propose the following loss function $L$ for evaluating a given map $\mathcal{M}$:
\begin{align}
L(\mathcal{M}) &= \sum_{i = 1}^{N} \sum_{j = 1}^{N} \ell(d_a(h_i, h_j), d_e(h_i, h_j))
\end{align}
with
\begin{align}
\ell(x, y) &= x - y\\
d_a(h_i, h_j) &= \text{cosine\_similarity}(h_i, h_j)\\
d_e(h_i, h_j) &= f_X(\mathit{pos}_i) = f_X(x_i, y_i)
\end{align}
Here, $h_i$ is the histogram of patch $i$, $d_a$ is the actual similarity, $d_e$ is the expected similarity, and $f_X$ is the density of a two-dimensional Gaussian with mean $\mu$ and covariance matrix $\Sigma$:
\begin{align}
\mu &= \mathit{pos}_j = (x_j, y_j)\\
\Sigma &= \begin{bmatrix} \rho & 0\\ 0 & \rho \end{bmatrix}
\end{align}

The idea behind the function $L$ is that histograms taken in nearby areas should be similar and that the similarity should decrease the further apart two positions are. This is modeled as a two-dimensional Gaussian with zero covariance between the two axes (Figure~\ref{fig:model}). The variance $\rho$ depends on the desired accuracy: the lower the variance, the more sharply a certain location is singled out, but also the higher the risk that a totally wrong measurement occurs. The following visualizations are based on color histograms (and not texton histograms) for easier visual analysis.

\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.7\columnwidth]{figures/model-crop/default_figure}
\caption{{\label{fig:model}
Ideal histogram similarity for a given position. Histograms taken at positions close to $\mathbf{x} = (400, 300)$ should be similar to this histogram. The further away the position of a certain histogram, the lower the ideal similarity should be.%
}}
\end{center}
\end{figure}

\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.7\columnwidth]{figures/mosaic_enlarged/default_figure}
\caption{{\label{fig:map}
This figure shows a map with a repeating pattern: two yellow rectangles.%
}}
\end{center}
\end{figure}

\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.7\columnwidth]{figures/repeating-crop/default_figure}
\caption{{\label{fig:repeating} Repeating map.
The heatmap colors show the loss per region (smoothed using a Gaussian filter). Mean loss: 67.%
}}
\end{center}
\end{figure}

In future research, the ``bad regions'' could be improved using an optimization approach such as an evolutionary algorithm. The evaluation scheme also allows showing the similarities for a fixed position, as well as the loss for a fixed position, based on the expected similarity and the actual similarity.
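
For reference, the expected similarity $d_e$ from Section~\ref{sec:evaluationscheme} can be written out explicitly. The following is only a sketch: it assumes that $f_X$ is the standard bivariate normal density with mean $(x_j, y_j)$ and covariance matrix $\Sigma = \rho I$, which is what the definitions of $\mu$ and $\Sigma$ suggest but which is not spelled out there. Since $\det \Sigma = \rho^2$ and $\Sigma^{-1} = \rho^{-1} I$, the density reduces to
\begin{align}
d_e(h_i, h_j) &= \frac{1}{2\pi\rho} \exp\left(-\frac{(x_i - x_j)^2 + (y_i - y_j)^2}{2\rho}\right)
\end{align}
Under this assumption, the expected similarity attains its maximum value $1/(2\pi\rho)$ when the two positions coincide and decays with the squared Euclidean distance between them; a smaller $\rho$ gives a narrower and higher peak.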