Authorea

Volker Strobel edited DeclareMathOperator_argmin_arg_min_DeclareMathOperator__1.tex almost 8 years ago

Commit id: 7d65722b600150478be40270c8424541cccdd082

deletions | additions

\State \Call{Send\_Pos}{$x_t, y_t$} \EndProcedure \end{algorithmic} \end{algorithm}\section{Filtering and Smoothing} \label{sec:filtering} Computer vision-based estimations are usually noisy or ambiguous, and so is the proposed system: beginning with the estimations of the homographies, the ground truth is already based on possibly faulty labelings. Texton histograms obtained during flight will not perfectly match the ones in the training data set: blur, lighting settings, viewing angles and other variables change the shape of the texton histograms. To filter out outliers and smooth the estimations, a popular filter choice is the Kalman filter. However, the Kalman filter is not able to represent multimodal probability distributions. This makes it rather unsuitable for the presented approach: if the $k$NN model outputs two predictions, one would need to use average of these predictions and feed this value to the Kalman Filter. This approach can lead to biased predictions, especially, if the the model outputs belong to distant locations--- to similar texton distributions at these positions. Instead, the more powerful general Bayesian filter could simultaneously keep account of both possible locations and resolve the ambiguity as soon as one location can be favored. In this case, the predictions of the $k$ neighbors are directly fed into the particle filter without averaging them first. The filter is able to smooth the estimations, handle uncertainty, and simultaneously keep track of several competing position estimations. Since the calculation of a full Bayesian filter is computationally intractable, a particle filter which is based on sampling was used as approximation. In general, particle filters estimate the posterior probability of the state given observations. Specifically, this means that finding the position of the UAV can be described as $p(X_t \mid Z_t)$, where $X_t$ is the state vector at time $t$ ($x,y$-position, heading, speed, acceleration) and $Z_t$ are the measurements ($z_1, ..., z_t$, where each $z_i$ represents the $x,y$ output of the proposed algorithm) up to time $t$. The state vector is \emph{hidden}, since the variables cannot be measured directly, therefore the situation can be described using a hidden Markov model. Instead, noisy or ambiguous data can be obtained through the proposed algorithm. The weighted particles are a discrete approximation of the posterior probability function ($pdf$) of the state vector. Particle filters have several advantages. First, one can represent uncertainty by the variance of the state variables of the particles. Second, the particle filter allows for \emph{sensor fusion}, and can integrate IMU data or optical flow into the position estimation. A major disadvantage is the rather high computational complexity. This can be circumvented by reducing the amount of particles (trading off speed and accuracy)---allowing for adapting the computational payload to the used processor. The used particle filter is initialized using 100 particles at random $x, y$-positions. To incorporate the distances, the sensor model $p(z \midx) = \frac{p(x \mid z)p(z)}{p(x)}$ is used, where $\textbf{x} = ((x_1, y_1), (x_2, y_2), \ldots, (x_k, y_k))^T$. A two-dimensional Gaussian model was used for each point. The parameters of the Gaussian have been determined by comparing of the positions based on the motion tracking system with the predictions of the proposed system. This results in values for the variances in $x$ and $y$, the correlation $\rho$ between $x$ and $y$. The mean values $\mu$ were set to zero (no-systematic bias). Figure~\ref{fig:measurementmodel} shows the results of one such evaluation. \begin{figure}[h!] \begin{center} \includegraphics[width=0.7\columnwidth]{figures/measurement_model/default_figure} \caption{{\label{fig:measurementmodel} Measurement model showing the delta x and delta y positions% }} \end{center} \end{figure} This allows to make use of the information in all $k$ neighbors and keep track of a multimodal distribution. While keeping track of a multimodal distribution allows for incorporating several possible states of the system, the problem arises of which mode is the best one. Using a weighted average of the modes would again introduce the problem, that the weighted average falls into a low density region. Therefore, the maximum a posteriori estimate, as described in \cite{driessen2008map} is used. This approach uses the following formula to obtain the MAP estimate: Therefore, the final position estimate is equal to the position of one of the particles. \begin{align} s_k^{MAP} &= \argmax_{s_k}{p(s_k \mid Z_k)}\\ &= \argmax_{s_k}{p(z_k \mid s_k) p(s_k \mid Z_{k-1})} \end{align} Therefore, the MAP estimator in our case is \begin{align} s_k^{MAP} = \argmax_{s_k} \lambda^M(s_k) \end{align} with \begin{align} \lambda^M(s_k) = p(z_k \mid s_k) \sum_{j=1}^Mp(s_k \mid s_{k-1}^j)w^j_{k-1} \end{align} This function is now only evaluated at a finite, chosen number of states, the particles, using \begin{align} \hat{s}_k^{MAP} = \argmax_{s_k \in \{s_k^i \mid i=1,\ldots,N\}} \lambda^M(s_k) \end{align} In this formula, $p(z_k \mid s_k)$ is the likelihood of the particle $s_k$ given the current measurement $z_k$. In our setting, this probability is equal to the weight of the particle $z_k$, therefore $p(z_k \mid s_k) = w^i_k$. The estimation of \emph{uncertainty} is a core part of the proposed approach, being important for safety and accuracy. Therefore, uncertainty was modeled using the spread of the particles. In every time step, the particles of the filter get updated based on the optical flow estimates. These estimates are noisy, as illustrated in Figure~\ref{fig:edgeflow}. Additionally, optical flow estimates aggregate noise over time, since each estimate is dependent on the previous one, leading to drift (\emph{relative position estimates}). In contrast, the machine learning-based method makes independent predictions. While this allow for avoiding accumulating errors, the predictions do not dependent on each other, and might be `jumping' between two points. To combine the advantages of both methods, and leverage out the disadvantages, the particle filter is used. An idea was to include the similarity to the neighbors as confidence value, thus reducing the measurement noise, if a high similarity between current histogram and a training histogram is achieved. However, we found no correlation between these variables. Figure~\ref{fig:cor_sim_measurement} displays the dependence structure.