\subsection{Approximation of probability distributions}
In other words, the probability $p(x,y)$ of observing $(x,y)$ given a trajectory $T$ is:
\[P_{hist}((x,y),T) = \frac{1}{|T|}\sum_{(x',y')\in T}\mathbb{1}_{(x=x',y=y')}\]
where $|T|$ denotes the number of samples in the trajectory.

\subsubsection{Sparsity problem}
However, with the histogram method the empirical distribution remains too sparse, which leads to inconsistent empirical mutual information values.

To cope with sparsity we used Parzen windowing, as described for instance in \cite{Pluim_2003}. Given a trajectory $T$, the probability $p(x,y)$ of $(x,y)$ is the sum of the contributions of each $(x',y')$ in $T$, where each contribution is the value of a Gaussian kernel centered at $(x',y')$. Hence the following definition of the probability of $(x,y)$ given $T$:
\[P_{PW}((x,y),T) = \frac{1}{|T|}\sum_{(x',y')\in T}K((x,y),(x',y'))\]
where $K$ is a Gaussian kernel. In practice we take a discrete Gaussian kernel filter for $K$, given by the $3\times 3$ matrix:
\[ M_{K} = \left( \begin{array}{ccc}
0.0625 & 0.125 & 0.0625 \\
0.125 & 0.25 & 0.125 \\
0.0625 & 0.125 & 0.0625 \end{array} \right).\]
Whereas the simple histogram method places a spike function (i.e.\ $K = \delta$) at the bin corresponding to $(x,y)$ and updates only that single bin, Parzen windowing places a kernel at the bin of $(x,y)$ and updates all bins falling under the kernel with the corresponding kernel values. As a result, the estimated distributions are smoother and less sparse.

The formulas above are stated for distributions of position, but the marginal and joint distributions of position or acceleration are approximated in the same way.
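For concreteness, the following Python sketch contrasts the two estimators on a toy 2D trajectory. The function names, the bin count, and the use of \texttt{numpy.histogram2d} and \texttt{scipy.signal.convolve2d} are illustrative choices and not the original implementation; since $M_K$ sums to one, accumulating a kernel around each sample's bin is equivalent (up to boundary handling) to convolving the normalized spike histogram with $M_K$.

\begin{verbatim}
import numpy as np
from scipy.signal import convolve2d

# 3x3 discrete Gaussian kernel M_K from the text (entries sum to one).
M_K = np.array([[0.0625, 0.125, 0.0625],
                [0.125,  0.25,  0.125 ],
                [0.0625, 0.125, 0.0625]])

def histogram_estimate(trajectory, bins=32):
    """P_hist: one spike per sample, normalised to sum to one."""
    counts, _, _ = np.histogram2d(trajectory[:, 0], trajectory[:, 1],
                                  bins=bins)
    return counts / len(trajectory)

def parzen_estimate(trajectory, bins=32):
    """P_PW: spread M_K around each sample's bin by convolving the
    spike histogram with M_K."""
    return convolve2d(histogram_estimate(trajectory, bins), M_K,
                      mode='same', boundary='symm')

# Toy trajectory: the Parzen estimate populates far more bins,
# which is the point of the smoothing.
T = np.random.randn(200, 2)
print(np.count_nonzero(histogram_estimate(T)))  # sparse
print(np.count_nonzero(parzen_estimate(T)))     # smoother, less sparse
\end{verbatim}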