Authorea

Lucas Fidon edited subsection_Approximation_of_probability_distribution__.tex almost 8 years ago

Commit id: ec493a23caf73bd1892f84a7f957df7e03b7a4b6


Besides, we consider samples of 3 minutes of the soccer match. These values result from a trade-off between the sparsity and the precision of the model.

\subsubsection{A first approach: histogram}
The easiest way to approximate the probability distribution, i.e. all $p(x,y)$ where $(x,y)$ are the coordinates of a bin of the discretized field (we will come to that later), is to use a histogram. $p(x,y)$ is then the occurrence ratio of $(x,y)$ among the whole set of positions visited by the trajectory $T$. In other words, $p(x,y)$ is given by:
\[P_{hist}((x,y),T) = \frac{1}{|T|}\sum_{(x',y')\in T}\mathbb{1}_{(x=x',y=y')}\]
where $|T|$ denotes the number of positions in $T$.

\subsubsection{Sparsity problem}
However, with the histogram method the empirical distribution remains too sparse, which leads to inconsistent empirical mutual information values. To cope with sparsity, we used Parzen windowing, as described in \cite{Pluim_2003} for instance. Given a trajectory $T$, the probability $p(x,y)$ of $(x,y)$ is the sum of the contributions of each $(x',y')$ in $T$. Each contribution is computed with a Gaussian kernel.
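As an illustration, one possible form of such a Parzen window estimate is sketched here. The notation $P_{parzen}$, the choice of an isotropic Gaussian kernel, its bandwidth $\sigma$ and the normalization constant $Z_T$ are assumptions made for this sketch; they are not taken from \cite{Pluim_2003}:
\[P_{parzen}((x,y),T) = \frac{1}{Z_T}\sum_{(x',y')\in T}\exp\left(-\frac{(x-x')^2+(y-y')^2}{2\sigma^2}\right), \qquad Z_T = \sum_{(u,v)}\sum_{(x',y')\in T}\exp\left(-\frac{(u-x')^2+(v-y')^2}{2\sigma^2}\right)\]
where $(u,v)$ ranges over all the bins of the discretized field, so that the values $P_{parzen}((x,y),T)$ sum to one. Under these assumptions, when $\sigma \to 0$ each kernel concentrates on its own bin and the estimate reduces to the occurrence ratio of the histogram method, whereas a larger $\sigma$ spreads each visited position over the neighbouring bins, which reduces sparsity at the cost of precision.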