Mutual Information: definition and properties

Mutual Information is widely used, for instance, for the registration of medical images, as described in \cite{Pluim_2003}. The main idea is to introduce a feature space (or a joint probability distribution) for the two trajectories we want to compare, and to evaluate the quantity of information shared by the two trajectories based on this feature space. This quantity is the Mutual Information. In our case, the feature space will be the distribution of the joint positions of two player trajectories over a few minutes (we will come back to this later); it is therefore a four-dimensional distribution. The Mutual Information of this distribution will be the linchpin of our metric for trajectories.
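As a preview of how this will be used, here is a minimal sketch of such an estimate, assuming the two trajectories are given as arrays of \((x, y)\) positions sampled at the same time steps and that the joint distribution is approximated by a fixed-bin histogram. The function and argument names (mutual_information, traj_a, traj_b, bins) are hypothetical, and the final line uses the standard identity \(I(A;B) = S(A) + S(B) - S(A,B)\); this is an illustrative sketch, not the exact estimator developed later.

import numpy as np

def mutual_information(traj_a, traj_b, bins=8):
    # traj_a, traj_b: arrays of shape (T, 2) giving the (x, y) positions
    # of two players sampled at the same T time steps (assumed here).
    # The feature space is the 4-D histogram of (x_a, y_a, x_b, y_b).
    samples = np.hstack([traj_a, traj_b])        # shape (T, 4)
    joint, _ = np.histogramdd(samples, bins=bins)
    p_joint = joint / joint.sum()
    # Marginal 2-D position distributions of each trajectory.
    p_a = p_joint.sum(axis=(2, 3))
    p_b = p_joint.sum(axis=(0, 1))
    def entropy(p):
        p = p[p > 0]                             # convention: 0 log 0 = 0
        return -np.sum(p * np.log(p))
    # I(A; B) = S(A) + S(B) - S(A, B)
    return entropy(p_a) + entropy(p_b) - entropy(p_joint)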

Entropy

Shannon introduced entropy as a measure of the quantity of information embedded in the distribution of a random variable.

Let \(X : P \rightarrow E\) be a random variable taking values in a discrete set \(E\). The entropy of the probability distribution of \(X\), noted \(S(X)\), is defined as: \[S(X)=-\sum_{x \in E}P_{X}(x)\log\big(P_{X}(x)\big)\]
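For example (a standard computation, given here only as a sanity check), a uniform distribution over \(n\) outcomes, \(P_X(x) = 1/n\), gives \[S(X) = -\sum_{x \in E}\frac{1}{n}\log\Big(\frac{1}{n}\Big) = \log(n),\] whereas a deterministic variable, with all the mass on a single outcome, gives \(S(X) = 0\) (with the usual convention \(0\log 0 = 0\)).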

In particular, the entropy of the joint probability distribution of two random variables \(X : P_{1} \rightarrow E_{1}\) and \(Y : P_{2} \rightarrow E_{2}\) is defined as: \[S(X,Y)=-\sum_{(x,y) \in E_1\times E_2}P_{(X,Y)}(x,y)\log\big(P_{(X,Y)}(x,y)\big)\]

Besides, the entropy of the probability distribution of \(X\) conditionally on that of \(Y\) is defined as: \[S(X|Y)=-\sum_{(x,y) \in E_1\times E_2}P_{(X,Y)}(x,y)\log\big(P_{X|Y}(x|y)\big)\]
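These three quantities are linked by the chain rule (a standard identity, stated here for completeness), which follows directly from \(P_{(X,Y)}(x,y) = P_{Y}(y)\,P_{X|Y}(x|y)\): \[S(X,Y) = S(Y) + S(X|Y).\] In particular, when \(X\) and \(Y\) are independent, \(S(X|Y) = S(X)\) and the joint entropy reduces to \(S(X,Y) = S(X) + S(Y)\).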

Somewhat imprecisely, we will refer to the entropy of the probability distribution of a random variable \(X\) simply as “the entropy of \(X\)”; hence the notation \(S(X)\).

The entropy of a probability distribution (or of a random variable) has three interpretations, illustrated by the example after this list:

  • the amount of information embedded in a probability distribution

  • the uncertainty about the outcome of a draw from a probability distribution

  • the dispersion of a probability distribution
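
As an illustration of the last two interpretations (a standard example, not taken from the sources cited here): for a Bernoulli variable with \(P_X(1) = p\) and \(P_X(0) = 1-p\), \[S(X) = -p\log(p) - (1-p)\log(1-p),\] which vanishes when \(p \in \{0, 1\}\) (no uncertainty, no dispersion) and is maximal at \(p = 1/2\), where the outcome is hardest to predict.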

For more information about entropy, the reader can refer to \cite{Pluim_2003} or to La théorie de l’information : l’origine de l’entropie.