Kim H. Parker edited Because_I_is_calculated_from__.tex  over 8 years ago

Commit id: a05d88115bae360d48384a4a8cdc069fe4e53b9a


Because $I$ is calculated from probability density functions that are estimated from histograms, the results are sensitive to the number and size of the bins used to calculate the histograms. For a small number of bins, there are a large number of samples in each bin and so the approximation to the underlying probability is better. At the same time, there are fewer distinct values of $X$ and $Y$ and so the mutual information decreases. This can be seen most easily in the values of $I(P,P)$, $I(U,U)$ and $I(P,U)$ indicated by the horizontal dashed lines. We also observe that the value of $c$ for which $I(dP_+,dP_-)$ is minimum, indicated by the red circle, is constant for a relatively small number of bins but begins to increase as the number of bins increases. The constancy of the value of $c$ for which $I$ is minimum for a small number of bins (fewer than 256) indicates that there may be an algorithm for determining the appropriate bin size for any particular measurement that will result in a robust estimate of $c$. In this particular example $c_{MI}$ is only slightly larger than $c_{SS}$.
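
The dependence of $I$ on the number of bins can be made explicit with a short sketch. It assumes the standard plug-in estimate of mutual information from a joint histogram; the symbols $M$, $N$, $n_{ij}$, $p_{ij}$, $p_i$ and $q_j$ are introduced here for illustration only, and the base of the logarithm is chosen as 2 for convenience. Let $n_{ij}$ be the number of samples falling in bin $i$ of $X$ and bin $j$ of $Y$, with $N$ bins for each variable and $M$ samples in total, so that $p_{ij} = n_{ij}/M$, $p_i = \sum_j p_{ij}$ and $q_j = \sum_i p_{ij}$. Then
\begin{equation}
I(X,Y) = \sum_{i=1}^{N} \sum_{j=1}^{N} p_{ij} \log_2 \frac{p_{ij}}{p_i \, q_j} ,
\end{equation}
where terms with $p_{ij} = 0$ are taken to be zero. For $Y = X$ the joint histogram is diagonal, $p_{ij} = p_i \delta_{ij}$, and the estimate reduces to the entropy of the binned variable,
\begin{equation}
I(X,X) = -\sum_{i=1}^{N} p_i \log_2 p_i \le \log_2 N ,
\end{equation}
with equality only when all bins are equally populated. Since $I(X,Y) \le \min\left[ I(X,X), I(Y,Y) \right]$, every estimate obtained with $N$ bins is bounded above by $\log_2 N$; under these assumptions, 256 bins would allow at most 8 bits. Reducing the number of bins therefore lowers the ceiling on $I$, which is consistent with the decrease in $I(P,P)$, $I(U,U)$ and $I(P,U)$ described above.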