Alexander Martin edited Entropy.tex  about 9 years ago

Commit id: 2a9e0e134a931c8dca438969c3f89b5ce1e12d97


\subsection{Information Entropy}

A formal definition of functional load was given by Charles Hockett \citeyearpar{Hockett1955}. His calculation was based on a measure of the entropy \citep{Shannon1948} of the language in question, given in Equation~\ref{eq:entropy}, where the sum ranges over each phoneme $\phi$ in the language's phoneme inventory $\Phi$ and the probability of a phoneme is calculated as its relative frequency.
\begin{equation}\label{eq:entropy}
H(L) = -\sum_{\phi\in\Phi} P(\phi)\log_2 P(\phi)
\end{equation}
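As a purely illustrative calculation (the inventory and frequencies here are hypothetical and are not drawn from any of the languages discussed), consider a toy language with three phonemes $\phi_1$, $\phi_2$, $\phi_3$ whose relative frequencies are $\tfrac{1}{2}$, $\tfrac{1}{4}$, and $\tfrac{1}{4}$. Applying Equation~\ref{eq:entropy} gives
\[
H(L) = -\left(\tfrac{1}{2}\log_2\tfrac{1}{2} + \tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{1}{4}\log_2\tfrac{1}{4}\right)
     = \tfrac{1}{2}\cdot 1 + \tfrac{1}{4}\cdot 2 + \tfrac{1}{4}\cdot 2
     = 1.5 \text{ bits}.
\]
For comparison, three equally frequent phonemes would give $H(L) = \log_2 3 \approx 1.585$ bits, so the skewed frequency distribution lowers the entropy of this toy inventory.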

This formalism was later extended to cover any type of contrast within the phonology of a language. Hockett's formula was designed specifically to calculate the functional load of a contrast between two phonemes, but its basic principles can be (and have been) extended to featural, syllabic, and even whole-word contrasts \citep{Surendran2003,Surendran2006}.

This measure has been used and compared with the simple minimal pair count method \citep[cf.][]{Wedel2013} as a predictor of phoneme loss, but it required further evaluation before we could use it ourselves. Recall that one of our requirements for measuring functional load is that we must be able to take specific phoneme frequency into account. An entropy measure cannot do this when we are considering features. If we return to Shannon's adapted formula (Equation~\ref{eq:entropy}), $\phi$ is no longer a phoneme but a feature. This means that we calculate the relative frequency of a featural contrast (the number of minimal pairs observed for a given feature). We have therefore not solved the problem raised earlier regarding phoneme frequency.

A further issue with Hockett's method is the form of the results that it gives. Implementing the method yields a series of values, each measuring the functional load of one feature: the \emph{voicing} feature will have one value, and the \emph{place} feature another. It is, however, very difficult to compare these values with one another. As can be seen in Table~\ref{tab:surendran}\footnote{\citeauthor*{Surendran2003} refer to the \emph{voicing} feature as \emph{aspiration} in Mandarin, although both are modulations of \abv{VOT}, and it is appropriate for our analyses to collapse these phonetic differences. We will consider all laryngeal features together for the purposes of this study.}, there are patterns in the differences between the various features (\emph{place} has a higher value than \emph{manner}, which has a higher value than \emph{voicing}, in all four languages). We cannot, however, say how meaningful these differences are: we cannot easily determine to what extent \emph{place} is more important than \emph{manner} in a given language, nor whether these differences are important cross-linguistically.