Dylan Freedman edited Fourier.tex about 9 years ago

Commit id: 9f80d8dcf6b6ce4be13f154a6d86b373e2c62f41

The \textit{Fourier transform} is a mathematical operation that decomposes a signal, such as a series of audio amplitudes, into its constituent frequencies. It can be applied to an audio file using a \textit{sliding window}, in which the data points are analyzed in chunks. The window has a fixed size, containing a set number of data points, and traverses the data linearly in equal, potentially overlapping steps. The choice of window size is a trade-off in precision: the larger the window, the finer the frequency resolution; the smaller the window, the more precisely note onsets and offsets can be located in time. Most modern audio files are sampled at 44,100~Hz, which means that there are 44,100 data points for every second of audio. Let $sr$ denote the sampling rate of an audio file in Hz. For a given segment of audio consisting of $n$ data points, the Fourier transform returns $n$ values, where the magnitude of the $i$th value (for $i = 0, 1, \ldots, n-1$) corresponds to the strength of the frequency $\frac{sr \cdot i}{n}$~Hz. For real-valued audio, only the values up to $i = n/2$, that is, up to the frequency $sr/2$, carry distinct information; the remaining values mirror them. A graph of these values with time along the x-axis, frequency along the y-axis, and intensity represented by color is called a \textit{spectrogram}. An example spectrogram of The Beatles song \textit{Eleanor Rigby} is given in Figure~\ref{fig:eleanor_rigby}.
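
The sliding-window procedure and the bin-to-frequency mapping $\frac{sr \cdot i}{n}$ described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function names (\texttt{dft}, \texttt{bin\_frequency}, \texttt{sliding\_windows}, \texttt{spectrogram}) are hypothetical, the transform is a naive $O(n^2)$ discrete Fourier transform written for readability, and the sampling rate of 8000~Hz, the window size of 64, and the 1000~Hz test tone are illustrative assumptions chosen so that the tone falls exactly on one frequency bin.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform of a list of real samples."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * i * t / n) for t in range(n))
        for i in range(n)
    ]


def bin_frequency(i, n, sr):
    """Frequency in Hz associated with the i-th of n DFT values: sr * i / n."""
    return sr * i / n


def sliding_windows(samples, window_size, hop_size):
    """Yield (start_index, window) pairs; windows overlap when hop_size < window_size."""
    for start in range(0, len(samples) - window_size + 1, hop_size):
        yield start, samples[start:start + window_size]


def spectrogram(samples, window_size, hop_size):
    """Return one list of magnitudes per window, keeping bins 0..window_size//2.

    For real-valued input the remaining bins mirror these, so they are dropped.
    """
    half = window_size // 2 + 1
    return [
        [abs(value) for value in dft(window)[:half]]
        for _, window in sliding_windows(samples, window_size, hop_size)
    ]


# Illustrative values (assumptions, not taken from the text).
sr = 8000          # sampling rate in Hz
n = 64             # window size in samples
tone_hz = 1000.0   # pure test tone

# Four windows' worth of a 1000 Hz sine wave.
signal = [math.sin(2 * math.pi * tone_hz * t / sr) for t in range(4 * n)]

# Half-overlapping windows: each step advances by n // 2 samples.
frames = spectrogram(signal, window_size=n, hop_size=n // 2)

# Strongest bin in the first window, converted back to a frequency in Hz.
peak_bin = max(range(len(frames[0])), key=lambda i: frames[0][i])
peak_hz = bin_frequency(peak_bin, n, sr)
print(peak_bin, peak_hz)
```

Each row of \texttt{frames} corresponds to one column of a spectrogram. Under these assumptions the frequency resolution is $sr / n = 125$~Hz per bin, so doubling \texttt{n} would halve the bin spacing while making each window cover twice as much time. In practice a fast Fourier transform routine from a numerical library would replace the naive \texttt{dft}.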