Comparing Harmonic Similarity Measures Using Extracted Chord Features and Ground-Truth Data


Overview of Music and Computation


A Primer on Western Music Theory

Notes: The Basic Building Block

In music, the note is the most basic element. A note is defined by its pitch, a subjective, perceptual property. Although the pitch of a note is closely related to its objective physical frequency (measured in Hertz, or cycles per second, of a waveform) and usually matches it, pitch differs in that its meaning is derived by the listener. This distinction can be demonstrated with a visual analogy used by Terhardt (Terhardt 1974), shown in figure \ref{fig:virtualpitch2}, in which a word is apparent even though the visual information consists only of shadows. In the same way, a pitch can be heard even if the corresponding frequency is not physically present. A note also has a duration.

Terhardt’s visual pitch analogy. In this illusion, the eye perceives contours that are not present. Likewise, pitch describes the information received by a listener even when the corresponding physical frequencies are not present.

Western music divides each octave into 12 distinct pitches. An octave is an interval, or distance between two frequencies, that corresponds to multiplication by 2. Musical pitch is perceived on a logarithmic scale: one octave above a given frequency is double that frequency; one octave below is half that frequency. The progression of notes containing all 12 pitches of an octave in succession is called a chromatic scale. A semitone, or half-step, is the smallest interval, equal to \(1/12\) of an octave. The frequency \(n\) semitones above a given frequency \(f_0\) (or below it, for negative \(n\)) can be calculated as \(f_0 \cdot 2^{n/12}\).
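The semitone formula can be checked numerically with a minimal Python sketch. The function name `semitone_shift` and the 440 Hz reference frequency are assumptions chosen for illustration; neither comes from the text above.

```python
def semitone_shift(f0: float, n: int) -> float:
    """Return the frequency n semitones away from f0.

    Positive n moves up in pitch, negative n moves down,
    following f0 * 2^(n/12).
    """
    return f0 * 2 ** (n / 12)


# Illustrative reference frequency (an assumption, not taken from the text).
reference = 440.0

print(semitone_shift(reference, 12))   # 12 semitones = one octave up (doubling)
print(semitone_shift(reference, -12))  # one octave down (halving)
print(semitone_shift(reference, 1))    # a single semitone up
```

Because 12 semitones make up one octave, a shift of 12 doubles the frequency and a shift of -12 halves it, which matches the definition of the octave given above.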

Note names are used to classify the pitches in the chromatic scale. A note name consists of a base name and 0 or more accidentals. The base names correspond to the white keys on a piano; within any given octave these are \(C\), \(D\), \(E\), \(F\), \(G\), \(A\), and \(B\). A base name can optionally be decorated with any number of sharps (\(\sharp\)) or any number of flats (\(\flat\)), but not a mixture of both. This can be illustrated with the following context-free grammar (figure \ref{fig:cfgnote}):

\[\begin{aligned} NoteName &\to BaseNote \mid BaseNote\ SharpAccidentals \mid BaseNote\ FlatAccidentals \\ BaseNote &\to \mathbf{C} \mid \mathbf{D} \mid \mathbf{E} \mid \mathbf{F} \mid \mathbf{G} \mid \mathbf{A} \mid \mathbf{B} \\ SharpAccidentals &\to \bm{\textit{\#}}\ SharpAccidentals \mid \bm{\textit{\#}} \\ FlatAccidentals &\to \bm{b}\ FlatAccidentals \mid \bm{b}\end{aligned}\]
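Since the grammar above describes a regular language, a regular expression is enough to recognize it. The following Python sketch is one possible recognizer, not a definitive implementation; the names `NOTE_NAME` and `parse_note_name` are hypothetical, and the ASCII characters `#` and `b` are assumed to stand in for the sharp and flat symbols, as in the grammar.

```python
import re

# NoteName -> BaseNote | BaseNote SharpAccidentals | BaseNote FlatAccidentals
# BaseNote is one of C..B; accidentals are one or more '#' or one or more 'b',
# never a mixture of the two.
NOTE_NAME = re.compile(r"([A-G])(#+|b+)?")


def parse_note_name(name: str):
    """Return (base, accidentals) if name matches the grammar, else None."""
    match = NOTE_NAME.fullmatch(name)
    if match is None:
        return None
    # group(2) is None when the note name has no accidentals.
    return match.group(1), match.group(2) or ""


for candidate in ["C", "F##", "Bbb", "C#b", "H"]:
    print(candidate, parse_note_name(candidate))
```

Names such as `C#b` are rejected because the grammar allows sharps or flats after the base note, but not both.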


Sharps and flats are referred to as accidentals. Each additional sharp (\(\#\)) raises the pitch to which the note name refers by 1 semitone; likewise, each flat (\(b\)) lowers it by 1 semitone. The black keys on the piano represent the pitches that lie 1 semitone between the surrounding white keys. Adjacent white keys are either 1 or 2 semitones apart, depending on whether a black key lies between them. For instance, \(C\) and \(D\) are 2 semitones apart since there is a black key in between them, whereas