Cadence Detection in Western Traditional Stanzaic Songs using Melodic and Textual Features



Many Western songs are hierarchically structured in stanzas and phrases. The melody of the song is repeated for each stanza, while the lyrics vary. Each stanza is subdivided into phrases. It is to be expected that melodic and textual formulae at the end of the phrases offer intrinsic clues of closure to a listener or singer. In this paper we present a method to detect such cadences in symbolically encoded folk songs. We take a trigram approach in which we classify trigrams of notes and of pitches as cadential or non-cadential. We use pitch, contour, rhythmic, textual, and contextual features, and a group of features based on the conditions of closure as stated by Narmour (1990). We employ a random forest classification algorithm. The precision of the classifier is considerably improved by taking the class labels of adjacent trigrams into account. An ablation study shows that no single kind of feature is sufficient for good classification, although some of the groups perform moderately well on their own.


This paper presents both a method to detect cadences in Western folk songs, particularly in folk songs from Dutch oral tradition, and a study of the importance of various musical parameters for cadence detection.

There are various reasons to focus specifically on cadence patterns. The concept of cadence has played a major role in the study of Western folk songs. In several of the most important folk song classification systems, cadence tones are among the primary features that are used to put the melodies into a linear ordering. In one of the earliest classification systems, devised by Ilmari Krohn (REF 1903), melodies are ordered first according to the number of phrases, and second according to the sequence of cadence tones. This method was adapted for Hungarian melodies by Bartók and Kodály (REF Suchoff, 1981), and later on for German folk songs by Suppan and Stief (REF 1976) in their monumental Melodietypen des Deutschen Volksgesanges. Bronson (REF 1950) introduced a number of features for the study of Anglo-American folk song melodies, of which final cadence and mid-cadence are the most prominent ones. One of the underlying assumptions of this choice of features is that the sequence of cadence tones is relatively stable in the process of oral transmission. Thus, variants of the same melody are expected to end up near each other in the resulting ordering.

There is also a more phenomenological reason to study cadences. In Western oral tradition the ‘song’ can be considered an identifiable piece of music since it gets a name and the singing of a certain song is a relatively independent act of music making. A cadence pattern functions as a demarcation of the end of the song as a sounding object in time, since it frames the song from sound not belonging to it. This raises the question whether the patterns that accommodate this demarcation show specific characteristics compared to patterns that are in the midst of the song.

From a music cognition point of view, closely related questions can be asked. The perception of closure is of fundamental importance for the understanding of a melody. In terms of expectation (Narmour, 1990; Huron, 2006), a final cadence implies no continuation at all. It is to be expected that specific features of the songs that are related to closure show different values for cadential patterns as compared to non-cadential patterns. We explicitly define a subset of features that is related to conditions of closure as stated by Narmour (1990, p.11).

Cadence detection is related to the problem of segmentation, which is relevant for Music Information Retrieval (REF). Most segmentation methods for symbolic representations of melodies are either based on pre-defined rules (LBDM REF, Grouper REF) or on statistical learning (Juhasz, IDyOM, DOP). In the current paper, we focus on the musical properties of cadence formulae rather than on the task of segmentation as such.

Taking Dutch folk songs as a case study, we investigate whether it is possible to derive a general model of the melodic patterns or formulae that specifically indicate melodic cadences, using both melodic and textual features. To address this question, we take a computational approach by employing a random forest classifier (Section \ref{sec:REF}). Applying this algorithm results in a generalized model that separates cadence patterns from non-cadence patterns.

To investigate which musical parameters are of importance for cadence detection, we perform an ablation study in which we successively remove certain types of features and evaluate how the classification is affected (Section \ref{REF}).
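The bookkeeping behind such an ablation study can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the experimental code. The group names follow the kinds of features listed in the abstract. The individual feature names, the function names, and the `evaluate` callback (which would train and score a classifier on the given features) are hypothetical placeholders.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# Group names follow the abstract. The feature names inside each group are
# hypothetical placeholders, not the feature set used in the experiments.
FEATURE_GROUPS: Dict[str, List[str]] = {
    "pitch": ["pitch_a", "pitch_b", "pitch_c"],
    "contour": ["contour_a", "contour_b"],
    "rhythmic": ["rhythm_a", "rhythm_b"],
    "textual": ["text_a", "text_b"],
    "contextual": ["context_a"],
    "narmour": ["closure_a"],
}


def ablation_settings(
    groups: Dict[str, List[str]],
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (label, feature names) for every ablation setting.

    Settings: all features, each leave-one-group-out subset,
    and each group on its own.
    """
    yield "all", [f for names in groups.values() for f in names]
    for removed in groups:
        kept = [f for g, names in groups.items() if g != removed for f in names]
        yield f"without {removed}", kept
    for only in groups:
        yield f"only {only}", list(groups[only])


def run_ablation(
    groups: Dict[str, List[str]],
    evaluate: Callable[[List[str]], float],
) -> Dict[str, float]:
    """Score every setting with a caller-supplied evaluation function."""
    return {label: evaluate(feats) for label, feats in ablation_settings(groups)}


if __name__ == "__main__":
    # Stand-in scorer: counts features instead of training a classifier.
    scores = run_ablation(FEATURE_GROUPS, evaluate=lambda feats: float(len(feats)))
    for label, score in scores.items():
        print(f"{label}: {score}")
```

In practice, `evaluate` would wrap the training and cross-validated scoring of the classifier restricted to the given feature columns. Comparing the "without" and "only" settings against the "all" setting indicates how much each group contributes.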


We perform all our experiments on a set of 4,120 symbolically encoded Dutch folk songs.1 Roughly half of the set consists of transcriptions from field recordings that were made in the Netherlands during the 20th century. The other half is taken from song books that contain repertoire that is directly related to the recordings. Thus, we have a coherent collection of songs that reflects Dutch everyday song culture in the early 20th century. Virtually all of these songs have a stanzaic structure. The melody repeats for each stanza, and each stanza consists of a number of phrases. Both in the transcriptions and in the song books, phrase endings are indicated. Figure \ref{fig:pitchtrigrams} shows a typical song from the collection. The language of the songs is standard Dutch, occasionally with some dialect words or nonsense syllables. All songs were digitally encoded by hand at the Meertens Institute (Amsterdam) and are available in Humdrum **kern format. The phrase endings were encoded as well, and are thus available for computational analysis and modelling.

  1. MTC-FS 1.0; Data set to be released in 2014.

Our Approach

Our general approach is to isolate trigrams from the melodies and to label them as either cadential or non-cadential. A cadential trigram is the last trigram in a phrase. We compare two kinds of trigrams: trigrams of successive notes (note-trigrams), and trigrams of successive pitches