MELODIA - Melody Extraction Vamp plug-in


The MELODIA plug-in automatically estimates the pitch of a song's main melody. More specifically, it implements an algorithm that automatically estimates the fundamental frequency corresponding to the pitch of the predominant melodic line of a piece of polyphonic (or homophonic or monophonic) music.

Given a song, the algorithm estimates:

  1. When the melody is present and when it is not (a.k.a. voicing detection)
  2. The pitch of the melody when it is present

A non-scientist friendly introduction to Melody Extraction as well as the algorithm, including graphs and sound examples, can be found on:

For computational reasons, MELODIA is composed of two Vamp plug-ins: "MELODIA - Melody Extraction" and "MELODIA - Melody Extraction (intermediate steps)". The former provides the main output of MELODIA (the pitch of the predominant melody), whilst the latter provides visualisations of the intermediate steps calculated by the algorithm (see Input/Output below for further details). Both plug-ins are included in a single MELODIA library file, available for Windows, OSX and Linux.

Full details of the algorithm can be found in the following paper:

J. Salamon and E. Gómez, "Melody Extraction from Polyphonic Music Signals using Pitch Contour Characteristics", IEEE Transactions on Audio, Speech and Language Processing, 20(6):1759-1770, Aug. 2012.

We would greatly appreciate it if scientific publications of work based partly on the MELODIA plug-in cited the above publication.

The MELODIA Vamp plug-in has been made possible by the kind support of the following entities


Graphical User Interface

The MELODIA - Melody Extraction Vamp plug-in used in Sonic Visualiser. Top pane: waveform. Second pane: salience function. Third pane: pitch contours (all). Fourth pane: extracted melody (in red) and spectrogram.




Input

An audio file in a format supported by your Vamp host (e.g. wav, mp3, ogg).


Output

MELODIA offers four types of output. The first (Melody) is computed by the "MELODIA - Melody Extraction" plug-in and the other three by the "MELODIA - Melody Extraction (intermediate steps)" plug-in:

  • Melody

The pitch of the main melody. Each row of the output contains a timestamp and the corresponding frequency of the melody in Hertz. Non-voiced segments are indicated by zero or negative frequency values. Negative values represent the algorithm's pitch estimate for segments estimated as non-voiced, in case the melody is in fact present there.
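The sign convention above can be unpacked in a few lines of code. The following Python sketch is only an illustration: it assumes the Melody output has already been read into (timestamp, frequency) pairs, and the function name split_melody and the sample values are hypothetical, not part of MELODIA.

```python
def split_melody(rows):
    """Split (timestamp, frequency_hz) rows into three parallel lists.

    Convention described above:
      frequency > 0  -> voiced frame, value is the melody pitch in Hz
      frequency == 0 -> non-voiced frame, no pitch estimate
      frequency < 0  -> frame estimated as non-voiced; the absolute value
                        is the pitch estimate in case the melody is present
    """
    times, voiced, pitch_hz = [], [], []
    for timestamp, freq in rows:
        times.append(timestamp)
        voiced.append(freq > 0)
        pitch_hz.append(abs(freq))
    return times, voiced, pitch_hz


# Hypothetical sample rows, invented for illustration only.
rows = [(0.00, 0.0), (0.01, 220.0), (0.02, -221.5)]
times, voiced, pitch_hz = split_melody(rows)
```

Keeping the voicing flags separate from the pitch values makes it possible to evaluate voicing detection and pitch estimation independently, or to ignore the voicing decision and use the pitch estimate everywhere.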

  • Salience Function

A 2D time-frequency representation of pitch salience over time on a cent scale. The salience function covers five octaves, from 55 Hz to 1760 Hz, divided into 600 bins with a resolution of 10 cents per bin (= 120 bins per octave).
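The bin layout follows from the figures above: 5 octaves x 1200 cents per octave / 10 cents per bin = 600 bins. The Python sketch below converts between frequency and bin index under those figures. It assumes that bin 0 starts at 55 Hz and that a frequency is assigned to the bin whose lower edge lies at or below it; the plug-in's exact rounding and bin alignment may differ, and the names hz_to_bin and bin_to_hz are hypothetical.

```python
import math

F_MIN_HZ = 55.0      # lower edge of the salience function (from the text)
CENTS_PER_BIN = 10   # bin resolution (from the text)
N_BINS = 600         # 5 octaves * 120 bins per octave (from the text)


def hz_to_bin(freq_hz):
    """Return the bin index for freq_hz, or None if outside the range.

    Assumption: bin 0 starts at 55 Hz and bins are indexed by flooring.
    """
    if freq_hz <= 0:
        return None
    cents = 1200.0 * math.log2(freq_hz / F_MIN_HZ)
    bin_index = math.floor(cents / CENTS_PER_BIN)
    if 0 <= bin_index < N_BINS:
        return bin_index
    return None


def bin_to_hz(bin_index):
    """Return the frequency in Hz at the lower edge of a bin."""
    return F_MIN_HZ * 2.0 ** (bin_index * CENTS_PER_BIN / 1200.0)
```

Under these assumptions, 110 Hz (one octave above 55 Hz) falls in bin 120 and 440 Hz in bin 360, while 1760 Hz sits exactly on the upper edge and is treated here as out of range.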

  • Pitch Contours: All

In order to estimate the melody, the algorithm first tracks all salient pitch contours present in the signal. This output is a 2D representation of pitch contours vs. time, using the same scale as the salience function.

  • Pitch Contours: Melody

This output is the same as output "Pitch Contours: All", except that only contours which were identified by the algorithm as part of the melody are displayed. By comparing "Pitch Contours: All" and "Pitch Contours: Melody" you can observe how the algorithm filters out non-melody pitch contours.



Please go to the Download page