Generating Singing Voice Expression Contours Based on Unit Selection

Title: Generating Singing Voice Expression Contours Based on Unit Selection
Publication Type: Conference Paper
Year of Publication: 2013
Conference Name: Stockholm Music Acoustics Conference
Authors: Umbert, M., Bonada, J., & Blaauw, M.
Conference Start Date: 30/07/2013
Conference Location: Stockholm, Sweden
Abstract: A common problem of many current singing voice synthesizers is that obtaining a natural-sounding and expressive performance requires extensive manual user input, making this a time-consuming and difficult task. In this paper we introduce a unit selection-based approach for the generation of expression parameters that control the synthesizer. Given the notes of a target score, the system is able to automatically generate pitch and dynamics contours. These are derived from a database of singer recordings containing expressive excerpts. In our experiments the database contained a small set of songs belonging to a single singer and style. The basic length of units is set to three consecutive notes or silences, representing a local expression context. To generate the contours, an optimal sequence of overlapping units is first selected according to a minimum-cost criterion. These units are then time-scaled and pitch-shifted to match the target score. Finally, the overlapping, transformed units are crossfaded to produce the output contours. In the transformation process, special care is taken with respect to the attacks and releases of notes. A parametric model of vibratos is used to allow transformation without affecting vibrato properties such as rate, depth, or underlying baseline pitch. The results of a perceptual evaluation show that the proposed approach is comparable to parameters that are manually tuned by expert users and outperforms a baseline system based on heuristic rules.
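The pipeline the abstract describes (selecting an optimal sequence of units under a minimum-cost criterion, then crossfading the overlapping, transformed contours) can be illustrated with a minimal sketch. This is an illustrative assumption, not the authors' implementation: the cost functions, the Viterbi-style search, and the linear crossfade below are generic stand-ins for the paper's actual target/concatenation costs and transformation steps.

```python
# Hypothetical sketch of unit selection + crossfading for expression contours.
# Cost definitions and function names are illustrative, not from the paper.

def select_units(target, candidates, target_cost, concat_cost):
    """Viterbi-style search over candidates[i] (units for target position i),
    minimizing the sum of target costs and pairwise concatenation costs."""
    n = len(target)
    # best[i][j] = (accumulated cost, backpointer) for picking candidates[i][j]
    best = [[(target_cost(target[0], u), -1) for u in candidates[0]]]
    for i in range(1, n):
        row = []
        for u in candidates[i]:
            tc = target_cost(target[i], u)
            cost, back = min(
                (best[i - 1][k][0] + concat_cost(prev, u) + tc, k)
                for k, prev in enumerate(candidates[i - 1])
            )
            row.append((cost, back))
        best.append(row)
    # Backtrack from the cheapest final unit.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = [j]
    for i in range(n - 1, 0, -1):
        j = best[i][j][1]
        path.append(j)
    path.reverse()
    return [candidates[i][path[i]] for i in range(n)]

def crossfade(a, b, overlap):
    """Linearly crossfade the last `overlap` samples of contour a
    into the first `overlap` samples of contour b."""
    out = list(a[:len(a) - overlap])
    for k in range(overlap):
        w = k / (overlap - 1) if overlap > 1 else 1.0
        out.append((1 - w) * a[len(a) - overlap + k] + w * b[k])
    out.extend(b[overlap:])
    return out
```

As a usage example, with absolute pitch distance as the target cost and a zero concatenation cost, `select_units([60, 62, 64], [[59, 60], [61, 62], [64, 66]], lambda t, u: abs(t - u), lambda a, b: 0)` picks the candidate closest to each target note, and `crossfade` merges adjacent unit contours where they overlap.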