Expressive Singing Synthesis Based on Unit Selection for the Singing Synthesis Challenge 2016

Title: Expressive Singing Synthesis Based on Unit Selection for the Singing Synthesis Challenge 2016
Publication Type: Conference Paper
Year of Publication: 2016
Conference Name: Interspeech
Authors: Bonada, J., Umbert, M., & Blaauw, M.
Conference Start Date: 13/09/2016
Conference Location: San Francisco, USA
Keywords: expressive synthesis, singing voice synthesis
Abstract: Sample-based and statistically based singing synthesizers typically require a large amount of data to automatically generate expressive synthetic performances. In this paper we present a singing synthesizer that, using two rather small databases, generates expressive synthesis from an input consisting of notes and lyrics. The system is based on unit selection and uses the Wide-Band Harmonic Sinusoidal Model to transform samples. The first database focuses on expression and consists of less than 2 minutes of free expressive singing using solely vowels. The second is the timbre database, which for the English case consists of roughly 35 minutes of monotonic singing of a set of sentences, one syllable per beat. The synthesis is divided into two steps. First, an expressive vowel singing performance of the target song is generated using the expression database. Next, this performance is used as the input control for synthesis with the timbre database and the target lyrics. A selection of synthetic performances has been submitted to the Interspeech Singing Synthesis Challenge 2016, in which they are compared to other competing systems.
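The two-step pipeline described in the abstract can be sketched in code. This is an illustrative toy model only, not the authors' implementation: the database contents, the nearest-pitch selection cost, and all names (`Note`, `select_expression_units`, `synthesize_with_timbre`) are assumptions introduced here to show the flow of step 1 (expression unit selection) feeding step 2 (timbre synthesis controlled by the expressive performance).

```python
# Toy sketch of the paper's two-step synthesis (assumed structure, not the
# actual system): step 1 selects expression units, step 2 pairs the resulting
# expressive performance with timbre units for the target lyrics.
from dataclasses import dataclass


@dataclass
class Note:
    pitch: float     # target pitch as a MIDI note number
    duration: float  # seconds


def select_expression_units(notes, expression_db):
    """Step 1: for each target note, pick the expression unit (a vowel pitch
    contour) with the closest base pitch, then transpose it to the target."""
    performance = []
    for note in notes:
        unit = min(expression_db, key=lambda u: abs(u["pitch"] - note.pitch))
        shift = note.pitch - unit["pitch"]
        performance.append([p + shift for p in unit["contour"]])
    return performance


def synthesize_with_timbre(performance, lyrics, timbre_db):
    """Step 2: use the expressive contours as the control input and attach
    the timbre unit for each syllable; 'synthesis' here is just a pairing."""
    return [
        {"syllable": syl, "contour": contour, "timbre": timbre_db.get(syl)}
        for syl, contour in zip(lyrics, performance)
    ]


# Tiny stand-ins for the expression and timbre databases.
expression_db = [
    {"pitch": 60.0, "contour": [60.0, 60.3, 60.1]},
    {"pitch": 67.0, "contour": [67.0, 66.8, 67.2]},
]
timbre_db = {"la": "unit_la", "da": "unit_da"}

notes = [Note(62.0, 0.5), Note(66.0, 0.5)]
perf = select_expression_units(notes, expression_db)
out = synthesize_with_timbre(perf, ["la", "da"], timbre_db)
```

In this sketch, the note at pitch 62 is rendered with the transposed contour of the nearest expression unit (base pitch 60), so its contour starts exactly at the target pitch; the real system instead transforms recorded samples with the Wide-Band Harmonic Sinusoidal Model.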