Xavier Serra and Martí Umbert participate in the 8th ISCA Speech Synthesis Workshop (SSW8), which takes place in Barcelona from August 31st to September 2nd, 2013. Xavier gives a keynote on "Singing voice synthesis in the context of music technology research" and Martí presents a paper on "Systematic database creation for expressive singing voice synthesis control".
Martí Umbert, Jordi Bonada, Merlijn Blaauw: "Systematic database creation for expressive singing voice synthesis control"
Abstract: In the context of singing voice synthesis, the generation of the synthesizer controls is a key aspect of obtaining expressive performances. In our case, we use a system that selects, transforms and concatenates units of short melodic contours from a recorded database. This paper proposes a systematic procedure for the creation of such a database. The aim is to cover relevant style-dependent combinations of features such as note duration, pitch interval and note strength. The higher the percentage of covered combinations, the less the units will need to be transformed in order to match a target score. At the same time, it is also important that units are musically meaningful according to the target style. In order to create a style-dependent database, the melodic combinations of features to cover are identified, statistically modeled and grouped by similarity. Then, short melodic exercises of four measures are created following a dynamic programming algorithm. The Viterbi cost functions deal with the statistically observed context transitions, harmony, position within the measure and readability. The final systematic score database is formed by the sequence of the obtained melodic exercises.
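The abstract describes sequencing melodic contexts with dynamic programming, where Viterbi cost functions score context transitions, harmony, measure position and readability. A minimal sketch of that dynamic-programming skeleton is shown below; the cost functions and context representation here are hypothetical placeholders, not the paper's actual ones.

```python
# Sketch of a Viterbi-style dynamic programming pass that orders melodic
# contexts into an exercise. The contexts and both cost functions are
# illustrative assumptions; the paper combines several cost terms
# (context transitions, harmony, measure position, readability).

def viterbi_sequence(contexts, length, transition_cost, unary_cost):
    """Pick a sequence of `length` contexts minimizing total cost."""
    n = len(contexts)
    # best[t][j] = (cumulative cost, index of previous context) for
    # choosing contexts[j] at step t.
    best = [[(unary_cost(c), None) for c in contexts]]
    for t in range(1, length):
        row = []
        for j, c in enumerate(contexts):
            # Cheapest way to arrive at context c from any predecessor.
            candidates = [
                (best[t - 1][i][0] + transition_cost(p, c) + unary_cost(c), i)
                for i, p in enumerate(contexts)
            ]
            row.append(min(candidates))
        best.append(row)
    # Backtrack from the cheapest final state.
    cost, j = min((best[-1][j][0], j) for j in range(n))
    path = []
    for t in range(length - 1, -1, -1):
        path.append(contexts[j])
        j = best[t][j][1]
    return list(reversed(path)), cost


# Toy usage: contexts as pitch-interval classes, transition cost
# penalizing large jumps, no unary preference.
path, cost = viterbi_sequence(
    contexts=[0, 1, 2],
    length=3,
    transition_cost=lambda a, b: abs(a - b),
    unary_cost=lambda c: 0,
)
```

With these toy costs the cheapest sequence simply repeats one context; in the paper's setting, richer cost terms would instead favor sequences that cover the targeted feature combinations while remaining musically readable.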
Xavier Serra: "Singing voice synthesis in the context of music technology research"
Abstract: The synthesis of the singing voice has always been very much tied to speech synthesis. Since the initial work of Max Mathews with Kelly and Lochbaum at Bell Labs in the 1950s, many engineers and musicians have explored the potential of speech processing techniques in music applications. After reviewing some of this history, I will present the work done in my research group to develop synthesis engines that could sound as natural and expressive as a real singer, or choir, and whose inputs could be just the score and the lyrics of the song. Some of this research is being done in collaboration with Yamaha and has resulted in the Vocaloid software synthesizer. In the talk I want to place special emphasis on the specificities of the music context and thus on the technical requirements for the use of a synthesis technology in music applications.