Note: This bibliographic page is archived and will no longer be updated. For an up-to-date list of publications from the Music Technology Group, see the Publications list.

Properly Using Speech Synthesis and Voice Transformation for Audiovisual Content Generation

Title Properly Using Speech Synthesis and Voice Transformation for Audiovisual Content Generation
Publication Type Conference Paper
Year of Publication 2009
Conference Name International Broadcasting Conference (IBC2009)
Authors Monzo, C., Formiga, L., Adell, J., Mayor, O., Bonada, J., Janer, J., & Iriondo, I.
Conference Start Date 10/09/2009
Publisher IBC
Conference Location Amsterdam, The Netherlands
Abstract During the creation process, scriptwriters might want to quickly check the result of what they are creating. Text-to-Speech (TTS) systems can deliver speech in a short time. In addition, information might be dynamically generated by intelligent systems, in which case TTS is crucial for delivering it as speech. The main drawback of using TTS in audiovisual productions is that commercial systems offer few different voices, whereas productions need a different voice for each character involved. Voice Transformation (VT) techniques can be used to overcome this limitation, allowing the user to personalize the voice for each character. In this paper, we briefly explain the technologies involved in TTS and VT systems and how they can be combined. Finally, we present a study of the most efficient way to combine them: either convert the synthesized speech, or generate a new synthetic voice by converting the original speech database used in the TTS system.