Descriptor Control of Sound Transformations and Mosaicing Synthesis

Title: Descriptor Control of Sound Transformations and Mosaicing Synthesis
Publication Type: PhD Thesis
Year of Publication: 2016
University: Universitat Pompeu Fabra
Authors: Coleman, G.
Advisor: Bonada, J., & Serra, X.
Academic Department: Department of Information and Communication Technologies
Number of Pages: 193
Date Published: 02/2016
Keywords: adaptive effects, concatenative sound synthesis, curse of dimensionality, listening tests, machine learning, mosaicing, nonlinear regression, sampling synthesis, sound effects, sound texture transfer, sound transformations, sparse approximation, structured sparsity, subjective evaluation of audio

Sampling, as a musical or synthesis technique, is a way to reuse recorded musical expressions. In this dissertation, several ways to expand sampling synthesis are explored, especially mosaicing synthesis, which imitates target signals by transforming and compositing source sounds, in the manner of a mosaic made of broken tile.

One branch of extension consists of the automatic control of sound transformations towards targets defined in a perceptual space. The approach chosen uses models that predict how the input sound will be transformed as a function of the selected parameters. In one setting, the models are known, and numerical search can be used to find sufficient parameters; in the other, they are unknown and must be learned from data.
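The known-model setting can be illustrated with a minimal sketch. It is not taken from the thesis: the model (`predict_centroid`, in which transposition by a number of semitones scales the spectral centroid), the function names, and the search bounds are all assumptions chosen to keep the example small. A golden-section search then finds the transposition whose predicted descriptor best matches the target.

```python
import math


def predict_centroid(input_centroid_hz, semitones):
    """Hypothetical known model: transposing by `semitones` scales the centroid."""
    return input_centroid_hz * 2.0 ** (semitones / 12.0)


def search_parameter(input_centroid_hz, target_centroid_hz,
                     lo=-24.0, hi=24.0, tol=1e-6):
    """Golden-section search for the transposition (in semitones) that
    minimizes the squared log-frequency error between the predicted and
    target centroid. Assumes the cost is unimodal on [lo, hi]."""
    def cost(semitones):
        predicted = predict_centroid(input_centroid_hz, semitones)
        return math.log2(predicted / target_centroid_hz) ** 2

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if cost(c) < cost(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2.0


# Illustrative values only: move a 440 Hz centroid toward an 880 Hz target.
shift = search_parameter(440.0, 880.0)
```

In the unknown-model setting described above, `predict_centroid` would be replaced by a regressor learned from examples of transformed sounds, while the search step could stay the same.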

Another branch focuses on the sampling itself. By mixing multiple sounds at once, it may be possible to produce better imitations, e.g., in terms of the harmony of the target. However, using mixtures leads to new computational problems, especially if properties such as continuity, which are important to high-quality sampling synthesis, are to be preserved.
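As a rough illustration of why mixtures can help, the sketch below fits nonnegative mixing weights so that a weighted sum of source descriptor vectors approximates a target descriptor. It is a stand-in, not the thesis's algorithm: it uses plain projected gradient descent for nonnegative least squares, ignores continuity entirely, and the descriptor vectors, the function name `mix_weights`, and the step size are assumptions made for the example.

```python
def mix_weights(sources, target, step=0.25, iters=500):
    """Projected gradient descent for
        minimize 0.5 * || sum_k w[k] * sources[k] - target ||^2,  w >= 0.
    The step size is tuned to the toy data below, not chosen in general."""
    w = [0.0] * len(sources)
    for _ in range(iters):
        # Residual of the current mixture against the target descriptor.
        mix = [sum(w[k] * sources[k][i] for k in range(len(sources)))
               for i in range(len(target))]
        resid = [m - t for m, t in zip(mix, target)]
        # Gradient step per source, then clip to keep weights nonnegative.
        for k, src in enumerate(sources):
            grad = sum(s * r for s, r in zip(src, resid))
            w[k] = max(0.0, w[k] - step * grad)
    return w


# Hypothetical 4-bin harmony descriptors for three source sounds.
sources = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
# A target that no single source matches, but a mixture of two does.
target = [0.5, 0.0, 0.5, 0.5]

weights = mix_weights(sources, target)
```

Here the first and third sources receive nonzero weight and the second stays at zero, so the mixture reproduces a target that no single source could. Keeping such selections continuous over time is the harder problem the paragraph above refers to, and this sketch does not address it.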

A new mosaicing synthesizer is presented which incorporates all of these elements: supporting automatic control of sound transformations using models, mixtures supported by perceptually relevant harmony and timbre descriptors, and preservation of continuity of the sampling context and transformation parameters. Using listening tests, the proposed hybrid algorithm was compared against classic and contemporary algorithms, and the hybrid algorithm performed well on a variety of quality measures.

Sounds referenced in the text, mainly those used in the listening tests, are presented as a webpage of sound examples.