Martín Haro defends his PhD thesis on November 22nd

15.11.2013

 

22 Nov 2013

Martín Haro defends his PhD thesis, entitled "Statistical Distribution of Common Audio Features", on Friday, November 22nd, 2013, at 11:00h in room 55.309 of the Communication Campus of the UPF.

The defense jury consists of Josep Lluis Arcos (IIIA-CSIC), Emilia Gómez (UPF), and Jean-Julien Aucouturier (IRCAM).

Abstract: In the last few years, some Music Information Retrieval (MIR) researchers have spotted important drawbacks in applying standard successful-in-monophonic algorithms to polyphonic music classification and similarity assessment. Notably, these so-called “Bag-of-Frames” (BoF) algorithms share a common set of assumptions. These assumptions are substantiated in the belief that the numerical descriptions extracted from short-time audio excerpts (or frames) are enough to capture relevant information for the task at hand, that these frame-based audio descriptors are time independent, and that descriptor frames are well described by Gaussian statistics. Thus, if we want to improve current BoF algorithms we could: i) improve current audio descriptors, ii) include temporal information within algorithms working with polyphonic music, and iii) study and characterize the real statistical properties of these frame-based audio descriptors. From a literature review, we have detected that many works focus on the first two improvements but, surprisingly, there is a lack of research on the third one. Therefore, in this thesis we analyze and characterize the statistical distribution of common audio descriptors of timbre, tonal and loudness information. Contrary to what is usually assumed, our work shows that the studied descriptors follow heavy-tailed distributions and thus do not belong to a Gaussian universe. This new knowledge led us to propose new algorithms that show improvements over the BoF approach in current MIR tasks such as genre classification, instrument detection, and automatic tagging of music. Furthermore, we also address new MIR tasks such as measuring the temporal evolution of Western popular music. Finally, we highlight some promising paths for future audio-content MIR research that will inhabit a heavy-tailed universe.

 
