Audio-Visual Approaches for Music Information Retrieval
Contact: emilia [dot] gomez [at] upf [dot] edu, gloria [dot] haro [at] upf [dot] edu
Music is a highly multimodal concept: various types of heterogeneous information are associated with a music piece (audio, musicians’ gestures and facial expressions, lyrics, etc.). This has recently led researchers to approach music through its various facets, giving rise to multimodal methods for content-based semantic description of music material.
In this project, we investigate the complementarity of audio and image/video description algorithms for the automatic description and indexing of user-generated music performance videos. We address relevant music information research tasks, in particular musical instrument recognition, synchronization of audio and video streams, similarity, quality assessment, structural analysis and segmentation, and automatic video mashup generation. To do so, we develop strategies to build multimedia repositories and gather human annotations.
Research topics: music information retrieval, automatic classification, image processing, machine learning.
This research is related to the Maria de Maeztu Strategic Program on Data-Driven Knowledge Extraction (https://portal.upf.edu/web/mdm-dtic/home). Our project deals with large-scale multimedia data. You can watch a video presenting our project here. This line of research is a collaboration between faculty members from two groups: the Music Information Research Lab at the Music Technology Group (Emilia Gómez) and the Image Processing Group (Gloria Haro).
O. Slizovskaia, E. Gómez & G. Haro (2016). Automatic musical instrument recognition in audiovisual recordings by combining image and audio classification strategies. 13th Sound and Music Computing Conference (SMC 2016).
S. Essid and G. Richard, “Fusion of Multimodal Information in Music Content Analysis”, in M. Müller, M. Goto and M. Schedl (Eds.), “Multimodal Music Processing”, Dagstuhl Follow-Ups, volume 3, pp. 37-53, ISBN 978-3-939897-37-8, 2012.
M. Müller, M. Goto and M. Schedl (Eds.), “Multimodal Music Processing”, Dagstuhl Follow-Ups, volume 3, ISBN 978-3-939897-37-8, 2012.