Note:
This bibliographic page is archived and will no longer be updated.
For an up-to-date list of publications from the Music Technology Group, see the Publications list.
Improving Audio Retrieval through Content and Metadata Categorization
Title | Improving Audio Retrieval through Content and Metadata Categorization |
Publication Type | Master Thesis |
Year of Publication | 2015 |
Authors | Parekh, S. |
Abstract | Audio content sharing on online platforms has become increasingly popular. This necessitates the development of techniques to better organize and retrieve this data. In this thesis we look to improve audio retrieval through content and metadata categorization in the context of Freesound. For content, we focus on organization through morphological description. In particular, we propose a taxonomy and a thresholding-based classification approach for loudness profiles. The approach can be generalized to structure information about the temporal evolution of other sound attributes. To this end, we also discuss our preliminary findings from an extension of this methodology to pitch profiles. Metadata systematization, on the other hand, has been approached through a topic model known as Latent Dirichlet Allocation (LDA). Here, automatic clustering of tag information is performed to achieve a higher-level representation of each audio file in terms of 'topics'. We evaluate our approach for both tasks through several experiments conducted over two datasets. This thesis finds immediate application in online audio sharing platforms and opens up several interesting future research avenues. Specifically, evaluation indicates that our methods can be immediately applied to improve Freesound's similarity and context-based search. Moreover, we believe our work on content categorization makes it possible to include an advanced content-based search facility in Freesound. |