Design and Evaluation of a Visualization Interface for Querying Large Unstructured Sound Databases

Title: Design and Evaluation of a Visualization Interface for Querying Large Unstructured Sound Databases
Publication Type: Master Thesis
Year of Publication: 2010
Authors: Font, F.
Abstract

Search is an underestimated problem that plays a major role in any application dealing with large databases. The more extensive and heterogeneous the data, the harder it is to find exactly what we are looking for. This idea resembles the data availability paradox stated by Woods: "more and more data is available, but our ability to interpret what is available has not increased". The question then arises: is it really useful to collect a large dataset if we lack the ability to navigate it successfully?

According to Morville and Callender, search is a grand challenge that can be met with courage and vision. A good search tool greatly improves how well we can exploit our information resources. As a consequence, commonly used search methods must evolve. The goal of search is more than finding: search should become a conversation in which answers change the questions.

Given all this, it seems clear that extensive effort should be invested in the research and design of appropriate tools for finding our needles in the haystack. However, search does not have a general solution. It must be adapted to the context of the information at hand, which in the case of the present document is unstructured sound databases.

The aim of this thesis is the design of a visualization interface that lets users graphically define queries for the Freesound Project database (http://www.freesound.org) and retrieve results suitable for a musical context. Music Information Retrieval (MIR) techniques are used to analyze all the files in the database and automatically extract audio features concerning four different aspects of sound perception: temporal envelope, timbre, tonal information and pitch. Users perform queries by graphically specifying a target for each of these perceptual aspects. In other words, queries are specified by defining the physical properties of the sound itself rather than by indicating its source (as is usually done in common text-based search engines). A similarity search is performed across the whole database to find the most similar sound files, and the returned results are represented as points in a two-dimensional space that users can explore.
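
The query-by-target idea described above can be illustrated with a minimal sketch. The abstract does not state which features, normalization or distance measure the thesis uses, so everything here is an assumption for illustration: the feature vectors are toy values scaled to [0, 1], the names `ASPECTS`, `distance` and `most_similar` are hypothetical, and Euclidean distance stands in for whatever similarity measure the thesis actually applies.

```python
import math

# Hypothetical layout: one short feature vector per perceptual aspect.
ASPECTS = ("envelope", "timbre", "tonal", "pitch")


def distance(target, sound):
    """Sum of per-aspect Euclidean distances (an assumed metric).

    Only aspects present in the target contribute, so a user may
    specify a target for some aspects and leave the others open.
    """
    total = 0.0
    for aspect in ASPECTS:
        if aspect in target:
            total += math.dist(target[aspect], sound[aspect])
    return total


def most_similar(target, database, k=3):
    """Return the names of the k sounds closest to the target."""
    ranked = sorted(database, key=lambda name: distance(target, database[name]))
    return ranked[:k]


# Toy database with made-up, normalized feature values.
database = {
    "kick": {"envelope": [0.9, 0.1], "timbre": [0.2, 0.1], "tonal": [0.1], "pitch": [0.1]},
    "pad": {"envelope": [0.2, 0.9], "timbre": [0.6, 0.5], "tonal": [0.8], "pitch": [0.5]},
    "bell": {"envelope": [0.8, 0.4], "timbre": [0.9, 0.8], "tonal": [0.9], "pitch": [0.9]},
}

# A query that only constrains the temporal envelope and the pitch.
target = {"envelope": [0.85, 0.2], "pitch": [0.15]}
results = most_similar(target, database, k=2)
```

In this sketch an exhaustive scan ranks every sound in the database. For a collection the size of Freesound, an indexing structure for nearest-neighbour search would likely be needed, but the abstract does not say which approach is used.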

Final publication: https://doi.org/10.5281/zenodo.1173914