Automatic sound annotation

Cano, P.; Koppenberger, M.

Note: This bibliographic page is archived and will no longer be updated. For an up-to-date list of publications from the Music Technology Group, see the Publications list.

Title	Automatic sound annotation
Publication Type	Conference Paper
Year of Publication	2004
Authors	Cano, P., & Koppenberger, M.
Abstract	Sound engineers need to access vast collections of sound effects for their film and video productions. Sound effects providers rely on text-retrieval techniques to offer their collections. Currently, annotation of audio content is done manually, which is an arduous task. Automatic annotation methods, normally fine-tuned to restricted domains such as musical instruments or reduced sound effects taxonomies, are not mature enough to label any possible sound in great detail. A general sound recognition tool would require, first, a taxonomy that represents the world and, second, thousands of classifiers, each specialized in distinguishing small details. We report experimental results on a general sound annotator. To tackle the taxonomy definition problem we use WordNet, a semantic network that organizes real-world knowledge. To overcome the need for a huge number of classifiers to distinguish many different sound classes, we use a nearest-neighbor classifier with a database of isolated sounds unambiguously linked to WordNet concepts. A 30% concept prediction is achieved on a database of over 50,000 sounds and over 1,600 concepts.
preprint/postprint document	files/publications/mlsp2004-pcano.pdf
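
Illustrative sketch (not from the paper): the abstract describes annotating a sound by finding its nearest neighbor in a database of sounds linked to WordNet concepts. The Python below is a minimal sketch of that general idea only. The feature vectors, the Euclidean distance, the single-neighbor rule, and the names DATABASE, euclidean, and annotate are all assumptions made for illustration; the abstract does not specify the features or distance measure actually used.

```python
import math

# Hypothetical toy database: each entry pairs an (assumed) feature vector
# with the concept labels attached to that sound. Real entries would be
# linked to WordNet concepts; plain strings stand in for them here.
DATABASE = [
    ((0.9, 0.1, 0.2), {"dog", "bark"}),
    ((0.8, 0.2, 0.1), {"dog", "growl"}),
    ((0.1, 0.9, 0.7), {"door", "slam"}),
    ((0.2, 0.8, 0.9), {"door", "creak"}),
]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def annotate(features, database=DATABASE):
    """Return the concept labels of the single closest database sound."""
    _, concepts = min(database, key=lambda entry: euclidean(features, entry[0]))
    return concepts


if __name__ == "__main__":
    # Query vector for an unlabeled sound (made-up values).
    print(sorted(annotate((0.88, 0.12, 0.2))))
```

With this approach, supporting a new concept only requires adding labeled sounds to the database rather than training another specialized classifier, which is the motivation the abstract gives for using a nearest-neighbor classifier.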