| Abstract | In the recognition and classification of sounds, extracting perceptually and biologically relevant
features yields much better results than the standard low-level methods (e.g., zero-crossings, roll-off, centroid,
energy, etc.). Gamma-tone filters are biologically relevant, as they simulate the motion of the basilar membrane.
The representation techniques that we propose in this paper make use of the gamma-tone filters, combined with
the Hilbert transform or hair cell models, to represent everyday sounds. Different combinations of these methods
have been evaluated and compared in perceptual classification tasks that classify everyday sounds, such as doors and
footsteps, using support vector classification. After the features are calculated, a feature integration technique
is applied to reduce their high dimensionality. The everyday sounds are obtained from
the commercial sound database “Sound Ideas”. However, perceptual labels assigned by human listeners are
used rather than labels derived from the actual sound source. These perceptual classification tasks
classify the everyday sounds according to their function, for example labelling door sounds as
“opening” or “closing”. In addition to the gamma-tone-based representation techniques, other
spectral and psycho-acoustical representation techniques are evaluated. The experiments show that the
gamma-tone-based representation techniques are superior for perceptual classification tasks of everyday sounds.
The gamma-tone filters combined with an inner hair cell model and with the Hilbert transform yield the most
accurate results in classifying everyday sounds.
|
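
To make the gamma-tone plus Hilbert transform representation concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a fourth-order gamma-tone impulse response, the Glasberg and Moore ERB approximation with the commonly used bandwidth factor 1.019, and a plain mean of the envelope per channel as a stand-in for the feature integration step, whose details the abstract does not give. All function names are hypothetical, and the support vector classification stage is omitted.

```python
import cmath
import math


def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation, assumed here)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)


def gammatone_ir(fc, fs, duration=0.025, order=4):
    """Sampled, peak-normalised impulse response of a gamma-tone filter centred at fc Hz."""
    b = 1.019 * erb(fc)  # bandwidth parameter; 1.019 is a conventional choice, not from the paper
    n = int(duration * fs)
    ir = [(k / fs) ** (order - 1)
          * math.exp(-2.0 * math.pi * b * k / fs)
          * math.cos(2.0 * math.pi * fc * k / fs)
          for k in range(n)]
    peak = max(abs(v) for v in ir)
    return [v / peak for v in ir]


def convolve(x, h):
    """Direct FIR filtering of x with h, truncated to len(x) output samples."""
    return [sum(h[j] * x[i - j] for j in range(min(i + 1, len(h))))
            for i in range(len(x))]


def hilbert_envelope(x):
    """Magnitude of the analytic signal, using a naive O(N^2) DFT (fine for short frames only)."""
    n = len(x)
    spec = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0   # keep DC and, for even n, the Nyquist bin
        elif k < (n + 1) // 2:
            gain = 2.0   # double the positive frequencies
        else:
            gain = 0.0   # discard the negative frequencies
        spec[k] *= gain
    analytic = [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
                for t in range(n)]
    return [abs(v) for v in analytic]


def channel_features(signal, fs, centre_freqs):
    """One feature per channel: the mean Hilbert envelope of the gamma-tone filtered signal."""
    features = []
    for fc in centre_freqs:
        filtered = convolve(signal, gammatone_ir(fc, fs))
        envelope = hilbert_envelope(filtered)
        features.append(sum(envelope) / len(envelope))
    return features
```

Calling `channel_features(frame, fs, [500.0, 1000.0, 2000.0])` on a short frame returns one value per channel, which could then be passed to a support vector classifier. For real recordings, an FFT-based Hilbert transform and a recursive gamma-tone filter would be far faster than the naive loops used here for self-containment.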