Talk on factor analysis for audio classification tasks by Hamid Eghbal-zadeh
28.07.2016
1 Aug 2016
On Monday, 1st of August at 15:00h in room 55.410, there will be a talk by Hamid Eghbal-zadeh (Department of Computational Perception, Johannes Kepler University of Linz, Austria) on "A small footprint for audio and music classification".
Abstract: In many audio and music classification tasks, the aim is to provide a low-dimensional representation of audio excerpts with high discriminative power, to be used as an excerpt-level feature instead of the full audio feature sequence. One approach is to summarize the acoustic features into a statistical representation and use that for classification. A problem with many such statistical representations, such as adapted GMMs, is that they are very high-dimensional and also capture unwanted characteristics of the audio excerpts that do not represent their class. Using factor analysis, the dimensionality can be dramatically reduced and the unwanted factors discarded from the statistical representation. The state of the art in many speech-related tasks uses a specific form of factor analysis to extract a small footprint from speech recordings. This fixed-length, low-dimensional representation is known as the i-vector. I-vectors have recently been imported into MIR and have shown great promise. Recently, we won the Acoustic Scene Classification challenge (DCASE-2016) using i-vectors. We will also present our noise-robust music artist recognition system based on i-vector features at ISMIR-2016.
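The idea behind the i-vector, as described in the abstract, can be sketched with a toy example: an excerpt's high-dimensional GMM supervector (stacked component means) is modeled as a low-rank deviation from a universal background model, and the latent factor of that deviation is the compact representation. The sketch below is illustrative only, assuming hypothetical dimensions (64 components x 20 features) and a random total variability matrix `T`; in a real system, `T` is learned from data and the i-vector is the posterior mean under a probabilistic model, not a plain least-squares fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a GMM supervector for one audio excerpt,
# e.g. 64 Gaussian components x 20 MFCC dimensions = 1280 dims.
n_components, feat_dim = 64, 20
supervector_dim = n_components * feat_dim   # 1280
ivector_dim = 40                            # target low dimension

# Total variability matrix T (learned from data in practice; random here).
T = rng.standard_normal((supervector_dim, ivector_dim))

# Deviation of the excerpt's adapted supervector from the UBM supervector
# (random placeholder standing in for real adaptation statistics).
deviation = rng.standard_normal(supervector_dim)

# Crude point estimate of the latent factor w in:  deviation ~= T @ w.
# A real i-vector extractor computes the posterior mean of w instead.
w, *_ = np.linalg.lstsq(T, deviation, rcond=None)

print(w.shape)  # the compact excerpt-level representation, 40 dims
```

The 1280-dimensional supervector is thus compressed to a 40-dimensional excerpt-level feature, which can be fed to any standard classifier.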