What/when causal expectation modelling in monophonic pitched and percussive audio

Title: What/when causal expectation modelling in monophonic pitched and percussive audio
Publication Type: Conference Paper
Year of Publication: 2007
Conference Name: NIPS-Workshop Music, Brain, & Cognition
Authors: Hazan, A., Brossier, P., Marxer, R., & Purwins, H.
Abstract: A causal system for representing a musical stream and generating further expected events is presented. Starting from an auditory front-end that extracts low-level features (e.g., spectral shape, MFCCs, pitch) and mid-level features such as onsets and beats, an unsupervised clustering process builds and maintains a set of symbols aimed at representing musical stream events using both timbre and time descriptions.
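The abstract does not specify which clustering algorithm is used; as a minimal sketch, an online nearest-centroid scheme that opens a new cluster whenever no existing centroid is close enough could map incoming feature vectors to symbols in a causal, single-pass way. The function name `assign_symbol` and the distance `threshold` below are illustrative assumptions, not the paper's method.

```python
import math

def assign_symbol(feature, centroids, threshold=1.0):
    """Map a feature vector to a cluster symbol (its index in
    `centroids`), adding a new cluster when no existing centroid
    lies within `threshold`. Illustrative stand-in for the
    unsupervised clustering described in the abstract."""
    if centroids:
        dists = [math.dist(feature, c) for c in centroids]
        best = min(range(len(dists)), key=dists.__getitem__)
        if dists[best] < threshold:
            # drift the matched centroid toward the new observation
            centroids[best] = [0.5 * (a + b)
                               for a, b in zip(centroids[best], feature)]
            return best
    centroids.append(list(feature))
    return len(centroids) - 1
```

Each audio event is thus reduced to a symbol index, which the expectation module can then treat as a token in a discrete sequence.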

The time events are represented using inter-onset intervals relative to the beats. These symbols are then processed by an expectation module based on Prediction by Partial Matching (PPM), a multiscale technique based on n-grams. To characterise the system's capacity to generate expectations that match its transcription, we use a weighted average F-measure that takes into account the uncertainty associated with the unsupervised encoding of the musical sequence. The potential of the system is demonstrated on audio streams containing drum loops or monophonic singing voice.
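The expectation module is described only at this level of detail. As an illustration of the underlying mechanism, a toy variable-order n-gram predictor that backs off to shorter contexts when a context is unseen (the core idea behind PPM) might look like the following; the class `NGramPredictor` and its parameters are hypothetical sketches, not the paper's implementation.

```python
from collections import Counter, defaultdict

class NGramPredictor:
    """Toy variable-order n-gram model with back-off to shorter
    contexts, in the spirit of Prediction by Partial Matching."""

    def __init__(self, max_order=3):
        self.max_order = max_order
        # context tuple -> counts of the symbol that followed it
        self.counts = defaultdict(Counter)

    def update(self, sequence):
        """Accumulate counts for every context of order 0..max_order."""
        for i, sym in enumerate(sequence):
            for order in range(min(i, self.max_order) + 1):
                ctx = tuple(sequence[i - order:i])
                self.counts[ctx][sym] += 1

    def predict(self, history):
        """Distribution over next symbols, backing off from the longest
        matching context down to the empty context."""
        for order in range(min(self.max_order, len(history)), -1, -1):
            ctx = tuple(history[len(history) - order:])
            seen = self.counts.get(ctx)
            if seen:
                total = sum(seen.values())
                return {s: n / total for s, n in seen.items()}
        return {}
```

For example, after exposure to the sequence `a b a b a b`, the model predicts `b` with probability 1 given the context `a`, while an unseen context such as `z` falls back to the unconditional symbol frequencies.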

In preliminary experiments, we show that the induced representation is useful for generating expectation patterns in a causal way. During exposure, we observe a globally decreasing prediction entropy combined with structure-specific variations.
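Prediction entropy here can be read as the Shannon entropy of the model's next-event distribution, which shrinks as the predictions become more confident during exposure. A minimal sketch (the function name is ours, not the paper's):

```python
import math

def prediction_entropy(dist):
    """Shannon entropy (in bits) of a next-event probability
    distribution given as a {symbol: probability} mapping."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)
```

A uniform two-symbol prediction yields 1 bit, while a fully confident prediction yields 0 bits, consistent with the globally decreasing entropy reported above.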

Preprint/postprint document: files/publications/79ee88-NIPS-2007-ahazan.pdf