News and Events

Possibility of 3-year postdocs @MTG for researchers from outside Spain

The Catalan government is opening a call for postdoctoral researchers to join Catalan universities: the Beatriu de Pinós programme.


Requirements:

  • PhD awarded between 01/01/2009 and 31/12/2014 (or even later in some cases)
  • A minimum of 2 years of postdoctoral experience outside Spain
  • Not having lived in Spain for more than 12 months in the last 3 years


Conditions:

  • 2-year duration, extendable by 1 more year; starting before January 1st 2018
  • 32,800 EUR / year + 6,000 EUR to support research

Deadline: 01/12/2016

More info here.

21 Nov 2016 - 13:33 | view
Master's theses from the SMC Master 2015-2016
15 Nov 2016 - 11:02 | view
Talks by Dr. Eita Nakamura and Dr. Shinji Sako
15 Nov 2016

Dr. Eita Nakamura (Kyoto University, Japan) and Dr. Shinji Sako (Nagoya Institute of Technology, Japan)
will be giving two talks:


"Rhythm Transcription of Piano Performances Based on Hierarchical Bayesian Modelling of Repetition and Modification of Musical Note Patterns" by Dr. Eita Nakamura, Kyoto University, Japan. (15th Nov, 17:00h, Room 52.321)

We present a method of rhythm transcription (i.e., automatic recognition of note values in music performance signals) based on a Bayesian music language model that describes the repetitive structure of musical notes. Conventionally, music language models for music transcription are trained with a dataset of musical pieces. Because typical musical pieces have repetitions consisting of a limited number of note patterns, better models fitting individual pieces could be obtained by inducing compact grammars. The main challenges are inducing appropriate grammar for a score that is observed indirectly through a performance and capturing incomplete repetitions, which can be represented as repetitions with modifications. We propose a hierarchical Bayesian model in which the generation of a language model is described with a Dirichlet process and the production of musical notes is described with a hierarchical hidden Markov model (HMM) that incorporates the process of modifying note patterns. We derive an efficient algorithm based on Gibbs sampling for simultaneously inferring from a performance signal the score and the individual language model behind it. Evaluations showed that the proposed model outperformed previously studied HMM-based models.
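The full hierarchical model is beyond a short example, but the basic idea of decoding note values from noisy performed timing can be sketched with a toy HMM and Viterbi decoding (the note values, probabilities and tempo below are made-up illustrations, not taken from the paper):

```python
import numpy as np

# Toy rhythm quantization: hidden states are note values (in beats),
# observations are noisy inter-onset intervals (IOIs) in seconds.
# Illustrative HMM/Viterbi sketch, not the paper's model.
note_values = np.array([0.5, 1.0, 2.0])      # eighth, quarter, half note
trans = np.array([[0.6, 0.3, 0.1],           # repetition-biased transitions
                  [0.3, 0.5, 0.2],
                  [0.1, 0.3, 0.6]])
tempo = 0.5                                   # seconds per beat (assumed known)

def emission(ioi):
    """Gaussian likelihood of an observed IOI under each note value."""
    mu = note_values * tempo
    return np.exp(-0.5 * ((ioi - mu) / 0.05) ** 2)

def viterbi(iois):
    n, k = len(iois), len(note_values)
    delta = np.zeros((n, k))                  # best log-probability per state
    psi = np.zeros((n, k), dtype=int)         # backpointers
    delta[0] = np.log(emission(iois[0]) + 1e-12)
    for t in range(1, n):
        scores = delta[t - 1][:, None] + np.log(trans)
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + np.log(emission(iois[t]) + 1e-12)
    # Backtrack from the most likely final state.
    path = [int(delta[-1].argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return [float(note_values[i]) for i in reversed(path)]

# A performance of quarter, quarter, eighth, eighth, half with timing noise:
print(viterbi([0.52, 0.48, 0.26, 0.24, 1.03]))  # → [1.0, 1.0, 0.5, 0.5, 2.0]
```

In the talk's approach, the language model (here a fixed transition matrix) is itself inferred per piece, capturing repeated and modified note patterns.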


"Real-time audio-to-score following and its applications" by Dr. Shinji Sako (and his students). Nagoya Institute of Technology, Japan. (15th Nov, 17:45 h. Room 52.321)

We present a robust online algorithm for real-time audio-to-score following based on a delayed-decision and anticipation framework. We employ Segmental Conditional Random Fields and a Linear Dynamical System to model musical performance by humans. The combination of these models allows efficient iterative decoding of score position and tempo. The combined advantages of our approach are the delayed-decision Viterbi algorithm, which uses future information to determine past score positions with high reliability, thus improving alignment accuracy, and the fact that future positions can be anticipated using an adaptively estimated tempo. We will also discuss interim progress of the research and some applications built on this framework.
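The talk's system combines Segmental Conditional Random Fields with a Linear Dynamical System; as a much simpler stand-in, the dynamic-programming core of audio-to-score alignment can be sketched with classic DTW over one-dimensional features (purely illustrative; a real-time follower decodes incrementally and models tempo explicitly):

```python
import numpy as np

def dtw_align(score, perf):
    """Classic DTW between a score feature sequence and a performance
    feature sequence; returns the warping path as (score_idx, perf_idx)
    pairs. Features here are plain numbers (e.g. MIDI pitches)."""
    n, m = len(score), len(perf)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(score[i - 1] - perf[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1],
                                 cost[i - 1, j - 1])
    # Backtrack from the end to recover the alignment path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

# Score pitches vs. a performance where the second note is held twice as long:
print(dtw_align([60, 62, 64], [60, 62, 62, 64]))
# → [(0, 0), (1, 1), (1, 2), (2, 3)]
```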

15 Nov 2016 - 10:09 | view
Seminar on music knowledge extraction using machine learning
4 Dec 2016

Taking advantage of the researchers coming to Barcelona for the NIPS conference, on December 4th we are organizing a small and informal seminar to discuss various topics related to machine learning applied to music, with special emphasis on its knowledge extraction aspects.

Full program:

10 Nov 2016 - 19:02 | view
Special ACM-TIST issue on Intelligent Music Systems

A recently published special issue of the ACM Transactions on Intelligent Systems and Technology presents some of the recent research on "Intelligent Music Systems", a research area that overlaps with some of the goals of the GiantSteps EU project. The special issue has been co-edited by Markus Schedl (Johannes Kepler University, Linz, Austria), Yi-Hsuan Yang (Academia Sinica, Taipei, Taiwan), and Perfecto Herrera-Boyer (Universitat Pompeu Fabra & ESMUC, Barcelona, Spain).

It includes, in addition to five technical papers covering topics such as tagging, recommendation, segmentation, music video analysis and score alignment, two guest articles that reflect on the open issues and neglected topics that could help improve the "intelligence" of future systems for music analysis and creation.

It also includes three papers co-authored by MTG members:

Schedl, M., Yang, Y., & Herrera, P. (2016). Introduction to Intelligent Music Systems and Applications.

Oramas, S., Ostuni, V. C., Di Noia, T., Serra, X., & Di Sciascio, E. (2016). Sound and Music Recommendation with Knowledge Graphs.

Rodriguez-Serrano, F., Carabias-Orti, J., Vera-Candeas, P., & Martinez-Muñoz, D. (2016). Tempo Driven Audio-to-Score Alignment Using Spectral Decomposition and Online Dynamic Time Warping.



8 Nov 2016 - 11:15 | view
Marie Curie PhD Fellowships for the MTG

Our Department is a hosting institution within the INPhINIT “la Caixa” Fellowships Programme (57 grants), and there are 4 proposals supervised by MTG researchers (see details below).


Fellowship conditions:

  • 3-year contract
  • €34,800 gross annual salary + €3,564 annual additional funding
  • Award of €7,500 for the PhD fellow if the thesis is presented within 3.5 years
  • Additional training in transversal skills: technology transfer, entrepreneurship, professional development
  • Research stays in academia and industry
  • Participation in networking and outreach activities
  • Deadline for incorporation of candidates: September/October 2017


Fellowship eligibility:

  • Be in the first four years (full-time equivalent research experience) of their research careers and not yet have been awarded a doctoral degree.
  • Not have resided or carried out their main activity (work, studies, etc.) in Spain for more than 12 months in the 3 years immediately prior to the recruitment date. Short stays such as holidays will not be taken into account.
  • Have a demonstrable level of English (B2 or higher).

PhD program admission:

  • Accredited undergraduate degree (Bachelor degree or recognised equivalent degree from an accredited Higher Education Institution).
  • Accredited graduate/master's degree (equivalent to a Spanish Máster Universitario/Oficial, Master of Research...) which enables them to access a PhD programme in their home country.
  • A total of 300 ECTS credits, of which 60 must correspond to an official, research-oriented graduate master's programme.

Selection criteria


  • Academic records and CV (50%).
  • Motivation and goals declaration (30%): originality, innovation, impact and link with the selected Research Centre.
  • Recommendation letters (20%).


  • Potential (40%), motivation and impact (20%), CV (30%).

Important dates

  • Website open for applications: November 7th, 2016.
  • Application deadline: February 2nd, 2017.

How to apply
Applications are managed through the program website. More information about the procedure:

For additional information on the proposed topics please contact the PI!

7 Nov 2016 - 10:27 | view
CompMusic Seminar
18 Nov 2016
On November 18th 2016, Friday, from 10h to 18:30h in room 55.410 of the Communication Campus of the Universitat Pompeu Fabra in Barcelona, we will have a CompMusic seminar. This seminar accompanies the PhD thesis defenses of Ajay Srinivasamurthy and Sankalp Gulati, carried out in the context of the CompMusic project.
10:00 Simon Dixon (QMUL, London)
"Music Similarity and Cover Song Identification: The Case of Jazz"
Similarity in music is an elusive and subjective concept, yet computational models of similarity are cited as important for addressing tasks such as music recommendation and the management of music collections. Cover song (or version) identification deals with a specific case of music similarity, where the underlying musical work is the same, but its realisation is different in each version, usually involving different performers and differing arrangements of the music, which may vary in instrumentation, form, tempo, key, lyrics or in other aspects of rhythm, melody, harmony and timbre. The new version retains some features of the original recording, and it is usually assumed that the sequential pitch content (corresponding to melody and harmony) is preserved with limited alterations from the original version.
In music information retrieval, a standard approach to version identification uses predominant melody extraction to represent melodic content and chroma features to represent harmonic content. These features are adapted to allow for variation in key or tempo between versions, and a sequence matching algorithm computes the pairwise similarity between tracks, which can be used to estimate groups of cover songs. Different versions of a jazz standard can be regarded as a set of cover songs, but the identification of such covers is more complicated than for many other styles of music, due to the improvisatory nature of jazz, which allows ornamentation and transformation of the melody as well as substitution of chords in the harmony. We report on experiments on a set of 300 jazz standards using discrete-valued and continuous-valued measures of pairwise predictability between sequences, based on work with a former PhD student, Peter Foster.
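As a minimal sketch of the key-invariance step mentioned above, two chroma sequences can be compared under all 12 circular transpositions, keeping the best score (frame-wise cosine similarity between equal-length sequences stands in for the sequence matching a real system would use):

```python
import numpy as np

def transposition_invariant_sim(a, b):
    """Compare two chroma sequences (frames x 12) under all 12 key
    transpositions and keep the best score. Mean frame-wise cosine
    similarity is used here for simplicity; illustrative only."""
    best = -np.inf
    for shift in range(12):
        rolled = np.roll(b, shift, axis=1)   # transpose b by `shift` semitones
        num = (a * rolled).sum(axis=1)
        den = np.linalg.norm(a, axis=1) * np.linalg.norm(rolled, axis=1)
        best = max(best, float((num / den).mean()))
    return best

# A chroma sequence and the same sequence transposed up 3 semitones
# should match perfectly once the shift is undone:
rng = np.random.default_rng(0)
a = rng.random((8, 12))
b = np.roll(a, 3, axis=1)
print(round(transposition_invariant_sim(a, b), 6))   # → 1.0
```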
11:00 Geoffroy Peeters (IRCAM, Paris)
"Recent research at IRCAM related to the recognition of rhythm, vocal imitations and music structure"
In this talk, I will present some recent research at IRCAM related to: the description of rhythm (especially the use of the Fourier-Mellin transform and of the Modulation Scale Transform with auditory statistics); the recognition of vocal imitations (using HMM decoding of SI-PLCA kernels over time); and the estimation of musical structure (using Convolutional Neural Networks).
12:00 Coffee break
12:30 Andre Holzapfel (KTH, Stockholm)
"Tracking time: State-of-the-art and open problems in meter inference"
In recent years, significant progress has been made in algorithmic approaches that aim at the recognition of metrical cycles and the tracking of their structure in music audio signals. Automatic adaptation to rhythmic patterns has made it possible to go beyond manually tailored tracking approaches, and deep-learning-based features increase the accuracy of the inference for an unknown audio signal. In principle, arbitrary time signatures can be recognized and tracked from a music recording, assuming the existence of a large enough representative dataset to learn from. In this talk a short summary of the state of the art will be provided, and open problems that represent potential subjects of future studies will be presented. These open problems comprise the tracking of metrical cycles of very long duration, the inclusion of modalities beyond the acoustic signal, and a variety of subjects that arise within areas like performance studies, music theory, and ethnomusicology.
13:30 Lunch break
15:00 Barış Bozkurt (Koç University, Istanbul)
"Melodic analysis for Turkish makam music"
A makam generally implies a miscellany of rules for melodic composition: a design for melodic contour as a sequence of melodies (from specific categories) emphasising specific tones. This talk will start by presenting melody concepts in Turkish makam music and then continue by discussing the methods, uses and automation of melodic analysis for that music tradition. A study on culture-specific automatic melodic segmentation (of scores) will be presented. Finally, we will discuss future perspectives for melodic analysis within the context of corpus-based study of makams.
16:00 Juan Pablo Bello (NYU, New York)
"Some Thoughts on the How, What and Why of Music Informatics Research"
The framework of music informatics research (MIR) can be thought of as a closed loop of data collection, algorithmic development and benchmarking. Much of what we do is heavily focused on the algorithmic aspects, or how to optimally combine various techniques from e.g., signal processing, data mining, and machine learning, to solve a variety of problems, from auto-tagging to automatic transcription, that captivate the interest of our community. We are very good at this, and in this talk I will describe some of the know-how that we have collectively accumulated over the years. On the other hand, I would argue that we are less proficient at clearly defining the “what” and “why” behind our work, that data collection and benchmarking have received far less attention and are often treated as afterthoughts, and that we sometimes tend to rely on widespread and limiting assumptions about music that affect the validity and usability of our research. On this, we can learn from other fields, such as music cognition, particularly with regards to the adoption of methods and practices that fully embrace the complexity and variability of human responses to music, while still clearly delineating the scope of the solutions or analyses being proposed.
17:00 Coffee break
17:30 Joan Serrà (Telefónica R+D, Barcelona)
"Facts and myths about deep learning"
Deep learning has revolutionized the traditional machine learning pipeline, with impressive results in domains such as computer vision, speech analysis, or natural language processing. The concept has gone beyond research/application environments, and permeated into the mass media, news blogs, job offers, startup investors, or big company executives' meetings. But what is behind deep learning? Why has it become so mainstream? What can we expect from it? In this talk, I will highlight a number of facts and myths that will provide a shallow answer to the previous questions. While doing that, I will also highlight a number of applications we have worked on at our lab. Overall, the talk wants to place a series of basic concepts, while giving ground for reflection or discussion on the topic.


18 Oct 2016 - 10:07 | view
Ajay Srinivasamurthy and Sankalp Gulati defend their PhD thesis
17 Nov 2016

Thursday, November 17th 2016 at 15:00h in room 55.309 (Tanger Building, UPF Communication Campus)

Ajay Srinivasamurthy: “A Data-driven Bayesian Approach to Automatic Rhythm Analysis of Indian Art Music”
Thesis director: Xavier Serra
Thesis Committee: Simon Dixon (QMUL), Geoffroy Peeters (IRCAM) and Juan Pablo Bello (NYU)
[Full thesis document and accompanying materials]

Abstract: Large and growing collections of a wide variety of music are now available on demand to music listeners, necessitating novel ways of automatically structuring these collections using different dimensions of music. Rhythm is one of the basic music dimensions and its automatic analysis, which aims to extract musically meaningful rhythm related information from music, is a core task in Music Information Research (MIR).
  Musical rhythm, similar to most musical dimensions, is culture-specific and hence its analysis requires culture-aware approaches. Indian art music is one of the major music traditions of the world and has complexities in rhythm that have not been addressed by the current state of the art in MIR, motivating us to choose it as the primary music tradition for study. Our intent is to address unexplored rhythm analysis problems in Indian art music to push the boundaries of the current MIR approaches by making them culture-aware and generalizable to other music traditions.
  The thesis aims to build data-driven signal processing and machine learning approaches for automatic analysis, description and discovery of rhythmic structures and patterns in audio music collections of Indian art music. After identifying challenges and opportunities, we present several relevant research tasks that open up the field of automatic rhythm analysis of Indian art music. Data-driven approaches require well curated data corpora for research and efforts towards creating such corpora and datasets are documented in detail. We then focus on the topics of meter analysis and percussion pattern discovery in Indian art music.
  Meter analysis aims to align several hierarchical metrical events with an audio recording. Meter analysis tasks such as meter inference, meter tracking and informed meter tracking are formulated for Indian art music. Different Bayesian models that can explicitly incorporate higher-level metrical structure information are evaluated for these tasks, and novel extensions are proposed. The proposed methods overcome the limitations of existing approaches, and their performance indicates the effectiveness of informed meter analysis.
  Percussion in Indian art music uses onomatopoeic oral mnemonic syllables for the transmission of repertoire and technique, providing a language for percussion. We use these percussion syllables to define, represent and discover percussion patterns in audio recordings of percussion solos. We approach the problem of percussion pattern discovery using hidden Markov model based automatic transcription followed by an approximate string search using a data derived percussion pattern library. Preliminary experiments on Beijing opera percussion patterns, and on both tabla and mridangam solo recordings in Indian art music demonstrate the utility of percussion syllables, identifying further challenges to building practical discovery systems.
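The approximate string search over transcribed syllables can be illustrated with a small sketch (the syllable stream, pattern and edit-distance threshold below are hypothetical, not taken from the thesis):

```python
def edit_distance(a, b):
    """Levenshtein distance between two syllable sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]

def find_pattern(stream, pattern, max_dist=1):
    """Report start positions where `pattern` occurs in the transcribed
    syllable stream within `max_dist` edits (fixed-length windows only;
    a real search would also allow windows of varying length)."""
    n = len(pattern)
    return [i for i in range(len(stream) - n + 1)
            if edit_distance(stream[i:i + n], pattern) <= max_dist]

# Hypothetical tabla-like syllable stream with two exact occurrences and
# one slightly modified occurrence of the pattern:
stream = "dha ge na ti dha tit dha ge na ti na ge na ti".split()
print(find_pattern(stream, "dha ge na ti".split()))  # → [0, 6, 10]
```

The tolerance to a small number of edits is what lets the search recover modified repetitions of a pattern, not just verbatim ones.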
  The technologies resulting from the research in the thesis are a part of the complete set of tools being developed within the CompMusic project for a better understanding and organization of Indian art music, aimed at providing an enriched experience with listening and discovery of music. The data and tools should also be relevant for data-driven musicological studies and other MIR tasks that can benefit from automatic rhythm analysis.

Thursday, November 17th 2016 at 17:00h in room 55.309 (Tanger Building, UPF Communication Campus)

Sankalp Gulati: “Computational Approaches for Melodic Description in Indian Art Music Corpora”
Thesis director: Xavier Serra
Thesis Committee: Juan Pablo Bello (NYU), Emilia Gómez (UPF) and Barış Bozkurt (Koç Univ.)
[Full thesis document and accompanying materials]

Abstract: Automatically describing contents of recorded music is crucial for interacting with large volumes of audio recordings, and for developing novel tools to facilitate music pedagogy. Melody is a fundamental facet in most music traditions and, therefore, is an indispensable component in such description. In this thesis, we develop computational approaches for analyzing high-level melodic aspects of music performances in Indian art music (IAM), with which we can describe and interlink large amounts of audio recordings. With its complex melodic framework and well-grounded theory, the description of IAM melody beyond pitch contours offers a very interesting and challenging research topic. We analyze melodies within their tonal context, identify melodic patterns, compare them both within and across music pieces, and finally, characterize the specific melodic context of IAM, the ragas. All these analyses are done using data-driven methodologies on sizable curated music corpora. Our work paves the way for addressing several interesting research problems in the field of music information research, as well as developing novel applications in the context of music discovery and music pedagogy.
  The thesis starts by compiling and structuring the largest music corpora to date of the two IAM traditions, Hindustani and Carnatic music, comprising quality audio recordings and the associated metadata. From them we extract the predominant pitch and normalize it by the tonic context. An important element in describing melodies is the identification of meaningful temporal units, for which we propose to detect occurrences of nyas svaras in Hindustani music, a landmark that demarcates musically salient melodic patterns.
  Utilizing these melodic features, we extract musically relevant recurring melodic patterns. These patterns are the building blocks of melodic structures in both improvisation and composition. Thus, they are fundamental to the description of audio collections in IAM. We propose an unsupervised approach that employs time-series analysis tools to discover melodic patterns in sizable music collections. We first carry out an in-depth supervised analysis of melodic similarity, which is a critical component in pattern discovery. We then improve upon the best competing approach by exploiting peculiar melodic characteristics of IAM. To identify musically meaningful patterns, we exploit the relationships between the discovered patterns by performing a network analysis. Extensive listening tests by professional musicians reveal that the discovered melodic patterns are musically interesting and significant.
  Finally, we utilize our results for recognizing ragas in recorded performances of IAM. We propose two novel approaches that jointly capture the tonal and the temporal aspects of melody. Our first approach uses melodic patterns, the most prominent cues for raga identification by humans. We utilize the discovered melodic patterns and employ topic modeling techniques, wherein we regard a raga rendition similar to a textual description of a topic. In our second approach, we propose the time delayed melodic surface, a novel feature based on delay coordinates that captures the melodic outline of a raga. With these approaches we demonstrate unprecedented accuracies in raga recognition on the largest datasets ever used for this task. Although our approach is guided by the characteristics of melodies in IAM and the task at hand, we believe our methodology can be easily extended to other melody dominant music traditions.
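The delay-coordinate idea behind the time delayed melodic surface can be sketched as follows: each pitch value is paired with the value a fixed delay later, and a 2-D histogram of those pairs summarises the melodic outline (the bin count, range and contour below are illustrative; the actual TDMS feature involves further processing):

```python
import numpy as np

def delay_coordinates(pitch, tau):
    """Pair each pitch value with the value `tau` frames later,
    giving 2-D delay coordinates of the melodic contour."""
    p = np.asarray(pitch)
    return np.stack([p[:-tau], p[tau:]], axis=1)

def surface(pitch, tau, bins=12):
    """Normalised 2-D histogram over the delay coordinates; a rough
    stand-in for a surface-style melodic feature."""
    pairs = delay_coordinates(pitch, tau)
    hist, _, _ = np.histogram2d(pairs[:, 0], pairs[:, 1],
                                bins=bins, range=[[0, 1200], [0, 1200]])
    return hist / hist.sum()

# A toy pitch contour in cents relative to the tonic:
contour = [0, 0, 200, 200, 500, 500, 200, 0]
print(delay_coordinates(contour, 2))
```

Because the histogram captures which pitches follow which at a fixed delay, it encodes both the tonal material and some of its temporal ordering in a single fixed-size feature.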
  Overall, we have built novel computational methods for analyzing several melodic aspects of recorded performances in IAM, with which we describe and interlink large amounts of music recordings. In this process we have developed several tools and compiled data that can be used for a number of computational studies in IAM, specifically in the characterization of ragas, compositions and artists. The technologies resulting from this research work are part of several applications developed within the CompMusic project for better description, an enhanced listening experience, and pedagogy in IAM.

17 Oct 2016 - 12:00 | view
Seminar by Dr. Tetsuro Kitahara
11 Oct 2016
Dr. Tetsuro Kitahara from Nihon University will be giving a talk on "From Instrument Recognition to Support of Amateurs' Music Creation" on Tuesday 11 Oct 17:00 in room 52.105.
11 Oct 2016 - 11:29 | view
New students in the SMC Master

In this new academic year 2016-2017, twenty new students have joined the Master in Sound and Music Computing.

Helena Cuesta Mussarra (Spain), Manuel Florencio Olmedo (Spain), Jimmy Jarjoura (Lebanon), Simon Kilmister (UK), Kushagra Sharma (India), Tessy Anne Vera Troes (Luxemburg), Pablo Alonso Jiménez (Spain), Daniel Balcells Eichenberger (Spain), Natalia Delgado Galán (Spain), Manaswi Mishra (India), Nestor Napoles Lopez (Mexico), Minz Sanghee Won (South Korea), Vibhor Bajpai (India), Siddharth Bhardwaj (India), Vsevolod Eremenko (Russia),  Gerard Erruz Lopez (Spain), Joseph Munday (UK), Germán Ruiz Marcos (Spain), Marc Siquier Peñafort (Spain), Meghana Sudhindra (India).

29 Sep 2016 - 15:36 | view