News and Events

Seminar by Diemo Schwarz on tangible and embodied interaction
4 Feb 2016

Diemo Schwarz, from IRCAM, will give a talk on "Tangible and Embodied Interaction on Surfaces, with Mobile Phones, and with Tapioca" on Thursday February 4th 2016 at 3:30pm in room 52.123 of the Communication Campus of the UPF.

Abstract: This talk will present current work of the Sound, Music, Movement Interaction team (ISMM) at Ircam about various ways to enhance interaction with digital sound synthesis or processing through the use of everyday objects and materials. The interaction is tangible and embodied and makes use of a variety of everyday gestures up to expert gestures. We will focus on three examples:

  • First, we show how the timbral potential of intuitive and expressive contact interaction on arbitrary surfaces can be enhanced through latency-free convolution of the microphone signal with grains from a sound corpus.
  • Second, using inertial sensors as found in commodity smartphones to produce sound depending on the devices' motion makes it possible to leverage the creative potential and group interaction of the general public, as shown in the CoSiMa project (Collaborative Situated Media) based on the Web Audio API.
  • Third, with granular or liquid interaction material as in the DIRTI tangible interfaces, we forego the dogma of repeatability in favor of a richer and more complex experience, creating music with expressive gestures, molding sonic landscapes by plowing through tapioca beads.
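
As a rough illustration of the first example above, convolving an input signal with a short grain can be sketched in a few lines. This is a minimal offline, direct-form sketch under stated assumptions: the function name `convolve` and the toy sample values are hypothetical, and it is not the latency-free method presented in the talk (real-time systems typically rely on block-based or FFT-based schemes instead).

```python
def convolve(signal, grain):
    """Direct-form discrete convolution of two sample sequences.

    Hypothetical helper for illustration only: it processes the whole
    signal offline and does not address latency at all.
    """
    if not signal or not grain:
        return []
    out = [0.0] * (len(signal) + len(grain) - 1)
    for i, s in enumerate(signal):
        for j, g in enumerate(grain):
            # Each input sample triggers a scaled copy of the grain.
            out[i + j] += s * g
    return out


if __name__ == "__main__":
    # Toy data: a single "tap" on the surface followed by silence.
    mic = [1.0, 0.0, 0.0]
    grain = [0.5, 0.25]
    print(convolve(mic, grain))
```

With an impulse-like input, the output is simply the grain itself, which is why a tap or scratch on a surface takes on the timbre of the chosen grain.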
1 Feb 2016 - 22:12
Álvaro Sarasúa at Dutch National TV: Orchestral music conducting
21 Jan 2016

Yesterday, "De Kennis van Nu", a science show on Dutch national TV, aired a program about orchestral conducting.

Álvaro Sarasúa, PhD student both at the Escola Superior de Música de Catalunya and the MIRlab of the MTG, and involved in the PHENICX project, explained a bit about real-time motion capture feature extraction and estimation of articulation from body motion. Our PHENICX colleague Cynthia Liem (TUDelft) also explained some research work on the comparison of different performance styles.

You can watch the whole show here (they appear from minute 13:50).



22 Jan 2016 - 13:39
Six PhD thesis defenses in a period of two weeks at the MTG

Six PhD theses carried out at the MTG will be defended within a period of two weeks. They are:

Friday, January 22nd 2016 at 15:30h in room 55.410 (Tanger Building, UPF Communication Campus)
Josep M Comajuncosas: “Assessing creativity in computer music ensembles: a computational approach”
Thesis director: Sergi Jordà; Defense Jury: Xavier Serra (UPF), Enric Guaus (ESMUC) and Atau Tanaka (Goldsmiths, Univ. of London)

Over the last decade Laptop Orchestras and Mobile Ensembles have proliferated. As a result, a large body of research has arisen on infrastructure, evaluation, design principles and compositional methodologies for Computer Music Ensembles (CME).
However, little has been addressed and very little is known about the challenges and opportunities that CMEs offer for creativity in musical performance. Therefore, one of the most common issues CMEs have to deal with is the lack of a systematic approach to handle the implications of the performative paradigms they seek to explore, in terms of their creative constraints and affordances. This is the challenge this thesis addresses. To do so, it first seeks common ground among the strategies developed for assessing creativity in different performative setups, and then proposes an informed pathway for performative engagement in CMEs.
Our research combines an exploratory stage and an experimental stage. The exploratory stage was informed by our artistic praxis with our own CME, the Barcelona Laptop Orchestra. Through the study of the multi-user instruments developed over the past years, we identified the creative constraints and affordances provided by different performative paradigms. Informed by the findings provided by our artistic research, the experimental stage addressed the study of musical creativity through performance analysis on specifically designed multi-user instruments. For this purpose we proposed a novel computational methodology to evaluate the creative content of a musical performance.
Two experiments were conducted to incorporate our computational methodology into ecologically valid scenarios, aimed at a better understanding of the relationship between topologies of interdependence and creative outcome. For both experiments, we captured performance data from ensemble improvisations, from which the creativity metrics were then computed. As a preliminary step, we investigated the performative engagement and sharing of musical ideations in an ensemble scenario. In a further step, we computed the creativity attributes to comparatively evaluate performances under different scenarios.
The findings provided quantitative evidence of the differences between musical creativity in individual, ensemble and interdependent scenarios. Additionally, the findings point out what strategies performers adopt to best keep their own musical voice in interdependence scenarios, and what novel creative behaviors may be promoted through new topologies of interdependence. Our findings shed light on the nature of performers’ creative behavior with interdependent multi-user instruments, and show that the introduced methodology can have applications in the broader context of analysis of creativity in musical performance.

Friday, January 29th 2016, at 11h in room 55.309 (Tanger Building, UPF Communication Campus)
Martí Umbert: “Expression Control of Singing Voice Synthesis: Modeling Pitch and Dynamics with Unit Selection and Statistical Approaches”
Thesis directors: Jordi Bonada and Xavier Serra; Defense Jury: Rafael Ramírez (UPF), Josep Lluís Arcos (CSIC-IIIA) and Roberto Bresin (KTH).

Sound synthesis technologies have been applied to speech, instruments, and singing voice. While these technologies need to have a sound representation as realistic as possible, the sound synthesis should also reproduce the expressive characteristics of the original sound. By this we refer to emotional speech synthesis, expressive performances of synthesized instruments, as well as expression in singing voice synthesis. Indeed, the singing voice has some commonalities with both speech (the sound source is the same) and instruments (concerning musical aspects such as melody and expression resources).
Modeling singing voice expression is a difficult task. We are completely familiar with the singing voice instrument, and thus we easily detect whether artificially achieved results are similar to a real singer or not. There are many features that should be controlled related to melody, dynamics, rhythm, and timbre, which make achieving natural expression a complex task.
This thesis focuses on the control of a singing voice synthesizer to achieve natural expression similar to a real singer. In this thesis we examine the control of pitch and dynamics. In the unit selection-based system we define the cost functions for unit selection as well as the unit transformations and concatenation steps. The statistically-based systems model both sequences of notes and sequences of note transitions and sustains. Finally, we also present a system which combines the previous ones. These systems are trained with two expression databases that we have designed, recorded, and labeled. These databases comprise sequences of three notes or rests.
Our perceptual evaluation compares the proposed systems with a baseline expression system and a performance-driven approach. The perceptual evaluation shows that the hybrid system achieves the closest natural expression to a human voice. In the objective evaluation we focus on the systems' efficiency.
This thesis delivers numerous contributions to the field of our research: 1) it provides a discussion on expression and summarizes some expression definitions, 2) it reviews previous works on expression control in singing voice synthesis, 3) it provides an online compilation of sound excerpts from different works, 4) it proposes a methodology for expression database creation, 5) it implements a unit selection-based system for expression control, 6) it proposes two statistical-based systems, 7) it presents a hybrid system, 8) it compares the proposed systems with other state of the art systems, 9) it proposes another use case in which the proposed systems can be applied, 10) it provides a set of proposals to improve the evaluation.

Monday, February 1st 2016, at 16h in room 55.309 (Tanger Building, UPF Communication Campus)
Panagiotis Papiotis: “A Computational Approach to Studying Interdependence in String Quartet Performance”
Thesis directors: Esteban Maestre and Xavier Serra; Defense Jury: Werner Goebl (Univ. Music and Performing Arts, Vienna), Ralph Andrzejak (UPF) and Josep Lluís Arcos (CSIC)

This dissertation proposes a computational data-driven methodology to measure music ensemble interdependence - the degree to which musicians interact and influence each other’s actions in order to achieve a shared goal - using a string quartet ensemble as a case study.
We present the outcomes of an experiment on music ensemble interdependence, where we recorded the members of a professional string quartet performing exercises and excerpts of musical pieces in two conditions: solo, where each musician performs their part alone, and ensemble, where the entire quartet performs together following a short rehearsal. During the performance we acquire multimodal data in the form of audio recordings, motion capture data from sound-producing movements and upper body movement, as well as high quality video. All of the recorded data have been published online as an open research dataset.
From the acquired data, we extract numerical features in the form of time series that describe performance in terms of four distinct musical dimensions: intonation, dynamics, timbre, and timing. We apply four different interdependence estimation methods based on time series analysis - Pearson correlation, Mutual Information, Granger causality and Nonlinear Coupling coefficient - to the extracted features in order to assess the overall level of interdependence between the four musicians for each performance dimension individually. We then carry out a statistical comparison of interdependence estimated for the ensemble and solo conditions.
Our results show that it is possible to correctly discriminate between the two experimental conditions for each of the studied performance dimensions. By computing the difference in estimated interdependence between the ensemble and solo condition for a given performance dimension, we are also able to compare across different recordings in terms of the established interdependence and relate the results to the underlying goal of the exercise.
We additionally study the aural perception of music ensemble interdependence, assessing the capability of listeners to distinguish between audio recordings of ensemble performances and artificially synchronized solo performances as a function of the listeners’ own background and the performance dimension that each recording focused on.
The proposed methodology and obtained results explore a novel direction for research on music ensemble interdependence that goes beyond temporal synchronization and towards a broader understanding of joint action in music performance, while the shared dataset provides a valuable resource that serves as a foundation for future studies to build upon.
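
To give a feel for the simplest of the four estimation methods named in this abstract, the sketch below computes the Pearson correlation between feature time series of different musicians and averages its absolute value over all pairs. It is only an illustrative sketch under stated assumptions: the function names, the averaging over pairs, and the toy data are hypothetical and are not taken from the thesis.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (norm_x * norm_y)


def mean_pairwise_interdependence(features):
    """Average absolute Pearson correlation over all pairs of musicians.

    `features` maps a musician name to one feature time series
    (hypothetical layout; e.g. a loudness curve per musician).
    """
    names = sorted(features)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    if not pairs:
        raise ValueError("need at least two musicians")
    total = sum(abs(pearson(features[a], features[b])) for a, b in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    # Toy loudness curves, invented for illustration only.
    ensemble = {
        "violin1": [0.1, 0.4, 0.3, 0.8, 0.6],
        "violin2": [0.2, 0.5, 0.35, 0.9, 0.7],
        "viola": [0.15, 0.45, 0.3, 0.85, 0.65],
    }
    print(mean_pairwise_interdependence(ensemble))
```

Computing such a score once for the ensemble recordings and once for the solo recordings, then comparing the two, mirrors the kind of condition comparison the abstract describes, although Pearson correlation captures only linear, instantaneous relationships.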

Friday, February 5th 2016, at 11h in room 55.309 (Tanger Building, UPF Communication Campus)
Graham Coleman: “Descriptor Control of Sound Transformations and Mosaicing Synthesis”
Thesis directors: Xavier Serra and Jordi Bonada; Defense Jury: Rafael Ramírez (UPF), Josep Lluís Arcos (CSIC) and Bob Sturm (QMUL-UK)

Sampling, as a musical or synthesis technique, is a way to reuse recorded musical expressions. In this dissertation, several ways to expand sampling synthesis are explored, especially mosaicing synthesis, which imitates target signals by transforming and compositing source sounds, in the manner of a mosaic made of broken tile.
One branch of extension consists of the automatic control of sound transformations towards targets defined in a perceptual space. The approach chosen uses models that predict how the input sound will be transformed as a function of the selected parameters. In one setting, the models are known, and numerical search can be used to find sufficient parameters; in the other, they are unknown and must be learned from data.
Another branch focuses on the sampling itself. By mixing multiple sounds at once, perhaps it is possible to make better imitations, e.g. in terms of the harmony of the target. However, using mixtures leads to new computational problems, especially if properties like continuity, important to high quality sampling synthesis, are to be preserved.
A new mosaicing synthesizer is presented which incorporates all of these elements: supporting automatic control of sound transformations using models, mixtures supported by perceptually relevant harmony and timbre descriptors, and preservation of continuity of the sampling context and transformation parameters. Using listening tests, the proposed hybrid algorithm was compared against classic and contemporary algorithms, and the hybrid algorithm performed well on a variety of quality measures.

Friday, February 5th 2016, at 15h in room 55.309 (Tanger Building, UPF Communication Campus)
Stefan Kersten: “Statistical modelling and resynthesis of environmental texture sounds”
Thesis directors: Xavier Serra and Hendrik Purwins; Defense Jury: Rafael Ramírez (UPF), Enric Guaus (ESMUC) and Diemo Schwarz (IRCAM)

Environmental texture sounds are an integral, though often overlooked, part of our daily life. They constitute those elements of our sounding environment that we tend to perceive subconsciously but which we miss when they are missing. Those sounds are also increasingly important for adding realism to virtual environments, from immersive artificial worlds through computer games to mobile augmented reality systems. This work spans the spectrum from data-driven stochastic sound synthesis methods to distributed virtual reality environments and their aesthetic and technological implications. We propose a framework for statistically modelling environmental texture sounds in different sparse signal representations. We explore three different instantiations of this framework, two of which constitute a novel way of representing texture sounds in a physically inspired sparse statistical model and of estimating model parameters from recorded sound examples. We propose a new method of creatively interacting with corpuses of sound segments that are organised in a two-dimensional space and evaluate our work in a teaching context for musicians and sound artists. Finally, we describe two different authoring and simulation environments for creating sonic landscapes for virtual reality environments and augmented audio reality applications. These systems serve as a test bed for exploring possible applications of environmental texture sound synthesis models. We evaluate the validity of the developed systems in the context of a prototype virtual reality environment and within the commercial setting of a mobile location-based audio platform. We also introduce a novel sound synthesis engine that serves as the basis for real-time rendering of large soundscapes. Its performance is evaluated in the context of a commercial location-based audio platform that is in daily use by content producers and end users.
In summary, this thesis contributes to the advancement of the state of the art in statistical modelling of environmental sound textures by exploring novel ways of representing those sounds in a sparse setting. Our research also significantly contributed to the successful realisation of an innovative location-based audio platform.

Wednesday, February 10th 2016, at 10am in room 55.309 (Tanger Building, UPF Communication Campus)
Sebastián Mealla: “Designing Sonic Interactions for Implicit Physiological Computing”
Thesis directors: Sergi Jordà and Aleksander Väljamäe; Defense Jury: Rafael Ramírez (UPF), Aureli Soria-Frisch (Starlab) and Wendy Ju (Stanford Univ.)

The field of Human-Computer Interaction (HCI) has been historically devoted to understanding the interplay between people and computers. However, for the last three decades, it has been mainly based on overt and explicit control by means of peripheral devices such as the keyboard and the mouse. As devices and systems are becoming increasingly complex and powerful, this traditional approach to interface design is often lagging behind, constituting a bottleneck for seamless HCI.
In order to achieve more natural interactions with computer systems, HCI has to go beyond explicit control and incorporate the implicit subtleties of human-human interaction. This could be achieved by means of Physiological Computing, which monitors naturalistic changes in the user's psychophysiological states (affective, perceptive or cognitive) for adapting system responses without explicit control. At the output level, Sonic Interaction Design (SID) appears as an excellent medium for representing implicit physiological states, as acoustic data can be processed faster than visual data, can be easily localized in space, has good temporal resolution, and can display multiple data streams while freeing the visual sense.
Therefore, in this dissertation we aim to conceptualize, prototype and evaluate sonic interaction designs for implicit Physiological Computing in the context of HCI. To achieve this goal, we leverage physiological sensing techniques, namely EEG and ECG, to estimate users' implicit states in real time, and apply diverse SID methodologies to adapt system responses according to these states. We incrementally develop different implicit sonic interactions (from direct audification to complex musical mappings) and evaluate them in HCI scenarios (from neurofeedback to music performance), assessing their perceptualization quality, the role of mapping complexity, and their meaningfulness in the musical domain.
14 Jan 2016 - 16:48
Emilia Gómez becomes President-elect of the International Society for Music Information Retrieval (ISMIR)

Since January 1st 2016, Emilia Gómez has been the president-elect of the International Society for Music Information Retrieval. She was elected at the business meeting of the annual ISMIR conference in October 2015 in Málaga. The current board also includes Fabien Gouyon (president), Eric J. Humphrey (secretary), Xiao Hu (treasurer), and Amélie Anglade, Meinard Müller and Geoffroy Peeters as board members.

The International Society for Music Information Retrieval is a non-profit organization seeking to advance the access, organization, and understanding of music information. As a field, music information retrieval (MIR) focuses on the research and development of computational systems to help humans better make sense of this data, drawing from a diverse set of disciplines, including, but by no means limited to, music theory, computer science, psychology, neuroscience, library science, electrical engineering, and machine learning.

The main activity of ISMIR is its annual international conference, which draws 200 to 300 attendees and publishes around 100 papers in each edition, with a decreasing acceptance rate. ISMIR papers have a great impact: ISMIR is currently the 5th ranked publication in the “Multimedia” subcategory of “Engineering and Computer Science” and the 1st ranked in the “Music & Musicology” subcategory of “Humanities, Literature, and Arts”.

More info here

13 Jan 2016 - 18:46
Presentation of the AudioCommons Initiative
20 Jan 2016
Wednesday January 20th 2016 from 10am to 1:30pm in room 55.309 of the Communication Campus of the UPF (Roc Boronat 138, Barcelona)
AudioCommons is a new initiative supported by the European Commission through the Horizon 2020 programme that aims at bringing Creative Commons audio content to the creative industries. The consortium of the project, led by the Music Technology Group of the UPF, will promote the use of open audio content and develop technologies with which to support the ecosystem composed of content repositories, production tools and users. These technologies will enable the reuse of this audio material, facilitating its integration in the production workflows used by the creative industries. In this presentation we will go over the core ideas behind this initiative and each of the partners of the project will introduce what they are doing and will do in relation to AudioCommons.
10h MTG, Universitat Pompeu Fabra - Xavier Serra and Frederic Font
Presentation of the key concepts and goals of the AudioCommons project and of the envisioned Audio Commons Ecosystem. Overview of the AudioCommons-related research being carried out at the MTG and the different projects and technologies that will be connected into the Audio Commons Ecosystem: Freesound, AcousticBrainz and Essentia.
10:30h CoDE / IoSR / CVSSP, University of Surrey - Mark D. Plumbley and Tim Brookes
Presentations of the three centers involved, Centre for the Digital Economy (CoDE), Institute of Sound Recording (IoSR), and Centre for Vision Speech and Signal Processing (CVSSP), and of the research to be done in AudioCommons. CoDE will investigate and develop frameworks to understand and analyse emerging digital business models in audio and the creative industries. IoSR will develop models of timbral perception to allow automated semantic annotation of non-musical content. CVSSP will develop novel signal processing algorithms to facilitate automated semantic annotation of non-musical content.
11h C4DM, Queen Mary University of London - George Fazekas and Mathieu Barthet
Review of the research and tools developed at the C4DM related to the AudioCommons initiative, such as the Music Ontology, the Vamp audio analysis framework, Sonic Visualiser, Sonic Annotator, or the Web-based environment SAWA. Presentation of the recent work in affective computing, which aims at annotating music by mood using audio and social information. 
11:30h Coffee break
12h Jamendo - Martin Guerber and Samuel Devyver
Brief overview of Jamendo (history, philosophy, current directions) and presentation of Jamendo Music (platform for music lovers to discover and enjoy music freely, and for artists to promote themselves and make money) and Jamendo Licensing (marketplace to sell royalty-free music licenses for commercial use). Discussion of the current approach for dealing with music metadata and audio content qualification as it relates to the AudioCommons project.
12:30h AudioGaming - Benjamin Lévy
Presentation of AudioGaming and its philosophy (initial starting idea, recent evolutions), and of the tools that have already been developed or are planned for development within the AudioCommons project and that could be seamlessly linked to Creative Commons audio databases. Discussion of the integration and Creative Commons usage challenges from an interactive/video game perspective.
13h Waves - Yuvai Levi
Presentation of Waves and of a working prototype of a standalone application which can browse and query online audio repositories acting as a web server, and an audio plugin which communicates with the former web server to enable queries from the plugin itself. Discussion of the possibilities of presenting/serving the rights issues of the sample owners, and some other ideas regarding the usage of automatic audio analysis algorithms to improve search results of audio in online repositories which do not support advanced search functionalities.
12 Jan 2016 - 16:45
Recommender Systems Handbook 2015

Springer has published the second edition of Recommender Systems Handbook. It includes a chapter on music recommender systems written in collaboration between Markus Schedl and Peter Knees (Johannes Kepler University), Brian McFee (New York University), Dmitry Bogdanov (Universitat Pompeu Fabra) and Marius Kaminskas (University College Cork).

Schedl, M., Knees, P., McFee, B., Bogdanov, D., & Kaminskas, M. (2015). Music Recommender Systems. In Recommender Systems Handbook (pp. 453-492). Springer US. 

Abstract: "This chapter gives an introduction to music recommender systems research. We highlight the distinctive characteristics of music, as compared to other kinds of media. We then provide a literature survey of content-based music recommendation, contextual music recommendation, hybrid methods, and sequential music recommendation, followed by overview of evaluation strategies and commonly used data sets. We conclude by pointing to the most important challenges faced by music recommendation research."

7 Jan 2016 - 17:15
Post-doctoral positions at the MTG
There are a number of possibilities to do a post-doc at the MTG, in particular:
1. Post-doctoral position involving undergraduate teaching at the UPF and participation in a research project of the MTG. This position requires the ability to teach audio-related undergraduate courses and to contribute to one of the funded projects of the MTG. To apply for this position or to get more information on it, please send an email to mtg [at] upf [dot] edu
2. Ramón y Cajal 2015. Post-doctoral positions funded by the Spanish government with which you can join a Spanish research group like the MTG. For information and application:
3. Juan de la Cierva 2015. Post-doctoral positions for young doctors funded by the Spanish government with which you can join a Spanish research group like the MTG. For information and application:
23 Dec 2015 - 18:09
Emilia Gómez at "Dones i TIC / Women & ICT" award

Last week, Emilia Gómez was mentioned in the 2015 #12x12donatic awards to Women in Information & Technologies, in the category "research & academia".

The goal of these awards is to recognize the role of women in professional, business and academic domains related to information and communication technologies. They are organized by Tertulia digitalidigital, an initiative from the Catalan Government for digital innovation, the observatory for women, enterprise and economy of Cambra de Comerç de Barcelona and Sinergia digital marketing.

In the speeches, all the women were passionate about their work and expressed the wish that prizes like these will not be needed in the future, as women will be well represented in everyday media.

14 Dec 2015 - 14:48
Application open for the Master in Sound and Music Computing 2016-2017
30 Nov 2015 - 10 Jun 2016

Applications for the Master in Sound and Music Computing, program 2016-2017, are open online. There are 4 application periods (deadlines: January 15th, March 4th, April 22nd, June 10th). For more information on the UPF master programs and on how to register for the SMC Master check here. For other information on the SMC master check:

1 Dec 2015 - 22:52
Live performances of Chinese traditional music in Barcelona!
10 Dec 2015 - 11 Dec 2015

The CompMusic project, in collaboration with Casa Asia, the Barcelona Confucius Institute Foundation and the Conservatori Municipal de Música de Barcelona, is organising two sessions of Chinese traditional music in Barcelona. On December 10th four professional musicians from the London-based Silk & Bamboo Ensemble will perform several pieces from the traditional instrumental repertoire. The following day, December 11th, the UK-Chinese Opera Association, also based in London, will send six of its members to Barcelona to perform two scenes from traditional jingju plays, The Forest of Wild Boars and The Fourth Son Visits his Mother. Each session will be preceded by a workshop given in English by members of each group, in which the main features of each music tradition will be presented, including some live demos. Workshops start at 19:00 and concerts at 20:00 in the Auditori Eduard Toldrà, at the Conservatori Municipal de Música de Barcelona (Carrer del Bruc, 112), and entrance is free!

With the support of: European Research Council

1 Dec 2015 - 15:18