News and Events

Introduction to music therapy and use of ICT in music therapy diagnostics
27 Nov 2013

Wednesday Nov 27th, 2013 in room 52.S29 (in front of the canteen) at 16h

Introduction to music therapy and use of ICT in music therapy diagnostics

ABSTRACT: Over the last 50 years, European and European-influenced music therapy has made great strides towards becoming an established part of health and social services and sciences. Selected examples from clinical practice will give a brief overview of the scope of music therapy approaches across very different fields of clinical practice. One field of interdisciplinary research and development combining ICT and music therapy is that of microanalysis in music therapy. Over the last 10 years, qualitative research and standardized observation have been used to study therapeutic processes in music therapy. All of these methods are very time-consuming, yet they are very important for diagnostics and assessment in music therapy. First developments, such as the Music Therapy Toolbox built on MIR techniques, are very promising for future use as diagnostic tools in music therapy. The state of the art of this field will be presented and discussed.

BIO: Thomas Wosch is professor of music therapy at the University of Applied Sciences Würzburg-Schweinfurt in Germany. He is director of the Master in Music Therapy for clients with special needs and for clients with dementia, and head of the final-year specialisation in music therapy within the BA in Social Work. He worked for 10 years as a music therapist in acute adult psychiatry, focusing on the treatment of schizophrenia, depression, anxiety disorders and borderline disorder. His special field of research is microanalysis in music therapy (the measurement of minimal changes in music therapy processes; see also Wosch & Wigram (eds.) (2007): Microanalysis in Music Therapy. London & Philadelphia: JKP). He has research cooperations and teaches internationally across Europe, the US, Australia and South America. He is co-editor of "Musik und Gesundsein" (music and health).

25 Nov 2013 - 11:14 | view
Tan Özaslan defends his PhD thesis on November 29th
29 Nov 2013

Tan Özaslan defends his PhD thesis entitled "Computational Analysis of Expressivity in Classical Guitar Performances" on November 29th 2013 at 11:00h in room 52.429 of the Communication Campus of the UPF.

Thesis directors: Josep Lluís Arcos and Xavier Serra
Jury members: Ramon Lopez de Mantaras (IIIA-CSIC), Isabel Barbancho (Universidad de Málaga), Rafael Ramirez (UPF)

Abstract: The study of musical expressivity is an active field in sound and music computing. The research interest comes from different motivations: to understand or model musical expressivity; to identify the expressive resources that characterize an instrument, musical genre, or performer; or to build synthesis systems able to play expressively. To tackle this broad problem, researchers focus on specific instruments and/or musical styles. Hence, this thesis focuses on the analysis of expressivity in the classical guitar, and our aim is to model the use of the expressive resources of the instrument. All the methods used in this dissertation build on techniques from the fields of information retrieval, machine learning, and signal processing, and we combine several state-of-the-art analysis algorithms to model the use of these expressive resources. The classical guitar is an instrument characterized by the diversity of its timbral possibilities; professional guitarists are able to convey many nuances when playing a musical piece, which makes expressive analysis of this instrument laborious. We divided our analysis into two main parts. The first provides a tool able to automatically identify expressive resources in real recordings: we build a model to analyze and automatically extract the three most used expressive articulations, namely legato, glissando and vibrato. The second provides a comprehensive analysis of timing deviations in classical guitar. Timing variations are perhaps the most important ones: they are fundamental for expressive performance and a key ingredient for conferring a human-like quality to machine-based music renditions. However, the nature of such variations is still an open research question, with diverse theories that point to a multi-dimensional phenomenon. Our system exploits feature extraction and machine learning techniques, and classification accuracies show that timing deviations are accurate predictors of the corresponding piece. To sum up, this dissertation contributes to the field of expressive analysis by providing an automatic expressive articulation model and a musical piece prediction system based on timing deviations. Most importantly, it analyzes the behavior of the proposed models on commercial recordings.
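As an illustration of the second part, here is a minimal, hypothetical sketch (not the thesis system, and assuming librosa and scikit-learn; the file names, labels and nominal tempo are placeholders): each recording is described by the deviations of its detected note onsets from a rigid metronomic grid, and a classifier is trained to predict which piece is being played.

    # Hypothetical sketch: timing-deviation features for piece prediction.
    import numpy as np
    import librosa
    from sklearn.ensemble import RandomForestClassifier

    def timing_deviation_features(path, nominal_bpm=120.0):
        """Deviations of detected note onsets from a rigid metronomic grid."""
        y, sr = librosa.load(path)
        frames = librosa.onset.onset_detect(y=y, sr=sr)
        onsets = librosa.frames_to_time(frames, sr=sr)
        beat = 60.0 / nominal_bpm                  # nominal inter-onset interval (s)
        grid = np.round(onsets / beat) * beat      # nearest metronomic position
        dev = onsets - grid                        # signed timing deviations (s)
        return np.array([dev.mean(), dev.std(), np.abs(dev).max(), len(dev)])

    # One feature vector per recording; labels identify the piece performed.
    paths = ["rec1.wav", "rec2.wav", "rec3.wav"]   # placeholder recordings
    labels = ["piece_a", "piece_a", "piece_b"]
    X = np.vstack([timing_deviation_features(p) for p in paths])
    clf = RandomForestClassifier(n_estimators=100).fit(X, labels)

The real system works with richer features and commercial recordings; the point here is only the pipeline shape: onset detection, deviation statistics, supervised classification.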

22 Nov 2013 - 17:19 | view
Seminar by Jean-Julien Aucouturier on spectro-temporal receptive fields for MIR
22 Nov 2013
Jean-Julien Aucouturier, from CNRS/IRCAM, gives a seminar on "Spectro-temporal receptive fields (STRFs): a biologically-plausible alternative to MFCCs?" on Friday November 22nd at 15:30h in room 55.410.

Abstract: We describe some recent experiments adapting a recent computational model of the mammalian auditory cortex to Music Information Retrieval tasks. The model, called Spectro-Temporal Receptive Fields (STRFs), simulates the responses of auditory cortical neurons as a filterbank of Gabor functions tuned not only on frequencies, but also on rates (temporal modulations in Hz) and scales (frequency modulations in cycles/octave). Off the shelf, it provides a 30,000-dimensional feature space; when these dimensions are integrated, we can derive novel signal representations/features that (1) perform equivalently to or better than e.g. Mel-Frequency Cepstrum Coefficients for an audio similarity task, (2) are somewhat amusing (e.g. dynamic frequency warping instead of DTW), and (3) are more plausible than the usual MIR features from a biological point of view.
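For intuition, here is a rough sketch of the STRF idea under assumed parameters (this is not the CNRS/IRCAM implementation; kernel sizes and the (rate, scale) values are illustrative): a 2D Gabor kernel tuned to a rate in Hz and a scale in cycles/octave is convolved with a log-frequency spectrogram, and stacking many (rate, scale) pairs yields the very high-dimensional feature space mentioned above.

    # Illustrative sketch of spectro-temporal Gabor filtering of a spectrogram.
    import numpy as np
    from scipy.signal import fftconvolve

    def gabor_strf(rate_hz, scale_cpo, frame_rate, bins_per_octave,
                   n_t=64, n_f=32):
        """2D Gabor kernel over (frequency bins, time frames)."""
        t = (np.arange(n_t) - n_t // 2) / frame_rate       # seconds
        f = (np.arange(n_f) - n_f // 2) / bins_per_octave  # octaves
        T, F = np.meshgrid(t, f)                           # shape (n_f, n_t)
        envelope = np.exp(-(T * rate_hz) ** 2 - (F * scale_cpo) ** 2)
        carrier = np.cos(2 * np.pi * (rate_hz * T + scale_cpo * F))
        return envelope * carrier

    def strf_responses(log_spec, rates, scales, frame_rate=100.0, bpo=12):
        """One filtered spectrogram per (rate, scale) pair; flattening the
        stack gives a feature space of tens of thousands of dimensions."""
        return np.stack([
            fftconvolve(log_spec, gabor_strf(r, s, frame_rate, bpo), mode="same")
            for r in rates for s in scales
        ])

    # e.g., with S a (bins, frames) log-magnitude constant-Q spectrogram:
    # responses = strf_responses(S, rates=[2, 4, 8], scales=[0.5, 1, 2])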

18 Nov 2013 - 09:46 | view
Martín Haro defends his PhD thesis on November 22nd
22 Nov 2013

Martín Haro defends his PhD thesis entitled "Statistical Distribution of Common Audio Features" on Friday November 22nd 2013 at 11:00h in room 55.309 of the Communication Campus of the UPF.

The jury of the defense is composed of: Josep Lluis Arcos (IIIA-CSIC), Emilia Gómez (UPF) and Jean-Julien Aucouturier (IRCAM).

Abstract: In the last few years, some Music Information Retrieval (MIR) researchers have spotted important drawbacks in applying standard algorithms, successful on monophonic music, to polyphonic music classification and similarity assessment. Noticeably, these so-called “Bag-of-Frames” (BoF) algorithms share a common set of assumptions: that the numerical descriptions extracted from short-time audio excerpts (or frames) are enough to capture the information relevant for the task at hand, that these frame-based audio descriptors are time-independent, and that descriptor frames are well described by Gaussian statistics. Thus, if we want to improve current BoF algorithms we could: i) improve current audio descriptors, ii) include temporal information within algorithms working with polyphonic music, and iii) study and characterize the real statistical properties of these frame-based audio descriptors. From a literature review, we have detected that many works focus on the first two improvements but, surprisingly, there is a lack of research on the third one. Therefore, in this thesis we analyze and characterize the statistical distribution of common audio descriptors of timbre, tonal and loudness information. Contrary to what is usually assumed, our work shows that the studied descriptors are heavy-tailed and thus do not belong to a Gaussian universe. This new knowledge led us to propose new algorithms that improve over the BoF approach in current MIR tasks such as genre classification, instrument detection, and automatic tagging of music. Furthermore, we also address new MIR tasks such as measuring the temporal evolution of Western popular music. Finally, we highlight some promising paths for future audio-content MIR research that will inhabit a heavy-tailed universe.
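The kind of evidence at stake can be previewed with a small check like the following (an illustration under assumed tools, not the thesis methodology; the input file is a placeholder): measure the excess kurtosis of each MFCC coefficient across frames. A Gaussian has excess kurtosis 0, while heavy-tailed data sits well above it.

    # Sketch: are frame-based MFCCs Gaussian, or heavy-tailed?
    import numpy as np
    import librosa
    from scipy.stats import kurtosis, normaltest

    y, sr = librosa.load("track.wav")                   # placeholder input
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # shape (13, n_frames)

    for i, coeff in enumerate(mfcc):
        k = kurtosis(coeff)        # 0 for a Gaussian, > 0 if heavy-tailed
        _, p = normaltest(coeff)   # D'Agostino-Pearson normality test
        print(f"MFCC {i}: excess kurtosis = {k:.2f}, normality p = {p:.3g}")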


15 Nov 2013 - 13:30 | view
GiantSteps kick-off: creating the next generation of digital musical tools
21 Nov 2013 - 22 Nov 2013

GiantSteps (Seven League Boots for Music Creation and Performance) is a STREP project funded by the European Commission and coordinated by the MTG in collaboration with JCP Consult. It will last 36 months, starting on the 1st of November 2013.

The project aims to create the "seven-league boots" for music production in the next decade and beyond. Just as the boots of European folklore tradition enable huge leaps, GiantSteps proposes digital music tools for the near future to empower all musical users, from professionals to casual musicians and even children. By bringing in different types of musical knowledge in the form of musical expert agents – including harmonic, melodic, rhythmic and song-structure agents – at different proficiency levels according to the specific needs of the user, the GiantSteps tools seek to stimulate the inspiration of all users, allowing them to create collaboratively, boosting mutual inspiration, helping anyone to learn and discover while creating, and enabling professionals to work faster while maintaining their creative flow.

In order to meet this ambitious goal, the GiantSteps project has assembled a strong, balanced consortium representing the whole value chain and including leading agents from research institutions (MTG-UPF and Johannes Kepler University Linz, JKU), industrial partners (Native Instruments and Reactable Systems) and music practitioners (STEIM and Red Bull Music Academy). With this consortium, and with the project coordination by JCP-Consult, GiantSteps will be able to combine techniques and technologies in new and unprecedented ways. This includes combining state-of-the-art interfaces and interface design techniques with advanced methods in Music Information Research that have yet to be applied in a real-time interaction context or with creativity objectives. The consortium’s industry organizations will guarantee the alignment of these cutting-edge technologies with existing market requirements, allowing for a smooth integration of research outcomes into real-world systems. The MTG team, coordinated by Sergi Jordà and Perfecto Herrera, will bring its expertise in music information retrieval (MIR) and advanced interaction.

The GiantSteps kick-off meeting will take place at the MTG on the 21st and 22nd of November.

14 Nov 2013 - 17:25 | view
Seminar by Bob Sturm on evaluation in MIR
13 Nov 2013

Bob L. Sturm, from Aalborg University Copenhagen, gives a seminar on "The crisis of evaluation in MIR" on Wednesday November 13th 2013 at 15:30h in room 55.410.

Abstract: I critically address the "crisis of evaluation" in music information retrieval (MIR), with particular emphasis on music genre recognition, music mood recognition, and autotagging. I demonstrate four things: 1) many published results unknowingly use datasets with faults that render them meaningless; 2) state-of-the-art (“high classification accuracy”) systems are fooled by irrelevant factors; 3) most published results are based upon an invalid evaluation design; and 4) a lot of work has unknowingly built, tuned, tested, compared and advertised "horses" instead of solutions. (The example of the horse Clever Hans provides an appropriate illustration.) I argue that these problems occur because: 1) many researchers assume a dataset is good because many others use it; 2) many researchers assume that evaluation approaches standard in machine learning or information retrieval are useful and relevant for MIR; 3) many researchers mistake systematic, rigorous, and standardized evaluation for scientific evaluation; and 4) problems and success criteria remain ill-defined, and thus evaluation remains poor, because researchers do not define appropriate use cases. I show how this "crisis of evaluation" can be addressed by formalizing evaluation in MIR to make clear its aims, parts, design, execution, interpretation, and assumptions. I also present several alternative evaluation approaches that can separate horses from solutions.
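A toy illustration of a "horse" (our own construction, not an example from the talk): a classifier reaches near-perfect accuracy because an irrelevant confound, not musical content, separates the classes. Here each "genre" was recorded with a different loudness offset, so the reported accuracy says nothing about genre recognition.

    # Toy "horse": high accuracy from a confound, not from the problem itself.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 200
    genre = rng.integers(0, 2, n)               # the label we claim to predict
    musical_content = rng.normal(size=(n, 10))  # carries no class information
    confound = genre * 3.0 + rng.normal(scale=0.1, size=n)  # e.g. loudness offset
    X = np.column_stack([musical_content, confound])

    X_tr, X_te, y_tr, y_te = train_test_split(X, genre, random_state=0)
    clf = SVC().fit(X_tr, y_tr)
    print("accuracy:", clf.score(X_te, y_te))   # ~1.0, yet no genre was learned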

8 Nov 2013 - 21:37 | view
Participation in ISMIR 2013

Mohamed Sordo, Alastair Porter, Dmitry Bogdanov, Sertan Şentürk and Xavier Serra participate in the International Society for Music Information Retrieval Conference 2013 (ISMIR 2013), which takes place from November 4th to 8th, 2013 in Curitiba (Brazil). These are the research papers that will be presented:

Mohamed also gives a tutorial on "Music Autotagging" and chairs the Late-break/Demos session.

30 Oct 2013 - 18:43 | view
Seminar by Lonce Wyse on Audio and Interaction
31 Oct 2013

Lonce Wyse, from National University of Singapore, gives a talk on 'Audio and Interaction through the Browser' on Thursday Oct 31st, 2013 at 15:30h in room 52.321.

Abstract: Emerging web standards are just beginning to address the vast unrealized potential for sound on the browser platform. The first small steps toward a capable sound synthesis system have been made with the Web Audio API, but there are many shortcomings in its performance and usability. With faster engines and good libraries, Javascript has become the lingua franca for client-side programming, and is making inroads on the server side as well, though issues remain from the perspective of sound. In this talk, I will discuss recent developments of the web platform for sound, and present several projects from my lab that are moving toward creating an ecosystem of support for sound design, synthesis, interaction, collaboration, and performance on the web.

Biography: Lonce Wyse received his PhD in Cognitive and Neural systems from Boston University in 1994 with a dissertation on pitch perception. He is now an Associate Professor of Communications and New Media at the National University of Singapore. He also directs the Arts and Creativity Lab and an Art/Science Residency Program at the NUS Interactive and Digital Media Institute. He is currently spending a semester sabbatical visiting the Music Technology Group at UPF.

28 Oct 2013 - 18:01 | view
ESSENTIA wins the Open-Source Software Competition of ACM Multimedia

ESSENTIA, an audio analysis software library developed at the MTG, wins the Open-Source Software Competition of ACM Multimedia 2013, the premier worldwide multimedia conference, which took place in Barcelona from October 21st to 25th 2013.

The ACM Multimedia Open-Source Software Competition celebrates the invaluable contribution of researchers and software developers who advance the field by providing the community with implementations of codecs, middleware, frameworks, toolkits, libraries, applications, and other multimedia software. The criteria for judging all submissions include broad applicability and potential impact, novelty, technical depth, demo suitability, and other miscellaneous factors (e.g., maturity, popularity, student-led, no dependence on closed source, etc.).

ESSENTIA is an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPLv3 license. It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal and high-level music descriptors. ESSENTIA is behind a number of commercial applications and has been used by many academic research projects; for instance, it is responsible for the audio similarity search functionality of Freesound.
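To give a flavor of the library, here is a minimal usage sketch along the lines of Essentia's standard Python tutorial (the input file name is a placeholder): load an audio file and compute MFCCs frame by frame from standard DSP blocks.

    # Minimal Essentia example: frame-wise MFCCs from an audio file.
    import essentia.standard as es

    audio = es.MonoLoader(filename="audio.wav")()   # audio input

    w = es.Windowing(type="hann")                   # standard DSP blocks
    spectrum = es.Spectrum()
    mfcc = es.MFCC()                                # a timbral descriptor

    mfccs = []
    for frame in es.FrameGenerator(audio, frameSize=1024, hopSize=512):
        bands, coeffs = mfcc(spectrum(w(frame)))
        mfccs.append(coeffs)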

For more information on ESSENTIA, read the paper that was presented at ACM Multimedia. For extensive documentation and to download the software, go to its website.

25 Oct 2013 - 09:01 | view
Participation in the IEEE AASP Challenge

Gerard Roma, Waldo Nogueira and Perfecto Herrera have participated in the Challenge for Detection and Classification of Acoustic Scenes and Events organized by the Audio and Acoustic Signal Processing committee of the IEEE. One of their submitted algorithms obtained the best accuracy on the scene classification problem, which involved 10 different acoustic scenes; the MTG algorithm scored well above both the baseline and the other 16 submitted algorithms. The strength of the approach lies in taking advantage of recurrence quantification analysis features computed from MFCCs, as sketched below.
This research is presented this week at the 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), held at the Mohonk Mountain House, New Paltz, NY, USA.
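A minimal sketch of that idea, assuming librosa and scipy (an illustration, not the submitted system; the input file is a placeholder): build a recurrence plot by thresholding the self-distance matrix of the MFCC frames, then summarize it with the recurrence rate and a crude determinism proxy.

    # Sketch: recurrence quantification analysis (RQA) features from MFCCs.
    import numpy as np
    import librosa
    from scipy.spatial.distance import cdist

    y, sr = librosa.load("scene.wav")                     # placeholder recording
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T  # shape (frames, 13)

    D = cdist(mfcc, mfcc)                        # pairwise frame distances
    R = (D < np.percentile(D, 10)).astype(int)   # recurrence plot (10% threshold)

    recurrence_rate = R.mean()
    # Crude determinism proxy: recurrent points whose diagonal successor recurs.
    diag = R[:-1, :-1] & R[1:, 1:]
    determinism = diag.sum() / R.sum() if R.sum() else 0.0
    print(recurrence_rate, determinism)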

22 Oct 2013 - 12:57 | view