Category Archives: publications

Measuring the Occupational Impact of AI: Tasks, Cognitive Abilities and AI Benchmarks

I am very proud to share the result of truly interdisciplinary work carried out as part of the HUMAINT project I lead, with Songül Tolan (economist), Annarosa Pesole (social scientist), Fernando Martínez-Plumed (computer scientist), Enrique Fernández-Macías (social scientist), and myself (engineer).


In this paper we develop a framework for analysing the impact of Artificial Intelligence (AI) on occupations. This framework maps 59 generic tasks from worker surveys and an occupational database to 14 cognitive abilities (that we extract from the cognitive science literature) and these to a comprehensive list of 328 AI benchmarks used to evaluate research intensity across a broad range of different AI areas. The use of cognitive abilities as an intermediate layer, instead of mapping work tasks to AI benchmarks directly, allows for an identification of potential AI exposure for tasks for which AI applications have not been explicitly created. An application of our framework to occupational databases gives insights into the abilities through which AI is most likely to affect jobs and allows for a ranking of occupations with respect to AI exposure. Moreover, we show that some jobs that were not known to be affected by previous waves of automation may now be subject to higher AI exposure. Finally, we find that some of the abilities where AI research is currently very intense are linked to tasks with comparatively limited labour input in the labour markets of advanced economies (e.g., visual and auditory processing using deep learning, and sensorimotor interaction through (deep) reinforcement learning).
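To make the layered idea in the abstract more concrete, here is a minimal sketch in Python of how a task-to-ability-to-research-intensity mapping could be turned into an occupation ranking. It is not the paper's actual method or data: the task, ability and occupation names, all the numbers, and the simple weighted-sum aggregation are assumptions chosen only for illustration.

```python
# Toy illustration only: every name and number below is hypothetical
# and is NOT taken from the paper.

# Layer 1: how much each task relies on each cognitive ability (weights sum to 1).
TASK_ABILITY = {
    "reading documents": {"comprehension": 0.8, "visual processing": 0.2},
    "lifting loads": {"sensorimotor interaction": 1.0},
}

# Layer 2: an assumed AI research-intensity score per ability,
# as might be aggregated from activity on AI benchmarks.
ABILITY_INTENSITY = {
    "comprehension": 0.5,
    "visual processing": 0.9,
    "sensorimotor interaction": 0.7,
}

# Assumed share of working time each occupation spends on each task.
OCCUPATION_TASKS = {
    "clerk": {"reading documents": 0.9, "lifting loads": 0.1},
    "warehouse worker": {"reading documents": 0.2, "lifting loads": 0.8},
}


def task_exposure(task):
    """Weighted sum of research intensity over the abilities a task needs."""
    weights = TASK_ABILITY[task]
    return sum(w * ABILITY_INTENSITY[a] for a, w in weights.items())


def occupation_exposure(occupation):
    """Weighted sum of task exposure over the tasks an occupation performs."""
    shares = OCCUPATION_TASKS[occupation]
    return sum(s * task_exposure(t) for t, s in shares.items())


def rank_occupations():
    """Occupations sorted from highest to lowest toy exposure score."""
    return sorted(OCCUPATION_TASKS, key=occupation_exposure, reverse=True)


if __name__ == "__main__":
    for name in rank_occupations():
        print(name, round(occupation_exposure(name), 3))
```

Because the abilities sit between tasks and benchmarks, a task such as "lifting loads" receives a score even though no benchmark in this toy example targets it directly, which is the point of using an intermediate layer.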

This article appears in the special track on AI and Society.



Filed under publications, research

Research on child-robot interaction

In the context of the HUMAINT (Human behaviour and machine intelligence) project, we research the impact that social robots have on children. I have had the chance to carry out my first research in the amazing field of child-robot interaction, thanks to the collaboration with Vicky Charisi, Luis Merino and their lab at Universidad Pablo Olavide, and Honda Research Institute Japan.

Running a user study with children and robots is very challenging from a technical perspective, and analysing the resulting data is challenging as well. We just published in Frontiers the results of our first study, in which we experimented with two strategies for child-robot interaction in a problem-solving task, turn-taking and child-initiated interaction, and showed the need for this voluntary interaction. You can check the details below. It is amazing to learn and contribute to research on this topic!

Child-Robot Collaborative Problem-Solving and the Importance of Child’s Voluntary Interaction: A Developmental Perspective

Vicky Charisi, Emilia Gomez, Gonzalo Mier, Luis Merino and Randy Gomez

Abstract: The emergence and development of cognitive strategies for the transition from exploratory actions towards intentional problem-solving in children is a key question for the understanding of the development of human cognition. Researchers in developmental psychology have studied cognitive strategies and have highlighted the catalytic role of the social environment. However, it is not yet adequately understood how this capacity emerges and develops in biological systems when they perform a problem-solving task in collaboration with a robotic social agent. This paper presents an empirical study in a human-robot interaction (HRI) setting which investigates children’s problem-solving from a developmental perspective. In order to theoretically conceptualize children’s developmental process of problem-solving in HRI context, we use principles based on the intuitive theory and we take into consideration existing research on executive functions with a focus on inhibitory control. We considered the paradigm of the Tower of Hanoi and we conducted an HRI behavioral experiment to evaluate task performance. We designed two types of robot interventions, “voluntary” and “turn-taking”—manipulating exclusively the timing of the intervention. Our results indicate that the children who participated in the voluntary interaction setting showed a better performance in the problem solving activity during the evaluation session despite their large variability in the frequency of self-initiated interactions with the robot. Additionally, we present a detailed description of the problem-solving trajectory for a representative single case-study, which reveals specific developmental patterns in the context of the specific task. Implications and future work are discussed regarding the development of intelligent robotic systems that allow child-initiated interaction as well as targeted and not constant robot interventions.



Filed under publications, research

Journal paper on AI and Music: Open Questions of Copyright Law and Engineering Praxis

I am very happy to share with you the publication of a truly interdisciplinary study on the impact of AI on music, including considerations from copyright and engineering praxis. It has been an amazing experience to collaborate with scholars in the field of creative practices, engineering and law, and I hope the paper will serve to start discussing some relevant aspects related to the use of AI in music production.


The application of artificial intelligence (AI) to music stretches back many decades, and presents numerous unique opportunities for a variety of uses, such as the recommendation of recorded music from massive commercial archives, or the (semi-)automated creation of music. Due to unparalleled access to music data and effective learning algorithms running on high-powered computational hardware, AI is now producing surprising outcomes in a domain fully entrenched in human creativity—not to mention a revenue source around the globe. These developments call for a close inspection of what is occurring, and consideration of how it is changing and can change our relationship with music for better and for worse. This article looks at AI applied to music from two perspectives: copyright law and engineering praxis. It grounds its discussion in the development and use of a specific application of AI in music creation, which raises further and unanticipated questions. Most of the questions collected in this article are open as their answers are not yet clear at this time, but they are nonetheless important to consider as AI technologies develop and are applied more widely to music, not to mention other domains centred on human creativity.

Keywords: artificial intelligence; music; copyright; engineering; ethics


The paper is available in open access in the journal Arts.


Filed under publications, research

OpenBMAT: a new open dataset for music detection with loudness annotations

Last week we announced the publication of OpenBMAT, an open dataset for the tasks of music detection and relative music loudness estimation. The dataset contains 27.4 hours of audio from 8 different TV program types in 4 different countries, cross-annotated by 3 people using 6 different classes. It has been published as a dataset paper in the Transactions of the International Society for Music Information Retrieval, the open journal of ISMIR. This research has been carried out as a collaboration between the MTG and BMAT in the context of the Industrial Doctorates program of the Catalan Government.

For more information, you can read the related news on the MTG web site.



Filed under publications, research, Uncategorized

A new paper in a Frontiers journal on Music Conducting

I am very happy to publish this work on music conducting with Alvaro Sarasúa and Julián Urbano in Frontiers in Digital Humanities.

The paper, titled “Mapping by Observation: Building a User-Tailored Conducting System From Spontaneous Movements”, presents a music interaction system based on the conductor-orchestra metaphor, where the orchestra is treated as an instrument controlled by the movements of the conductor. In the proposed system, the user can control tempo and dynamics, and the system adapts its mapping to the user by observing spontaneous conducting movements performed on top of a fixed piece of music. In this respect, we analyze people’s tendency to anticipate or fall behind the beat, and the gestures they map to loudness. The system was evaluated with 24 participants in a discover-by-playing scenario.
Our work was developed in the context of the PHENICX and CASAS research projects and opens interesting directions for creating more intuitive and expressive digital musical instruments (DMIs), particularly in public installations.
You can access the open publication here.


Filed under publications, research

TISMIR Journal Launch and Call for Papers

I am blogging some news related to a project I have recently been contributing to.

It brings us great pleasure to announce the launch of the first issue of TISMIR, the Transactions of the International Society for Music Information Retrieval.

TISMIR was established to complement the widely cited ISMIR conference proceedings and provide a vehicle for the dissemination of the highest quality and most substantial scientific research in MIR. TISMIR retains the Open Access model of the ISMIR Conference proceedings, providing rapid access, free of charge, to all journal content. In order to encourage reproducibility of the published research papers, we provide facilities for archiving the software and data used in the research. The TISMIR publication model avoids excessive cost to the authors or their institutions, with article charges being less than the ISMIR Conference registration fee.

The first issue contains an editorial introducing the journal, four research papers and one dataset paper:

Editorial: Introducing the Transactions of the International Society for Music Information Retrieval – Simon Dixon, Emilia Gómez, Anja Volk

Multimodal Deep Learning for Music Genre Classification – Sergio Oramas, Francesco Barbieri, Oriol Nieto, Xavier Serra

Learning Audio–Sheet Music Correspondences for Cross-Modal Retrieval and Piece Identification – Matthias Dorfer, Jan Hajič jr., Andreas Arzt, Harald Frostel, Gerhard Widmer

A New Curated Corpus of Historical Electronic Music: Collation, Data and Research Findings – Nick Collins, Peter Manning, Simone Tarsitani

A Case for Reproducibility in MIR: Replication of ‘A Highly Robust Audio Fingerprinting System’ – Joren Six, Federica Bressan, Marc Leman

Pop Music Highlighter: Marking the Emotion Keypoints – Yu-Siang Huang, Szu-Yu Chou, Yi-Hsuan Yang

Two more papers (one research paper and one overview paper) are in press.

Authors: We look forward to receiving new submissions to the journal. Please see the Call for Papers below.

Best Regards
Simon Dixon, Anja Volk and Emilia Gómez

Editors-in-chief, TISMIR


The ISMIR Board is happy to announce the launch of the Transactions of the International Society for Music Information Retrieval (TISMIR), the open-access journal of our community.


TISMIR publishes novel scientific research in the field of Music Information Retrieval (MIR), an interdisciplinary research area concerned with processing, analysing, organising and accessing music information. We welcome submissions from a wide range of disciplines, including computer science, musicology, cognitive science, library & information science, machine learning, and electrical engineering.


TISMIR is established to complement the widely cited ISMIR conference proceedings and provide a vehicle for the dissemination of the highest quality and most substantial scientific research in MIR. TISMIR retains the Open Access model of the ISMIR Conference proceedings, providing rapid access, free of charge, to all journal content. In order to encourage reproducibility of the published research papers, we provide facilities for archiving the software and data used in the research. TISMIR is published in electronic-only format, making it possible to offer very low publication costs to authors’ institutions, while ensuring fully open access content. With this call for papers we invite submissions for the following article types:

Article types

Research articles must describe the outcomes and application of unpublished original research. These should make a substantial contribution to knowledge and understanding in the subject matter and should be supported by relevant experiments.

Overview articles should focus in detail on specific aspects of MIR research. Overview articles will provide a comprehensive review of a broad MIR research problem, a critical evaluation of proposed techniques and/or an analysis of challenges for future research. Papers should critically engage with the relevant body of extant literature.

Datasets should present novel efforts in data gathering and annotation that have a strong potential impact in the way MIR technologies are exploited and evaluated.


If the paper extends or combines the authors’ previously published research, it is expected that there is a significant novel contribution in the submission (as a rule of thumb, we would expect at least 50% of the underlying work – the ideas, concepts, methods, results, analysis and discussion – to be new). In addition, if there is any overlapping textual material, it should be rewritten.


Review process

The journal operates a double-blind peer review process. Review criteria include originality, consideration of previous work, methodology, clarity and reproducibility.


Publication frequency

The journal is published online as a continuous volume and issue throughout the year, following an open access policy. Articles are made available as soon as they are ready to ensure that there are no unnecessary delays in getting content publicly available.


Editorial team

Editors in Chief

Simon Dixon, School of Electronic Engineering and Computer Science, Queen Mary University of London, United Kingdom

Emilia Gómez, Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain

Anja Volk, Department of Information and Computing Sciences, Utrecht University, Netherlands

Editorial Board

Juan P. Bello, Department of Music and Performing Arts Professions, & Department of Electrical and Computer Engineering, New York University, United States

Arthur Flexer, Austrian Research Institute for Artificial Intelligence (OFAI), Austria

Fabien Gouyon, Pandora, United States

Xiao Hu, Faculty of Education, Division of Information & Technology Studies, University of Hong Kong

Olivier Lartillot, Department of Musicology, University of Oslo, Norway

Jin Ha Lee, Information School, University of Washington, United States

Meinard Mueller, International Audio Laboratories Erlangen, Germany

Geoffroy Peeters, Sound Analysis/Synthesis Team, UMR STMS IRCAM CNRS, France

Markus Schedl, Department of Computational Perception, Johannes Kepler University Linz, Austria


Reviewers: The editorial board counts on reviewers from the ISMIR community, who are crucial to the success of the journal. To become a reviewer, please register here.

Journal Manager

Tim Wakeford, Ubiquity Press, United Kingdom





Filed under publications, research, Uncategorized

Paper and dataset for Choir Singing Analysis, presented at ICMPC-ESCOM

Last week, Helena Cuesta, one of the PhD students I am working with, attended the 15th International Conference on Music Perception and Cognition and 10th triennial conference of the European Society for the Cognitive Sciences of Music in Graz (Austria). She presented the following paper in the poster session, as well as a contribution to the proceedings:

Cuesta, H., Gómez, E., Martorell, A., Loáiciga, F. Analysis of Intonation in Unison Choir Singing.

ICMPC/ESCOM is a very multidisciplinary conference, bringing together people from very different fields related to music such as music psychology, music perception, neuroscience, music theory, or music information retrieval.

The study investigates several expressive characteristics of unison choir singing, focusing on how singers blend together and interact with each other in terms of fundamental frequency dispersion, intonation, and vibrato. They also present an open dataset of choral singing that is available here, and was created in collaboration with the Anton Bruckner Choir (Barcelona).

This is a picture of the recording session. This work is being carried out in the context of two research projects: CASAS and TROMPA.




Filed under datasets, publications, research

New open-access journal Transactions of ISMIR, open for submissions

I am happy to announce that the International Society for Music Information Retrieval has launched the Transactions of the International Society for Music Information Retrieval, the open-access journal of the ISMIR society, published by Ubiquity Press. I am serving as Editor-in-Chief, together with Simon Dixon and Anja Volk.

TISMIR publishes novel scientific research in the field of music information retrieval (MIR).

We welcome submissions from a wide range of disciplines: computer science, musicology, cognitive science, library & information science and electrical engineering.

We are currently accepting submissions.

View our submission guidelines for more information.



Filed under publications, research

Special Issue at IEEE Multimedia Magazine

I have been collaborating for a while now on editing a Special Issue of IEEE Multimedia Magazine, which gathers state-of-the-art research on multimedia methods and technologies aimed at enriching music performance, production and consumption.

I have had the chance to co-edit this issue with my colleagues Cynthia Liem (TU Delft, The Netherlands) and George Tzanetakis (University of Victoria, Canada), and I am very happy with the outcomes.

It is the second time I have acted as a co-editor for a journal (the first was at JNMR, on computational ethnomusicology), and I learnt a lot from the process. Editors have to ensure good submissions, good reviews and sound recommendations, while keeping the issue coherent with the theme and message we wanted to convey to our community. Yes: access, distribution and experiences in music are changing with new technologies. I am very happy with the outcomes! Check our editorial paper here, and the full issue here.

And I love the design!





Filed under publications, research

Journal paper and open dataset for source separation in Orchestra music

As part of the PHENICX project, we have recently published our research results on the task of audio source separation, which is the main research topic of one of our PhD students, Marius Miron.

During this work, we developed a method for orchestral music source separation along with a new dataset: the PHENICX-Anechoic dataset. The methods were integrated into the PHENICX project for tasks such as orchestra focus/instrument enhancement. To our knowledge, this is the first time source separation is objectively evaluated in such a complex scenario.

This is the complete reference to the paper:

M. Miron, J. Carabias-Orti, J. J. Bosch, E. Gómez and J. Janer, “Score-informed source separation for multi-channel orchestral recordings”, Journal of Electrical and Computer Engineering (2016).

Abstract: This paper proposes a system for score-informed audio source separation for multichannel orchestral recordings. The orchestral music repertoire relies on the existence of scores. Thus, a reliable separation requires a good alignment of the score with the audio of the performance. To that extent, automatic score alignment methods are reliable when allowing a tolerance window around the actual onset and offset. Moreover, several factors increase the difficulty of our task: a high reverberant image, large ensembles having rich polyphony, and a large variety of instruments recorded within a distant-microphone setup. To solve these problems, we design context-specific methods such as the refinement of score-following output in order to obtain a more precise alignment. Moreover, we extend a close-microphone separation framework to deal with the distant-microphone orchestral recordings. Then, we propose the first open evaluation dataset in this musical context, including annotations of the notes played by multiple instruments from an orchestral ensemble. The evaluation aims at analyzing the interactions of important parts of the separation framework on the quality of separation. Results show that we are able to align the original score with the audio of the performance and separate the sources corresponding to the instrument sections.

The PHENICX-Anechoic dataset includes audio and annotations useful for different MIR tasks such as score-informed source separation, score following, multi-pitch estimation, transcription or instrument detection, in the context of symphonic music. This dataset is based on the anechoic recordings described in this paper:

Pätynen, J., Pulkki, V., and Lokki, T., “Anechoic recording system for symphony orchestra,” Acta Acustica united with Acustica, vol. 94, nr. 6, pp. 856-865, November/December 2008.

For more information about the dataset and how to download it, you can access the PHENICX-Anechoic web page.


Filed under datasets, publications, research