Category Archives: publications

Research on child-robot interaction

In the context of the HUMAINT (Human behaviour and machine intelligence) project, we study the impact that social robots have on children. Within this project, I have had the chance to carry out my first research in the amazing field of child-robot interaction, thanks to the collaboration with Vicky Charisi, Luis Merino and their lab at Universidad Pablo Olavide, and Honda Research Institute Japan.

Running a user study with children and robots is very challenging from a technical perspective, and analysing the resulting data is challenging as well. We have just published in Frontiers the results of our first study, in which we experimented with two strategies for child-robot interaction in a problem-solving task: turn-taking and child-initiated interaction. We showed the need for this voluntary, child-initiated interaction. You can check the details below. It is amazing to learn about and contribute to research on this topic!

Child-Robot Collaborative Problem-Solving and the Importance of Child’s Voluntary Interaction: A Developmental Perspective

Vicky Charisi, Emilia Gomez, Gonzalo Mier, Luis Merino and Randy Gomez

Abstract: The emergence and development of cognitive strategies for the transition from exploratory actions towards intentional problem-solving in children is a key question for the understanding of the development of human cognition. Researchers in developmental psychology have studied cognitive strategies and have highlighted the catalytic role of the social environment. However, it is not yet adequately understood how this capacity emerges and develops in biological systems when they perform a problem-solving task in collaboration with a robotic social agent. This paper presents an empirical study in a human-robot interaction (HRI) setting which investigates children’s problem-solving from a developmental perspective. In order to theoretically conceptualize children’s developmental process of problem-solving in HRI context, we use principles based on the intuitive theory and we take into consideration existing research on executive functions with a focus on inhibitory control. We considered the paradigm of the Tower of Hanoi and we conducted an HRI behavioral experiment to evaluate task performance. We designed two types of robot interventions, “voluntary” and “turn-taking”—manipulating exclusively the timing of the intervention. Our results indicate that the children who participated in the voluntary interaction setting showed a better performance in the problem solving activity during the evaluation session despite their large variability in the frequency of self-initiated interactions with the robot. Additionally, we present a detailed description of the problem-solving trajectory for a representative single case-study, which reveals specific developmental patterns in the context of the specific task. Implications and future work are discussed regarding the development of intelligent robotic systems that allow child-initiated interaction as well as targeted and not constant robot interventions.
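For readers unfamiliar with the Tower of Hanoi, the following minimal Python sketch generates the shortest solution for a given number of disks, which takes 2^n - 1 moves. It is only an illustration of the puzzle: the peg names, the function name `hanoi_moves` and the number of disks are assumptions, and the sketch says nothing about how task performance was actually scored in the study.

```python
def hanoi_moves(n_disks, source="A", target="C", spare="B"):
    """Return the shortest list of (from_peg, to_peg) moves for n_disks.

    Peg names are hypothetical labels chosen for this illustration.
    """
    if n_disks == 0:
        return []
    # Move the n-1 smaller disks out of the way, move the largest disk,
    # then move the n-1 smaller disks on top of it.
    return (
        hanoi_moves(n_disks - 1, source, spare, target)
        + [(source, target)]
        + hanoi_moves(n_disks - 1, spare, target, source)
    )


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, "disks:", len(hanoi_moves(n)), "moves")
```

The length of this shortest solution could serve as a baseline against which a child's number of moves is compared, although that is an assumption of this sketch rather than a description of the paper's measure.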

 

Filed under publications, research

Journal paper on AI and Music: Open Questions of Copyright Law and Engineering Praxis

I am very happy to share with you the publication of a truly interdisciplinary study on the impact of AI on music, including considerations from copyright and engineering praxis. It has been an amazing experience to collaborate with scholars in the field of creative practices, engineering and law, and I hope the paper will serve to start discussing some relevant aspects related to the use of AI in music production.

Abstract

The application of artificial intelligence (AI) to music stretches back many decades, and presents numerous unique opportunities for a variety of uses, such as the recommendation of recorded music from massive commercial archives, or the (semi-)automated creation of music. Due to unparalleled access to music data and effective learning algorithms running on high-powered computational hardware, AI is now producing surprising outcomes in a domain fully entrenched in human creativity—not to mention a revenue source around the globe. These developments call for a close inspection of what is occurring, and consideration of how it is changing and can change our relationship with music for better and for worse. This article looks at AI applied to music from two perspectives: copyright law and engineering praxis. It grounds its discussion in the development and use of a specific application of AI in music creation, which raises further and unanticipated questions. Most of the questions collected in this article are open as their answers are not yet clear at this time, but they are nonetheless important to consider as AI technologies develop and are applied more widely to music, not to mention other domains centred on human creativity.

Keywords: artificial intelligence; music; copyright; engineering; ethics

 

The paper is available in open access in the journal Arts.

Filed under publications, research

OpenBMAT: a new open dataset for music detection with loudness annotations

Last week we announced the publication of OpenBMAT, an open dataset for the tasks of music detection and relative music loudness estimation. The dataset contains 27.4 hours of audio from 8 different TV program types in 4 different countries, cross-annotated by 3 people using 6 different classes. It has been published as a dataset paper in the Transactions of the International Society for Music Information Retrieval, the open journal of ISMIR. This research has been carried out as a collaboration between the MTG and BMAT in the context of the Industrial Doctorates program of the Catalan Government.

For more information, you can read the related news on the MTG website: https://www.upf.edu/web/mtg/home/-/asset_publisher/sWCQhjdDLWwE/content/id/227864284/maximized#.XXZ_IZMzab8

 

Filed under publications, research, Uncategorized

A new paper in Frontiers on Music Conducting

I am very happy to publish this work on music conducting with Alvaro Sarasúa and Julián Urbano in Frontiers in Digital Humanities.

The paper, titled “Mapping by Observation: Building a User-Tailored Conducting System From Spontaneous Movements”, presents a music interaction system based on the conductor-orchestra metaphor, where the orchestra is considered as an instrument controlled by the movements of the conductor. In the proposed system, the user can control tempo and dynamics, and the system adapts its mapping to the user by observing spontaneous conducting movements performed on top of a fixed piece of music. In this respect, we analyze the tendency of people to anticipate or fall behind the beat, as well as the gestures they map to loudness. The system was evaluated with 24 participants in a discover-by-playing scenario.
Our work was developed in the context of the PHENICX and CASAS research projects and opens interesting directions for creating more intuitive and expressive digital musical instruments (DMIs), particularly in public installations.
You can access the open publication here.

Filed under publications, research

TISMIR Journal Launch and Call for Papers

I am blogging some news related to a project I have recently been contributing to.

It brings us great pleasure to announce the launch of the first issue of TISMIR, the Transactions of the International Society for Music Information Retrieval: https://transactions.ismir.net/

TISMIR was established to complement the widely cited ISMIR conference proceedings and provide a vehicle for the dissemination of the highest quality and most substantial scientific research in MIR. TISMIR retains the Open Access model of the ISMIR Conference proceedings, providing rapid access, free of charge, to all journal content. In order to encourage reproducibility of the published research papers, we provide facilities for archiving the software and data used in the research. The TISMIR publication model avoids excessive cost to the authors or their institutions, with article charges being less than the ISMIR Conference registration fee.

The first issue contains an editorial introducing the journal, four research papers and one dataset paper:

Editorial: Introducing the Transactions of the International Society for Music Information Retrieval – Simon Dixon,  Emilia Gómez,  Anja Volk

Multimodal Deep Learning for Music Genre Classification – Sergio Oramas,  Francesco Barbieri,  Oriol Nieto,  Xavier Serra

Learning Audio–Sheet Music Correspondences for Cross-Modal Retrieval and Piece Identification – Matthias Dorfer,  Jan Hajič jr.,  Andreas Arzt,  Harald Frostel,  Gerhard Widmer

A New Curated Corpus of Historical Electronic Music: Collation, Data and Research Findings – Nick Collins,  Peter Manning,  Simone Tarsitani

A Case for Reproducibility in MIR: Replication of ‘A Highly Robust Audio Fingerprinting System’ – Joren Six,  Federica Bressan,  Marc Leman

Pop Music Highlighter: Marking the Emotion Keypoints – Yu-Siang Huang,  Szu-Yu Chou,  Yi-Hsuan Yang

Two more papers (one research paper and one overview paper) are in press.

Authors:  We look forward to receiving new submissions to the journal – please see the Call for Papers below.

Best Regards
Simon Dixon, Anja Volk and Emilia Gómez

Editors-in-chief, TISMIR

CALL FOR PAPERS

The ISMIR Board is happy to announce the launch of the Transactions of the International Society for Music Information Retrieval (TISMIR), the open-access journal of our community.

 

TISMIR (http://tismir.ismir.net) publishes novel scientific research in the field of Music Information Retrieval (MIR), an interdisciplinary research area concerned with processing, analysing, organising and accessing music information. We welcome submissions from a wide range of disciplines, including computer science, musicology, cognitive science, library & information science, machine learning, and electrical engineering.

 

TISMIR is established to complement the widely cited ISMIR conference proceedings and provide a vehicle for the dissemination of the highest quality and most substantial scientific research in MIR. TISMIR retains the Open Access model of the ISMIR Conference proceedings, providing rapid access, free of charge, to all journal content. In order to encourage reproducibility of the published research papers, we provide facilities for archiving the software and data used in the research. TISMIR is published in electronic-only format, making it possible to offer very low publication costs to authors’ institutions, while ensuring fully open access content. With this call for papers we invite submissions for the following article types:

Article types

Research articles must describe the outcomes and application of unpublished original research. These should make a substantial contribution to knowledge and understanding in the subject matter and should be supported by relevant experiments.

Overview articles should focus in detail on specific aspects of MIR research. Overview articles will provide a comprehensive review of a broad MIR research problem, a critical evaluation of proposed techniques and/or an analysis of challenges for future research. Papers should critically engage with the relevant body of extant literature.

Datasets should present novel efforts in data gathering and annotation that have a strong potential impact in the way MIR technologies are exploited and evaluated.

 

If the paper extends or combines the authors’ previously published research, it is expected that there is a significant novel contribution in the submission (as a rule of thumb, we would expect at  least 50% of the underlying work – the ideas, concepts, methods, results, analysis and discussion – to be new). In addition, if there is any overlapping textual material, it should be rewritten.

 

Review process

The journal operates a double-blind peer review process.  Review criteria include originality, consideration of previous work, methodology, clarity and reproducibility.

 

Publication frequency

The journal is published online as a continuous volume and issue throughout the year, following an open access policy. Articles are made available as soon as they are ready to ensure that there are no unnecessary delays in getting content publicly available.

 

Editorial team

Editors in Chief

Simon Dixon, School of Electronic Engineering and Computer Science, Queen Mary University of London, United Kingdom

Emilia Gómez, Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain

Anja Volk, Department of Information and Computing Sciences, Utrecht University, Netherlands

Editorial Board

Juan P. Bello, Department of Music and Performing Arts Professions, & Department of Electrical and Computer Engineering, New York University, United States

Arthur Flexer, Austrian Research Institute for Artificial Intelligence (OFAI), Austria

Fabien Gouyon, Pandora, United States

Xiao Hu, Faculty of Education, Division of Information & Technology Studies, University of Hong Kong

Olivier Lartillot, Department of Musicology, University of Oslo, Norway

Jin Ha Lee, Information School, University of Washington, United States

Meinard Mueller, International Audio Laboratories Erlangen, Germany

Geoffroy Peeters, Sound Analysis/Synthesis Team, UMR STMS IRCAM CNRS, France

Markus Schedl, Department of Computational Perception, Johannes Kepler University Linz, Austria

 

Reviewers: The editorial board counts on reviewers from the ISMIR community, who are crucial to the success of the journal. To become a reviewer, please register here http://tismir.ubiquitypress.com/author/register/reviewer/

Journal Manager

Tim Wakeford, Ubiquity Press, United Kingdom

Contact

tismir@ismir.net

 

Website

http://tismir.ismir.net/

Filed under publications, research, Uncategorized

Paper and dataset for Choir Singing Analysis, presented at ICMPC-ESCOM

Last week, Helena Cuesta, one of the PhD students I am working with, attended the 15th International Conference on Music Perception and Cognition and 10th triennial conference of the European Society for the Cognitive Sciences of Music in Graz (Austria). She presented the following paper in the poster session, as well as a contribution to the proceedings:

Cuesta, H., Gómez, E., Martorell, A., Loáiciga, F. Analysis of Intonation in Unison Choir Singing.

ICMPC/ESCOM is a highly multidisciplinary conference, bringing together people from fields as diverse as music psychology, music perception, neuroscience, music theory, and music information retrieval.

The study investigates several expressive characteristics of unison choir singing, focusing on how singers blend together and interact with each other in terms of fundamental frequency dispersion, intonation, and vibrato. The authors also present an open dataset of choral singing that is available here, and was created in collaboration with the Anton Bruckner Choir (Barcelona).
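To give a rough idea of what fundamental frequency (F0) dispersion can mean in practice, here is a minimal Python sketch. It assumes that an F0 estimate in Hz is available for each singer at each analysis frame, with 0 or None marking unvoiced frames, and it measures dispersion as the standard deviation of the singers' pitches in cents. Both the measure and the function names are assumptions chosen for illustration, not necessarily the definition used in the paper.

```python
import math
from statistics import mean, pstdev


def hz_to_cents(f0_hz, ref_hz=440.0):
    """Convert a frequency in Hz to cents relative to ref_hz."""
    return 1200.0 * math.log2(f0_hz / ref_hz)


def f0_dispersion_cents(frame_f0s_hz):
    """Standard deviation (in cents) of the singers' F0 values in one frame.

    Unvoiced singers (0 or None) are ignored. Returns None when fewer
    than two singers are voiced, since dispersion is then undefined.
    """
    voiced = [hz_to_cents(f) for f in frame_f0s_hz if f]
    if len(voiced) < 2:
        return None
    return pstdev(voiced)


def mean_dispersion(frames):
    """Average the per-frame dispersion over all frames where it is defined."""
    values = [d for d in map(f0_dispersion_cents, frames) if d is not None]
    return mean(values) if values else None


if __name__ == "__main__":
    # Hypothetical F0 tracks: each inner list holds one frame, one value per singer.
    frames = [[220.0, 221.0, 219.5, 0], [220.5, 220.0, 221.5, 220.2]]
    print(mean_dispersion(frames))
```

Working in cents rather than Hz makes the measure independent of the sung pitch, so a unison on a low note and one on a high note can be compared directly.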

This is a picture of the recording session. This work is being carried out in the context of two research projects: CASAS and TROMPA.

[Image: the recording session]

 

Filed under datasets, publications, research

New open-access journal Transactions of ISMIR, open for submissions

I am happy to announce that the International Society for Music Information Retrieval has launched the Transactions of the International Society for Music Information Retrieval (TISMIR), the open-access journal of the ISMIR society, published by Ubiquity Press. I am serving as Editor-in-Chief, together with Simon Dixon and Anja Volk.

TISMIR publishes novel scientific research in the field of music information retrieval (MIR).

We welcome submissions from a wide range of disciplines: computer science, musicology, cognitive science,  library & information science and electrical engineering.

We are currently accepting submissions.

View our submission guidelines for more information.

Filed under publications, research

Special Issue at IEEE Multimedia Magazine

I have been collaborating for a while now on editing a Special Issue of IEEE Multimedia Magazine, which gathers state-of-the-art research on multimedia methods and technologies aimed at enriching music performance, production and consumption.

I have had the chance to co-edit this issue with my colleagues Cynthia Liem (TU Delft, The Netherlands) and George Tzanetakis (University of Victoria, Canada), and I am very happy with the outcomes.

It is the second time I have acted as a co-editor for a journal (the first one was at JNMR and related to computational ethnomusicology) and I learnt a lot from the process. Editors have to ensure good submissions, good reviews and good recommendations, while keeping the coherence and theme that we wanted to convey as a message to our community. Yes: access, distribution and experiences in music are changing with new technologies. Check our editorial paper here, and the full issue here.

And I love the design!

[Image: screenshot, 2017-03-01]

 

 

Filed under publications, research

Journal paper and open dataset for source separation in Orchestra music

As part of the PHENICX project, we have recently published our research results on the task of audio source separation, which is the main research topic of one of our PhD students, Marius Miron.

During this work, we developed a method for orchestral music source separation along with a new dataset: the PHENICX-Anechoic dataset. The methods were integrated into the PHENICX project for tasks such as orchestra focus/instrument enhancement. To our knowledge, this is the first time source separation has been objectively evaluated in such a complex scenario.

This is the complete reference to the paper:

M. Miron, J. Carabias-Orti, J. J. Bosch, E. Gómez and J. Janer, “Score-informed source separation for multi-channel orchestral recordings”, Journal of Electrical and Computer Engineering (2016).

Abstract: This paper proposes a system for score-informed audio source separation for multichannel orchestral recordings. The orchestral music repertoire relies on the existence of scores. Thus, a reliable separation requires a good alignment of the score with the audio of the performance. To that extent, automatic score alignment methods are reliable when allowing a tolerance window around the actual onset and offset. Moreover, several factors increase the difficulty of our task: a high reverberant image, large ensembles having rich polyphony, and a large variety of instruments recorded within a distant-microphone setup. To solve these problems, we design context-specific methods such as the refinement of score-following output in order to obtain a more precise alignment. Moreover, we extend a close-microphone separation framework to deal with the distant-microphone orchestral recordings. Then, we propose the first open evaluation dataset in this musical context, including annotations of the notes played by multiple instruments from an orchestral ensemble. The evaluation aims at analyzing the interactions of important parts of the separation framework on the quality of separation. Results show that we are able to align the original score with the audio of the performance and separate the sources corresponding to the instrument sections.

The PHENICX-Anechoic dataset includes audio and annotations useful for different MIR tasks such as score-informed source separation, score following, multi-pitch estimation, transcription or instrument detection, in the context of symphonic music. This dataset is based on the anechoic recordings described in this paper:

Pätynen, J., Pulkki, V., and Lokki, T., “Anechoic recording system for symphony orchestra,” Acta Acustica united with Acustica, vol. 94, nr. 6, pp. 856-865, November/December 2008.

For more information about the dataset and how to download it, you can visit the PHENICX-Anechoic web page.

Filed under datasets, publications, research

Paper on melodic similarity in flamenco now online

Our paper on melodic similarity is finally online! The paper is titled

Melodic Contour and Mid-Level Global Features Applied to the Analysis of Flamenco Cantes

This work focuses on the topic of melodic characterization and similarity in a specific musical repertoire: a cappella flamenco singing, more specifically in debla and martinete styles. We propose the combination of manual and automatic description. First, we use a state-of-the-art automatic transcription method to account for general melodic similarity from music recordings. Second, we define a specific set of representative mid-level melodic features, which are manually labelled by flamenco experts. Both approaches are then contrasted and combined into a global similarity measure. This similarity measure is assessed by inspecting the clusters obtained through phylogenetic algorithms and by relating similarity to categorization in terms of style. Finally, we discuss the advantage of combining automatic and expert annotations as well as the need to include repertoire-specific descriptions for meaningful melodic characterization in traditional music collections.

This is the result of joint work within the COFLA group, to which I contribute technologies for the automatic transcription and melodic description of music recordings.

This is an example of how we compare flamenco tonás using melodic similarity and phylogenetic trees:

[Figure: phylogenetic tree comparing flamenco tonás by melodic similarity]

And this is a video example of the type of styles we analyze in this paper, done by Nadine Kroher based on her work at the MTG:

You can read the full paper online:

http://www.tandfonline.com/doi/full/10.1080/09298215.2016.1174717

Filed under publications, research, Uncategorized