I am blogging some news related to a project I have recently been contributing to.
It brings us great pleasure to announce the launch of the first issue of TISMIR, the Transactions of the International Society for Music Information Retrieval, https://transactions.ismir.net/
TISMIR was established to complement the widely cited ISMIR conference proceedings and provide a vehicle for the dissemination of the highest quality and most substantial scientific research in MIR. TISMIR retains the Open Access model of the ISMIR Conference proceedings, providing rapid access, free of charge, to all journal content. In order to encourage reproducibility of the published research papers, we provide facilities for archiving the software and data used in the research. The TISMIR publication model avoids excessive cost to the authors or their institutions, with article charges being less than the ISMIR Conference registration fee.
The first issue contains an editorial introducing the journal, four research papers and one dataset paper:
Editorial: Introducing the Transactions of the International Society for Music Information Retrieval – Simon Dixon, Emilia Gómez, Anja Volk
Multimodal Deep Learning for Music Genre Classification – Sergio Oramas, Francesco Barbieri, Oriol Nieto, Xavier Serra
Learning Audio–Sheet Music Correspondences for Cross-Modal Retrieval and Piece Identification – Matthias Dorfer, Jan Hajič jr., Andreas Arzt, Harald Frostel, Gerhard Widmer
A New Curated Corpus of Historical Electronic Music: Collation, Data and Research Findings – Nick Collins, Peter Manning, Simone Tarsitani
A Case for Reproducibility in MIR: Replication of ‘A Highly Robust Audio Fingerprinting System’ – Joren Six, Federica Bressan, Marc Leman
Pop Music Highlighter: Marking the Emotion Keypoints – Yu-Siang Huang, Szu-Yu Chou, Yi-Hsuan Yang
Two more papers (one research paper and one overview paper) are in press.
Authors: We look forward to receiving new submissions to the journal – please see the Call for Papers below.
Simon Dixon, Anja Volk and Emilia Gómez
CALL FOR PAPERS
The ISMIR Board is happy to announce the launch of the Transactions of the International Society for Music Information Retrieval (TISMIR), the open-access journal of our community.
TISMIR (http://tismir.ismir.net) publishes novel scientific research in the field of Music Information Retrieval (MIR), an interdisciplinary research area concerned with processing, analysing, organising and accessing music information. We welcome submissions from a wide range of disciplines, including computer science, musicology, cognitive science, library & information science, machine learning, and electrical engineering.
TISMIR is established to complement the widely cited ISMIR conference proceedings and provide a vehicle for the dissemination of the highest quality and most substantial scientific research in MIR. TISMIR retains the Open Access model of the ISMIR Conference proceedings, providing rapid access, free of charge, to all journal content. In order to encourage reproducibility of the published research papers, we provide facilities for archiving the software and data used in the research. TISMIR is published in electronic-only format, making it possible to offer very low publication costs to authors’ institutions, while ensuring fully open access content. With this call for papers we invite submissions for the following article types:
Research articles must describe the outcomes and application of unpublished original research. These should make a substantial contribution to knowledge and understanding in the subject matter and should be supported by relevant experiments.
Overview articles should focus in detail on specific aspects of MIR research. Overview articles will provide a comprehensive review of a broad MIR research problem, a critical evaluation of proposed techniques and/or an analysis of challenges for future research. Papers should critically engage with the relevant body of extant literature.
Datasets should present novel efforts in data gathering and annotation that have a strong potential impact in the way MIR technologies are exploited and evaluated.
If the paper extends or combines the authors’ previously published research, it is expected that there is a significant novel contribution in the submission (as a rule of thumb, we would expect at least 50% of the underlying work – the ideas, concepts, methods, results, analysis and discussion – to be new). In addition, if there is any overlapping textual material, it should be rewritten.
The journal operates a double-blind peer review process. Review criteria include originality, consideration of previous work, methodology, clarity and reproducibility.
The journal is published online as a continuous volume and issue throughout the year, following an open access policy. Articles are made available as soon as they are ready to ensure that there are no unnecessary delays in getting content publicly available.
Editors in Chief
Simon Dixon, School of Electronic Engineering and Computer Science, Queen Mary University of London, United Kingdom
Emilia Gómez, Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain
Anja Volk, Department of Information and Computing Sciences, Utrecht University, Netherlands
Juan P. Bello, Department of Music and Performing Arts Professions, & Department of Electrical and Computer Engineering, New York University, United States
Arthur Flexer, Austrian Research Institute for Artificial Intelligence (OFAI), Austria
Fabien Gouyon, Pandora, United States
Xiao Hu, Faculty of Education, Division of Information & Technology Studies, University of Hong Kong
Olivier Lartillot, Department of Musicology, University of Oslo, Norway
Jin Ha Lee, Information School, University of Washington, United States
Meinard Mueller, International Audio Laboratories Erlangen, Germany
Geoffroy Peeters, Sound Analysis/Synthesis Team, UMR STMS IRCAM CNRS, France
Markus Schedl, Department of Computational Perception, Johannes Kepler University Linz, Austria
Reviewers: The editorial board counts on reviewers from the ISMIR community, who are crucial to the success of the journal. To become a reviewer, please register here http://tismir.ubiquitypress.com/author/register/reviewer/
Tim Wakeford, Ubiquity Press, United Kingdom
Last week, Helena Cuesta, one of the PhD students I am working with, attended the 15th International Conference on Music Perception and Cognition and 10th triennial conference of the European Society for the Cognitive Sciences of Music in Graz (Austria). She presented the following paper in the poster session, as well as a contribution to the proceedings:
Cuesta, H., Gómez, E., Martorell, A., Loáiciga, F. Analysis of Intonation in Unison Choir Singing.
ICMPC/ESCOM is a highly multidisciplinary conference, bringing together people from many music-related fields, such as music psychology, music perception, neuroscience, music theory and music information retrieval.
The study investigates several expressive characteristics of unison choir singing, focusing on how singers blend together and interact with each other in terms of fundamental frequency dispersion, intonation, and vibrato. They also present an open dataset of choral singing that is available here, and was created in collaboration with the Anton Bruckner Choir (Barcelona).
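To make the notion of fundamental frequency dispersion more concrete, here is a minimal sketch of one possible way to measure it: the per-frame standard deviation, in cents, of the singers' F0 values. This is only an illustration under my own assumptions, not necessarily the measure used in the paper; the function names and the F0 values are hypothetical.

```python
import math
from statistics import mean, pstdev


def hz_to_cents(f0_hz, ref_hz=440.0):
    """Convert a frequency in Hz to cents relative to a reference frequency."""
    return 1200.0 * math.log2(f0_hz / ref_hz)


def f0_dispersion_cents(f0_per_singer):
    """Per-frame standard deviation (in cents) of F0 across singers.

    f0_per_singer: list of equal-length lists, one F0 track (Hz) per singer.
    Frames where any singer is unvoiced (F0 <= 0) are skipped.
    """
    dispersion = []
    for frame in zip(*f0_per_singer):
        if all(f > 0 for f in frame):
            cents = [hz_to_cents(f) for f in frame]
            dispersion.append(pstdev(cents))
    return dispersion


# Hypothetical F0 tracks (Hz) for three singers over four frames; 0.0 = unvoiced.
tracks = [
    [220.0, 220.0, 221.0, 0.0],
    [220.0, 222.0, 219.0, 220.0],
    [220.0, 218.0, 220.0, 220.0],
]
per_frame = f0_dispersion_cents(tracks)  # the last frame is skipped
mean_dispersion = mean(per_frame)        # one summary value for the excerpt
```

Working in cents rather than Hz keeps the measure comparable across pitch ranges, since a given deviation in cents corresponds to the same musical interval regardless of register.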
I am happy to announce that the International Society for Music Information Retrieval has launched the Transactions of the International Society for Music Information Retrieval, the open access journal of the ISMIR society, published by Ubiquity Press. I am serving as Editor-in-Chief, together with Simon Dixon and Anja Volk.
TISMIR publishes novel scientific research in the field of music information retrieval (MIR).
We welcome submissions from a wide range of disciplines: computer science, musicology, cognitive science, library & information science and electrical engineering.
We are currently accepting submissions.
View our submission guidelines for more information.
I have been collaborating for a while now on editing a special issue of IEEE MultiMedia magazine, which gathers state-of-the-art research on multimedia methods and technologies aimed at enriching music performance, production and consumption.
This is the second time I have acted as co-editor for a journal (the first was at JNMR, on computational ethnomusicology), and I learnt a lot from the process. Editors have to ensure good submissions, good reviews and good recommendations, while keeping the coherence and theme that we wanted to convey to our community. Yes: access, distribution and experiences in music are changing with new technologies. I am very happy with the outcome! Check our editorial paper here, and the full issue here.
And I love the design!
As part of the PHENICX project, we have recently published our research results on the task of audio source separation, which is the main research topic of one of our PhD students, Marius Miron.
During this work, we developed a method for orchestral music source separation along with a new dataset: the PHENICX-Anechoic dataset. The methods were integrated into the PHENICX project for tasks such as orchestra focus/instrument enhancement. To our knowledge, this is the first time source separation has been objectively evaluated in such a complex scenario.
This is the complete reference to the paper:
M. Miron, J. Carabias-Orti, J. J. Bosch, E. Gómez and J. Janer, “Score-informed source separation for multi-channel orchestral recordings”, Journal of Electrical and Computer Engineering (2016).
Abstract: This paper proposes a system for score-informed audio source separation for multichannel orchestral recordings. The orchestral music repertoire relies on the existence of scores. Thus, a reliable separation requires a good alignment of the score with the audio of the performance. To that extent, automatic score alignment methods are reliable when allowing a tolerance window around the actual onset and offset. Moreover, several factors increase the difficulty of our task: a high reverberant image, large ensembles having rich polyphony, and a large variety of instruments recorded within a distant-microphone setup. To solve these problems, we design context-specific methods such as the refinement of score-following output in order to obtain a more precise alignment. Moreover, we extend a close-microphone separation framework to deal with the distant-microphone orchestral recordings. Then, we propose the first open evaluation dataset in this musical context, including annotations of the notes played by multiple instruments from an orchestral ensemble. The evaluation aims at analyzing the interactions of important parts of the separation framework on the quality of separation. Results show that we are able to align the original score with the audio of the performance and separate the sources corresponding to the instrument sections.
The PHENICX-Anechoic dataset includes audio and annotations useful for different MIR tasks, such as score-informed source separation, score following, multi-pitch estimation, transcription or instrument detection, in the context of symphonic music. This dataset is based on the anechoic recordings described in this paper:
Pätynen, J., Pulkki, V., and Lokki, T., “Anechoic recording system for symphony orchestra,” Acta Acustica united with Acustica, vol. 94, nr. 6, pp. 856-865, November/December 2008.
Our paper on melodic similarity is finally online! The paper is titled
Melodic Contour and Mid-Level Global Features Applied to the Analysis of Flamenco Cantes
This is the result of joint work by the COFLA group, to which I contribute technologies for the automatic transcription and melody description of music recordings.
This is an example of how we compare flamenco tonás using melodic similarity and phylogenetic trees:
And this is a video example of the type of styles we analyze in this paper, done by Nadine Kroher based on her work at the MTG:
You can read the full paper online:
Over the last few months, several journal publications related to our research on flamenco & technology have finally gone online.
One of them is joint work with my former PhD student, Nadine Kroher (who has since moved to Universidad de Sevilla), on the automatic transcription of flamenco singing. Flamenco singing is really challenging in terms of computational modelling, given its ornamented character and variety, and we have designed a system for its automatic transcription, focusing on polyphonic recordings.
The proposed system outperforms state-of-the-art singing transcription systems with respect to voicing accuracy, onset detection, and overall performance when evaluated on flamenco singing datasets. We hope it will be a contribution not only to flamenco research but also to research on other singing styles.
You can read about our algorithm in the paper we published in IEEE TASP, where we present the method, evaluation strategies and a comparison with state-of-the-art approaches. You can not only read about it but actually try it, as we have published open-source software for the algorithm, plus a music dataset for its comparative evaluation, cante2midi (I will talk about flamenco corpora in another post). All of this is meant to foster research reproducibility and motivate people to work on flamenco music.
Publication: Jan.-Mar. 2017
Submission deadline: Feb. 1st 2016
Internet access, mobile devices, social networks, and automated multimedia technologies enabling sophisticated information analysis and access have radically changed the ways in which people find entertainment, discover new interests, and generally express themselves online — seemingly without any physical or social barriers. Thanks to the increasing affordability of sensing, storage, and sharing, we note that information takes increasingly rich and hybrid multimedia forms, in which multimodal information streams co-occur in various social consumption settings.
This phenomenon also has enabled opportunities in the music domain. In music performance, novel opportunities for expression are found, exploiting (live) analysis and novel interaction mechanisms with musical data in multiple modalities. In music production, sophisticated multimedia data analysis techniques can both lead to more efficient and scalable workflows, as well as richer and better interfaces. In music consumption, the music data richness and its contextual and social embedding lead to novel consumer experiences stimulating music appreciation. Concerts turn into multimodal, multiperspective, and multilayer digital artifacts that can be easily explored, customized, personalized, (re)enjoyed and shared among various types of users; similar notions and opportunities hold for the consumption of general music recordings.
The goal of this special issue is to gather state-of-the-art research on multimedia methods and technologies aimed at enriching music performance, production and consumption. We solicit novel, original work that is not published or under review elsewhere.
Topics of interest include, but are not limited to:
- Processing of multimodal music data streams (e.g. audio, video, images, score, text, gesture…) for music performance, production and/or consumption
- Multimedia content description and indexing for music performance, production and/or consumption
- Multimedia information retrieval methods for music performance, production and/or consumption
- Novel interaction mechanisms for music performance, production and/or consumption
- Novel user interfaces for music performance, production and/or consumption
- Novel user experience paradigms for music performance, production and/or consumption
- Social networking and sharing for music performance, production and/or consumption
- Digital mechanisms for remote music performers and audiences
- Active listening, audience immersion, and inclusion of new music audiences
- User-awareness, personalization and intent in music performance, production and/or consumption
- Context-awareness and automatic context adaptation in music performance, production and/or consumption
See www.computer.org/web/peer-review/magazines. Submissions should not exceed 6,500 words, with each table and figure counting for 200 words. Manuscripts should be submitted electronically (https://mc.manuscriptcentral.com/mm-cs), selecting this special issue option.
- Cynthia C. S. Liem, Delft University of Technology
- Emilia Gómez, Universitat Pompeu Fabra
- George Tzanetakis, University of Victoria
Last Wednesday I presented a poster at the Ninth Triennial Conference of the European Society for the Cognitive Sciences of Music (ESCOM 2015), which took place at the Royal Northern College of Music, Manchester, UK. It was a very interesting conference, including a very nice symposium on understanding musical audiences and inspiring talks on music education, psychology and wellbeing. I was really impressed by how music can improve quality of life, from the early years to the end of our lives.
In this study we analysed the emotions that listeners perceive when listening to Beethoven's Symphony No. 3, Eroica, a PHENICX target piece, played by the Royal Concertgebouw Orchestra Amsterdam. We then quantified the correlation between listeners’ perceived emotions from music and 1) musical descriptors, and 2) listeners’ backgrounds (country of origin, musical knowledge, exposure to classical music and knowledge of Eroica).
One conclusion of this study is that tonal strength (i.e. key clarity) correlates significantly with listener ratings of peacefulness, joyful activation, tension and sadness. Other significant correlations between emotion ratings and musical descriptors agree with the literature. This agrees with our hypothesis, the excerpts being different parts of the same musical piece.
But there are two other unexpected and interesting findings that we might need to research further.
First, we found out that listeners of varying backgrounds agree most on their ratings of sadness, compared to other emotions. Would that be similar for other musical pieces?
Second, listeners with similarly unmusical backgrounds, and young listeners, recognise similar emotions in the same music. In contrast, listeners with more musical experience recognise different emotions in the same music. Could this be caused by personal biases?
Interesting results that might corroborate the need for personalisation in music recommendation engines!
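For readers curious about the kind of analysis described above, here is a minimal sketch of computing a Pearson correlation between a musical descriptor and an emotion rating. It is an illustration only: the post does not say which correlation measure or tooling was used in the study, and the numbers below are hypothetical placeholders, not data from it.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of co-deviations and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values, one per musical excerpt (NOT data from the study):
key_clarity = [0.82, 0.75, 0.40, 0.55, 0.91, 0.33]   # tonal strength descriptor
peacefulness = [4.1, 3.8, 2.2, 3.0, 4.5, 2.6]        # mean listener rating

r = pearson_r(key_clarity, peacefulness)
```

A coefficient like this only describes the strength of a linear relationship; deciding whether it is significant needs a further step, such as a permutation test or a p-value from a statistics package.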