Category Archives: projects

Two new industrial PhD projects

Two PhD students, Blai Meléndez-Català and Andrés Pérez-López, are joining my lab thanks to the industrial doctorate program from AGAUR, which supports collaboration between universities and industrial partners, in this case both from Barcelona. These students will work at the company but come to the lab for some time to interact and collaborate with us.

I will be the main academic supervisor of these projects, both linked to our research on audio processing and description, dealing with large audio datasets and focusing on two particular problems:

  • “Music/Speech Detection in Broadcast Media Programs”, in collaboration with BMAT, in particular with Emilio Molina. Blai Meléndez-Català is our PhD fellow, and the goal of this project is to research the task of audio segmentation and tagging in the context of audiovisual recordings.
  • “Immersive Audiovisual Production Enhancement based on 3D Audio”, in collaboration with Fundación Eurecat, in particular with the audio-visual technologies group led by Adan Garriga. This project is related to 3D audio for virtual reality applications, and Andrés Pérez is a new PhD student who will research innovative production tools for the creative industries.

There is some more info (in Catalan or Spanish) on the UPF web site.


Filed under projects

New project on MIR & singing: CASAS

At my lab we are starting a new project where we integrate our expertise in singing voice processing and music information retrieval to generate tools for choir singers.

CASAS (Community-Assisted Singing Analysis and Synthesis) is a project funded by the Ministry of Economy and Competitiveness of the Spanish Government (TIN2015-70816-R), which started on January 1st, 2016 and will end on December 31st, 2018.

Humans use singing to create identity, express emotion, tell stories, exercise creativity, and connect with each other while singing together. This is demonstrated by the large community of music singers active in choirs and the fact that vocal music makes up an important part of our cultural heritage. Currently, an increasing amount of music resources is becoming digital, and the Web has become an important tool for singers to discover and study music, as a feedback resource and as a way to share their singing performances. The CASAS project has two complementary goals:

  • The first one is to improve state-of-the-art technologies that assist singers in their musical practice. We research algorithms for singing analysis and synthesis (e.g., automatic transcription, description, synthesis, classification and visualization), following a user-centered perspective, with the goal of making them more robust, scalable and musically meaningful.
  • The second one is to enhance current public-domain vocal music archives and create research data for our target music information retrieval (MIR) tasks. Our project puts special emphasis on choral repertoire in Catalan and Spanish.

We exploit our current methods for Music Information Retrieval and Singing Voice Processing, and we involve a community of singers that use our technologies and provide their evaluations, ground truth data and relevance feedback.
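As a toy illustration of the kind of singing analysis the project targets (this is not the project's actual code, whose methods are far more elaborate), here is a minimal autocorrelation-based pitch estimator, one of the basic building blocks behind automatic singing transcription. The function name and the synthetic test tone are my own assumptions:

```python
import numpy as np

def estimate_f0(frame, sr, fmin=80.0, fmax=1000.0):
    """Very rough monophonic pitch estimate: pick the autocorrelation
    peak within the plausible vocal lag range."""
    frame = frame - np.mean(frame)
    # Keep only non-negative lags of the full autocorrelation.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

# Stand-in for a short frame of a sung note: 50 ms of a pure 200 Hz tone.
sr = 16000
t = np.arange(int(0.05 * sr)) / sr
note = np.sin(2 * np.pi * 200.0 * t)

print(round(estimate_f0(note, sr)))  # -> 200
```

Real singing voice is of course far messier than a sine tone (vibrato, breathiness, consonants), which is why robustness is one of the project's stated goals.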

I designed my first logo, inspired by choirs, audio and “houses”, which is the English translation of “casas”. It will be an amazing project!


Filed under projects, research

FAST project: Acoustic and semantic technologies for intelligent music production and consumption

Yesterday I arrived from Paris, where I attended, as an Advisory Board member, a meeting of the FAST Project (www.semanticaudio.ac.uk).

FAST-IMPACT stands for “Fusing Acoustic and Semantic Technologies for Intelligent Music Production and Consumption”, and it is funded by the EPSRC (Engineering and Physical Sciences Research Council, UK) with £5,199,944 (side note: OMG, this is real funding; they should take note at the new Spanish Agencia Estatal para la Investigación).

According to their web site, “this five-year EPSRC project brings the very latest technologies to bear on the entire recorded music industry, end-to-end, producer to consumer, making the production process more fruitful, the consumption process more engaging, and the delivery and intermediation more automated and robust”. It addresses three main premises:

(i) that Semantic Web technologies should be deployed throughout the content value chain from producer to consumer;

(ii) that advanced signal processing should be employed in the content production phases to extract “pure” features of perceptual significance and represent these in standard vocabularies;

(iii) that this combination of semantic technologies and content-derived metadata leads to advantages (and new products and services) at many points in the value chain, from recording studio to end-user (listener) devices and applications.

The project is led by Dr. Mark Sandler, Queen Mary University of London, and includes as project participants the University of Nottingham (led by Dr. Steve Benford), the University of Oxford (led by Dr. David De Roure), Abbey Road Studios, BBC R&D, the Internet Archive, Microsoft Research and the International Audio Laboratories Erlangen.

The results for this first year are amazing, as can be seen on the web: publications and scientific and technological outcomes but, more importantly, great and inspiring ideas!

I am honoured to be part of the advisory board with such excellent researchers and to contribute to the project as much as I can.


Filed under projects, research, Uncategorized

PhD fellowship on Audio-Visual Approaches for Music Information Retrieval at UPF

Last year I started to collaborate with my colleague Gloria Haro, from UPF, who works on image processing, trying to combine audio and image descriptors for music analysis. We had a student who worked on this for several months, and we are now opening a PhD position to advance further in the topic.

Anyone interested please apply! This is the official call:

——

The Music Technology Group (MTG) and the Image Processing Group (GPI) of the Department of Information and Communication Technologies, Universitat Pompeu Fabra in Barcelona are opening a joint PhD fellowship on the topic of “Audio-Visual Approaches for Music Content Description”, to start in the Fall of 2015.

Motivation:

Music is a highly multimodal concept, where various types of heterogeneous information are associated to a music piece (audio, musician’s gestures and facial expression, lyrics, etc.). This has recently led researchers to apprehend music through its various facets, giving rise to multimodal music analysis studies (Essid and Richard, 2012).

Goal:

Research the complementarity of audio and image description technologies to improve the accuracy and meaningfulness of state-of-the-art music description methods. These methods are the core of content-based music information retrieval tasks.

Several standard tasks could benefit from it:

  • Synchronization of audio / video streams
  • Audio-visual quality assessment
  • Structural analysis and segmentation
  • Discovery of repeated themes & sections
  • Automatic video mashup generation
  • Music similarity computation
  • Genre / style classification
  • Artist identification
  • Emotion (mood) characterization
  • Optical music recognition (OMR)

Supervisors: Emilia Gómez (MTG) / Gloria Haro (GPI)

Requirements:

Applicants should have experience in audio and image signal processing, and hold a MSc in a related field (e.g. telecommunications, electrical engineering, mathematics, physics or computer science). Experience in scientific programming (Matlab/Python/C++) and excellent English are essential. Musical background and expertise on multimedia information retrieval are also valuable.

The grant involves teaching assistance (up to 60 hours a year), so an interest in teaching is also valued.

More information on grant details:

http://www.upf.edu/dtic_doctorate/

http://www.upf.edu/dtic_doctorate/phd_fellowships.html

Provisional starting date: November 2015

Application:

Interested candidates should send a motivation letter, a CV (preferably with references) and academic transcripts to Prof. Emilia Gómez (emilia.gomez@upf.edu) and Prof. Gloria Haro (gloria.haro@upf.edu) before September 10th. Please include [PhD Audio-Visual] in the subject line.

Candidates will also have to apply to the PhD program of the DTIC at UPF.

References

  • S. Essid and G. Richard, “Fusion of Multimodal Information in Music Content Analysis”, in M. Müller, M. Goto and M. Schedl (Eds.), “Multimodal Music Processing”, Dagstuhl Follow-Ups, volume 3, pp. 37-53, ISBN 978-3-939897-37-8, 2012.
  • M. Müller, M. Goto and M. Schedl (Eds.), “Multimodal Music Processing”, Dagstuhl Follow-Ups, volume 3, ISBN 978-3-939897-37-8, 2012.
  • A. Schindler and A. Rauber, “A Music Video Information Retrieval Approach to Artist Identification”, CMMR, 2013.
  • Y. Wang, Z. Liu and J.-C. Huang, “Multimedia Content Analysis Using Both Audio and Visual Clues”, IEEE Signal Processing Magazine, 17(6), November 2000. doi:10.1109/79.888862
  • Y. Wu, T. Mei, Y.-Q. Xu, N. Yu and S. Li, “MoVieUp: Automatic Mobile Video Mashup”, IEEE Transactions on Circuits and Systems for Video Technology, 2015.


Filed under projects

Paper on “My musical avatar”

We identify with the type of music we like, and we sometimes use music to define our personality. I guess one of the questions I ask any new person I meet is “what kind of music do you listen to?”.

During the last few years, I have been taking part in a research project whose main goal is to visualize one’s musical preferences: “The Musical Avatar”. The idea behind it is to use computational tools to automatically describe your music (in audio format) in terms of melody, instrumentation, rhythm, etc., and to use this information to build an iconic representation of your musical preferences and to recommend you new music. The whole system is based only on content description, i.e. on the signal itself, and not on information about the music (context) as found on web sites, etc. And it works! 🙂
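To give a flavour of what “content description” means here, below is a loose sketch (not the actual Musical Avatar pipeline, which relies on many higher-level semantic descriptors) of one classic low-level timbre feature, the spectral centroid, computed from the signal alone with plain NumPy. The function name and the pure test tone are my own stand-ins for real audio:

```python
import numpy as np

def spectral_centroid(signal, sr):
    """Magnitude-weighted mean frequency of the spectrum:
    a crude 'brightness' cue used in timbre description."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    return float(np.sum(freqs * spectrum) / np.sum(spectrum))

# Stand-in for a real recording: one second of a pure A4 (440 Hz) tone.
sr = 22050
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)

print(round(spectral_centroid(tone, sr)))  # -> 440
```

For a pure tone the centroid sits at the tone's frequency; for real music, features like this are aggregated over time and mapped to semantic labels before anything like an avatar can be drawn.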

We finally published a paper describing the technology behind it and its scientific evaluation in the journal Information Processing & Management. This is the complete reference:

Dmitry Bogdanov, Martín Haro, Ferdinand Fuhrmann, Anna Xambó, Emilia Gómez, Perfecto Herrera, “Semantic audio content-based music recommendation and visualization based on user preference examples”, Information Processing & Management, Volume 49, Issue 1, pp. 13-33, January 2013.

There is much to improve, but you can see my musical avatar below. Can you guess what my favorite music sounds like? You can of course build yours from your Last.fm profile here.

Emilia's musical avatar

My automatically generated musical avatar

Highlights

  • We propose a preference elicitation technique based on explicit preference examples.
  • We study audio-based approaches to music recommendation and preference visualization.
  • Approaches based on semantics inferred from audio surpass low-level timbre methods.
  • Such approaches are close to metadata-based systems, being suitable for music discovery.
  • The proposed visualization captures the core musical preferences of the participants.


Filed under projects, research