Last year I started collaborating with my colleague Gloria Haro, from UPF, who works on image processing, to combine audio and image descriptors for music analysis. We had a student who worked on this for several months, and we are now opening a PhD position to advance the topic further.
Anyone interested please apply! This is the official call:
The Music Technology Group (MTG) and the Image Processing Group (GPI) of the Department of Information and Communication Technologies, Universitat Pompeu Fabra in Barcelona, are opening a joint PhD fellowship on the topic of “Audio-Visual Approaches for Music Content Description” to start in the Fall of 2015.
Music is a highly multimodal concept, where various types of heterogeneous information are associated with a music piece (audio, musicians’ gestures and facial expressions, lyrics, etc.). This has recently led researchers to approach music through its various facets, giving rise to multimodal music analysis studies (Essid and Richard, 2012).
The goal is to research the complementarity of audio and image description technologies in order to improve the accuracy and meaningfulness of state-of-the-art music description methods, which are at the core of content-based music information retrieval tasks.
Several standard tasks could benefit from it:
- Synchronization of audio / video streams
- Audio-visual quality assessment
- Structural analysis and segmentation
- Discovery of repeated themes & sections
- Automatic video mashup generation
- Music similarity computation
- Genre / style classification
- Artist identification
- Emotion (mood) characterization
- Optical music recognition (OMR)
Supervisors: Emilia Gómez (MTG) / Gloria Haro (GPI)
Applicants should have experience in audio and image signal processing, and hold an MSc in a related field (e.g. telecommunications, electrical engineering, mathematics, physics or computer science). Experience in scientific programming (Matlab/Python/C++) and excellent English are essential. A musical background and expertise in multimedia information retrieval are also valuable.
The grant involves teaching assistance (up to 60 h a year), so an interest in teaching is also valued.
More information on grant details:
Provisional starting date: November 2015
Interested candidates should send a motivation letter, a CV (preferably with references), and academic transcripts to Prof. Emilia Gómez (email@example.com) and Prof. Gloria Haro (firstname.lastname@example.org) before September 10th. Please include [PhD Audio-Visual] in the subject line.
Candidates will also have to apply to the PhD program of the DTIC of the UPF.
- S. Essid and G. Richard, “Fusion of Multimodal Information in Music Content Analysis”, in M. Müller, M. Goto and M. Schedl (Eds) “Multimodal Music Processing”, Dagstuhl Follow-ups, volume 3, pp. 37-53, ISBN 978-3-939897-37-8, 2012.
- M. Müller, M. Goto and M. Schedl (Eds) “Multimodal Music Processing”, Dagstuhl Follow-ups, volume 3, ISBN 978-3-939897-37-8, 2012.
- A. Schindler and A. Rauber, “A Music Video Information Retrieval Approach to Artist Identification”, CMMR, 2013.
- Y. Wang, Z. Liu and J.C. Huang, “Multimedia content analysis-using both audio and visual clues”, IEEE Signal Processing Magazine, volume 17, November 2000. doi:10.1109/79.888862
- Y. Wu, T. Mei, Y.-Q. Xu, N. Yu and S. Li, “MoVieUp: Automatic Mobile Video Mashup”, IEEE Transactions on Circuits and Systems for Video Technology, 2015.