This week, I am attending a focused seminar on Multimodal Music Processing. The organizers managed to gather an amazing group of researchers from different areas of music technology. We are discussing the challenges of combining different modalities of information in music processing systems.
What do we mean by multi-modality?
A “modality” can be defined as “any of the various types of sensation, such as vision or hearing” or even “any of the five senses”. How can we take advantage of information from our five senses in music processing systems? If we also consider the “context” and the “user” as information sources, we then have a huge amount of information to combine efficiently.
This is the main challenge of the whole area: handling and combining data and information for a particular task.
I tried to apply multimodality to my current project on flamenco music, and I realized we are facing a multi-modal, multi-disciplinary and multi-cultural problem.
– Multi-modal: we are dealing with the integration of different knowledge sources: music, expression, context, cultural information, anthropological data, listener judgments, text, image and movement.
– Multi-disciplinary: each modality is formalized differently, so how can we formalize knowledge from other disciplines in music processing systems? The disciplines involved are music content processing, knowledge discovery, musicology (flamenco scholars), cognition, anthropology and literature.
– Multi-cultural: how can we refine music processing systems so that they are meaningful to people from different musical backgrounds and cultures?
So I am gathering nice ideas at this seminar!