MIDI2vec: Learning MIDI Embeddings for Reliable Prediction of Symbolic Music Metadata

Tracking #: 2725-3939

This paper is currently under review
Albert Meroño
Raphael Troncy

Responsible editor: 
Guest Editors DeepL4KGs 2021

Submission type: 
Full Paper

Abstract: 
An important problem in large symbolic music collections is the low availability of high-quality metadata, which is essential for various information retrieval tasks. Traditionally, systems have addressed this by relying either on costly human annotations or on rule-based systems of limited scale. Recently, embedding strategies have been exploited for representing latent factors in graphs of connected nodes. In this work, we propose MIDI2vec, a new approach for representing MIDI files as vectors based on graph embedding techniques. Our strategy consists of representing the MIDI data as a graph, including information about tempo, time signature, programs and notes. Next, we run and optimise node2vec for generating embeddings using random walks in the graph. We demonstrate that the resulting vectors can successfully be employed for predicting the musical genre and other metadata such as the composer, the instrument or the movement. In particular, we conduct experiments using these vectors as input to a feed-forward neural network and report accuracy scores comparable to other approaches that rely purely on symbolic music, while avoiding feature engineering and producing highly scalable, reusable models with low dimensionality. Our proposal has real-world applications in automated metadata tagging for symbolic music, for example in digital libraries for musicology, datasets for machine learning, and knowledge graph completion.
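The first two steps of the pipeline described above, turning MIDI metadata into a graph and generating random walks over it, can be sketched on toy data. The file names, feature labels and walk parameters below are illustrative placeholders, not values from the paper, and node2vec's biased second-order walks are simplified here to uniform first-order walks (the special case p = q = 1):

```python
import random

# Hypothetical MIDI content for two files (labels are illustrative):
# each file node links to nodes for its tempo, time signature,
# program (instrument) and note values, as in the MIDI2vec graph.
midi_files = {
    "file_a": {"tempo:120", "timesig:4/4", "program:0", "note:60", "note:64"},
    "file_b": {"tempo:90", "timesig:3/4", "program:40", "note:60", "note:67"},
}

# Build an undirected adjacency list connecting files to their features.
graph = {}
for f, feats in midi_files.items():
    for feat in feats:
        graph.setdefault(f, set()).add(feat)
        graph.setdefault(feat, set()).add(f)

def random_walks(graph, num_walks=10, walk_length=5, seed=42):
    """Uniform random walks over the graph (node2vec with p = q = 1).
    The walks act as 'sentences' that a skip-gram model (e.g. word2vec)
    would embed, yielding one low-dimensional vector per node."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        for start in graph:
            walk = [start]
            while len(walk) < walk_length:
                # Step to a uniformly chosen neighbour of the current node.
                walk.append(rng.choice(sorted(graph[walk[-1]])))
            walks.append(walk)
    return walks

walks = random_walks(graph)
```

The resulting walks would then be fed to a skip-gram embedder, and the vectors of the file nodes used as input features for a downstream classifier such as the feed-forward network mentioned in the abstract.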
Full PDF Version: 
Under Review