jackman.stanford.edu/blog
observations on politics, statistics, computing...

Bayesian analysis…of music

Monday October 26, 2009

Filed under: computing, statistics — jackman @ 9:48 pm

Into Bayes? Into music? Into wicked coding? Want to live in France for a couple of years? Read on below the fold.

The METISS Team at INRIA Rennes – Bretagne Atlantique, France, is offering a 2-year postdoc position on Bayesian modeling and inference of latent source features within audio signals for music information retrieval in the framework of the Quaero project (see details below). To apply, please send a CV and a motivation letter to emmanuel.vincent@irisa.fr and frederic.bimbot@irisa.fr before November 15, 2009.

Polyphonic sound description for music information retrieval

Project-team: METISS
Principal investigator: Emmanuel Vincent (emmanuel.vincent@irisa.fr)
Co-principal investigator: Frédéric Bimbot (frederic.bimbot@irisa.fr)

Description of the project:

The success of online music stores and radios (such as iTunes and Last.fm) has led to a quest for smart content distribution services. The development of advanced music information retrieval algorithms would be a major breakthrough in this context, allowing for instance accurate tagging, style classification or retrieval by similarity of audio clips.

Most music information retrieval algorithms rely on global low-level features, such as Mel-Frequency Cepstral Coefficients (MFCCs) or Pitch Class Profiles (PCPs), which model all instruments together. These algorithms are known to exhibit limited performance for a range of classification tasks. A few authors have shown that much higher performance can be achieved by extracting higher-level features for each instrument separately. Yet attempts to derive such features from polyphonic music signals have failed because of the low accuracy of source separation and polyphonic music transcription systems. This low accuracy cannot be overcome, because individual instrument sounds are inherently uncertain when several of them mask each other, e.g. when a strong drum sound masks a weaker sound or when a strong pitched note masks a weaker note at an octave interval.

The purpose of this project is to extract features describing individual instruments or sets of instruments from a musical audio signal, without attempting to perfectly separate or transcribe them. Instead, a suitable probabilistic framework will be sought so as to express the uncertainty about each instrument signal and propagate it through the feature extraction stage, thus making the features more robust to source separation or polyphonic music transcription errors. This will also make it possible to express the uncertainty about the features themselves and resolve it at a later stage using higher-level information or feedback from classification. Candidate tools to be used in this context include but are not limited to: harmonic models, Gaussian mixture models (GMMs), nonnegative matrix factorization (NMF), blind source separation (BSS), importance sampling and alternative approximate Bayesian inference techniques.
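To make the idea of propagating uncertainty through feature extraction concrete, here is a minimal sketch. It is not the METISS team's method. It assumes, purely for illustration, that a separation stage returns a mean and a standard deviation for the magnitude in each frequency bin of one instrument, treats the bins as independent Gaussians clipped at zero, and uses the spectral centroid as a stand-in feature. All names and numbers below are hypothetical.

```python
import random
import statistics


def spectral_centroid(magnitudes, freqs):
    """Magnitude-weighted mean frequency of one spectrum frame."""
    total = sum(magnitudes)
    if total == 0.0:
        return 0.0
    return sum(f * m for f, m in zip(freqs, magnitudes)) / total


def propagate_uncertainty(mean_mag, std_mag, freqs, n_samples=2000, seed=0):
    """Monte Carlo estimate of the feature's mean and standard deviation.

    Assumption (hypothetical): the uncertainty about the source spectrum is
    an independent Gaussian per bin, clipped at zero because magnitudes are
    non-negative.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        # Draw one plausible spectrum for the instrument.
        sample = [max(0.0, rng.gauss(m, s)) for m, s in zip(mean_mag, std_mag)]
        # Push the draw through the (deterministic) feature extractor.
        values.append(spectral_centroid(sample, freqs))
    return statistics.mean(values), statistics.stdev(values)


# Toy data: four frequency bins (Hz) and an estimated source spectrum.
freqs = [100.0, 200.0, 400.0, 800.0]
mean_mag = [1.0, 0.8, 0.3, 0.1]

# Case 1: the separation stage is confident about every bin.
confident = propagate_uncertainty(mean_mag, [0.01] * 4, freqs)

# Case 2: the two upper bins are masked by another instrument,
# so their estimates are much less certain.
masked = propagate_uncertainty(mean_mag, [0.01, 0.01, 0.3, 0.3], freqs)

print("confident: mean=%.1f Hz, std=%.1f Hz" % confident)
print("masked:    mean=%.1f Hz, std=%.1f Hz" % masked)
```

The feature comes out as a distribution instead of a single number, so a downstream classifier could down-weight features extracted from heavily masked regions. Plain Monte Carlo sampling is used here only because it is short; importance sampling or another approximate inference scheme from the list above could replace the sampling loop.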

The proposed framework will be primarily applied and validated for the description and the classification of musical audio within large databases. Target instruments to be separately characterized in this context include for instance singing voice, bass and drums. This will naturally lead to more accurate music classification capabilities, such as composer characterization, singer identification and retrieval by rhythmic similarity. More generally, this research will have a huge impact on the music and audio information retrieval community by providing a new general paradigm for research that extends the current paradigm based on deterministic features.

Duration and funding:

This post-doctoral position is provided for one year and renewable for a second year. It will be funded by the QUAERO project (www.quaero.org) and will involve close collaboration with academic and industrial partners.

Candidate profile:

Prospective candidates should have a background in signal processing, machine learning, applied statistics or pattern recognition. Additional expertise in the fields of audio and music is welcome. Proficiency in Matlab programming is required. Experience with C would be an asset.


Emmanuel Vincent
METISS Project-Team
INRIA Rennes – Bretagne Atlantique
Campus de Beaulieu, 35042 Rennes cedex, France
Phone: +332 9984 2269 – Fax: +332 9984 7171
Web: http://www.irisa.fr/metiss/members/evincent/

1 Comment »

  1. Ah, the path not taken.

    Comment by Steve Haptonstahl — Wednesday October 28, 2009 @ 8:52 am
