Bayesian audio source separation
Authors | C. Févotte |
Journal/Conference Name | Blind speech separation |
Paper Category | ECE |
Paper Abstract | In this chapter we describe a Bayesian approach to audio source separation. The approach relies on probabilistic modeling of sound sources as (sparse) linear combinations of atoms from a dictionary and Markov chain Monte Carlo (MCMC) inference. Several prior distributions are considered for the source expansion coefficients. We first consider independent and identically distributed (iid) general priors with two choices of distributions. The first one is the Student t, which is a good model for sparsity when the shape parameter has a low value. The second one is a hierarchical mixture distribution; conditionally upon an indicator variable, one coefficient is either set to zero or given a normal distribution, whose variance is in turn given an inverted-Gamma distribution. Then, we consider more audio-specific models where both the identically distributed and independently distributed assumptions are lifted. Using a Modified Discrete Cosine Transform (MDCT) dictionary, a time-frequency orthonormal basis, we describe frequency-dependent structured priors which explicitly model the harmonic structure of sound, using a Markov hierarchical modeling of the expansion coefficients. Separation results are given for a stereophonic recording of 3 sources. |
Date of publication | 2007 |
Code Programming Language | MATLAB |
Comment |
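
The two iid sparse priors named in the abstract can be illustrated by sampling from them. The following is a minimal Python sketch, not the author's MATLAB code. The function names, the parameter values (degrees of freedom, activity probability, inverted-Gamma shape and scale) and the shape/scale parameterization of the inverted-Gamma distribution are all assumptions made for illustration.

```python
import math
import random


def sample_student_t(nu, scale, rng):
    """Draw one Student t sample (nu degrees of freedom) as a normal scale mixture."""
    # A chi-square variable with nu degrees of freedom is Gamma(shape=nu/2, scale=2).
    chi2 = rng.gammavariate(nu / 2.0, 2.0)
    return scale * rng.gauss(0.0, 1.0) / math.sqrt(chi2 / nu)


def sample_mixture(p_active, ig_shape, ig_scale, rng):
    """Draw one coefficient from the hierarchical mixture prior.

    An indicator decides whether the coefficient is exactly zero; otherwise it is
    normal with a variance drawn from an inverted-Gamma distribution
    (assumed parameterization: shape ig_shape, scale ig_scale).
    """
    if rng.random() >= p_active:
        return 0.0  # inactive coefficient
    # If X ~ Gamma(shape=a, scale=1/b), then 1/X ~ inverted-Gamma(shape=a, scale=b).
    variance = 1.0 / rng.gammavariate(ig_shape, 1.0 / ig_scale)
    return rng.gauss(0.0, math.sqrt(variance))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
t_draws = [sample_student_t(3.0, 1.0, rng) for _ in range(2000)]
mix_draws = [sample_mixture(0.1, 2.0, 1.0, rng) for _ in range(2000)]

# Fraction of mixture coefficients that are exactly zero (about 1 - p_active).
zero_fraction = sum(1 for x in mix_draws if x == 0.0) / len(mix_draws)
print(f"exact zeros in mixture draws: {zero_fraction:.2f}")
```

The mixture prior produces exact zeros, whereas the Student t prior only concentrates mass near zero with heavy tails; lowering `nu` makes those tails heavier.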