In this paper, we investigate a source-filter-based approach to single-channel speech separation. We incorporate source-driven aspects into the model-driven method by means of multi-pitch estimation, for which a factorial HMM is used. The vocal tract filters are modeled with either vector quantization (VQ) or non-negative matrix factorization (NMF). For both methods, combining the source and filter models yields an utterance-dependent model that enables speaker-independent source separation. The contributions of the paper are the multi-pitch tracker, a gain estimation for the VQ-based method that accounts for different mixing levels, and a fast approximation for the likelihood computation. Additionally, a linear relationship between pitch tracking performance and speech separation performance is shown.
|Journal||IEEE Transactions on Audio, Speech, and Language Processing|
|Publication status||Published - 2011|