Low-Latency Instrument Separation in Polyphonic Audio Using Timbre Models

Title: Low-Latency Instrument Separation in Polyphonic Audio Using Timbre Models
Publication Type: Conference Paper
Year of Publication: 2012
Conference Name: Latent Variable Analysis and Signal Separation - 10th International Conference, LVA/ICA
Authors: Marxer, R., Janer, J., & Bonada, J.
Pagination: 314-321
Conference Start Date: 12/03/2012
Publisher: Springer Berlin / Heidelberg
Conference Location: Tel Aviv, Israel
ISBN Number: 978-3-642-28550-9
Keywords: Predominant pitch tracking, singing voice, source separation
Abstract: This research focuses on the removal of the singing voice in polyphonic audio recordings under real-time constraints. It is based on time-frequency binary masks obtained by combining spectral bin classification (by azimuth, phase difference and absolute frequency) with harmonic-derived masks. For the harmonic-derived masks, a pitch likelihood estimation technique based on Tikhonov regularization is proposed. A method for target instrument pitch tracking makes use of supervised timbre models. The approach runs in real time on off-the-shelf computers with latency below 250 ms. The method was compared to a state-of-the-art offline Non-negative Matrix Factorization (NMF) technique and to ideal binary mask separation. For the evaluation we used a dataset of multi-track versions of professional audio recordings.
Published document: files/publications/lowlatency.pdf
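
The abstract refers to time-frequency binary masks and to ideal binary mask separation as a reference point. As a rough illustration only, and not the authors' implementation, the following Python sketch builds an ideal binary mask from known source magnitudes and uses it to suppress the target. The function names, the toy values, and the assumption that magnitudes add in the mixture are all hypothetical choices made for this example.

```python
import numpy as np

def ideal_binary_mask(target_mag, accomp_mag):
    """Return 1.0 in bins where the target dominates the accompaniment, else 0.0."""
    return (target_mag > accomp_mag).astype(float)

def remove_target(mixture_spec, mask):
    """Zero the bins assigned to the target (e.g. the voice) and keep the rest."""
    return mixture_spec * (1.0 - mask)

# Toy magnitude spectrograms, shape (frequency bins, frames); values are made up.
voice = np.array([[0.9, 0.1], [0.2, 0.8]])
accomp = np.array([[0.3, 0.5], [0.6, 0.1]])
mixture = voice + accomp  # assumes magnitudes add, which holds only approximately

mask = ideal_binary_mask(voice, accomp)
karaoke = remove_target(mixture, mask)
```

The ideal mask needs the isolated sources, so it serves only as an upper-bound reference in evaluation; a practical system has to estimate the mask from the mixture alone.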