An Efficient Speaker Recognition System for Separating the Single Channel Speech Using Frequency Modulation


Separating multiple speech signals from a single-channel recording remains a major challenge for researchers. In this paper, we propose a modulation-frequency technique for single-channel speech separation. Voiced speech is perceived as having a pitch at the fundamental frequency of vibration of the vocal cords. The mixed speech is first decomposed into a two-dimensional time-frequency representation using short-time Fourier analysis (STFT), and each frequency bin is passed through a narrow band-pass filter. The cross-channel correlation does not use the fundamental frequency directly; instead, pitch estimation tools are used to obtain it. Frequency bins that contain the fundamental frequency correlate with the frequency bins that contain its harmonics, and this correlation is evaluated against a preset value. A standard speaker identification system based on Gaussian mixture models operates on a frame basis to identify the interfering speaker. Two parallel signal separation approaches are obtained, one based on correlations of modulation frequency and the other based on fundamental frequency alone. The proposed method is implemented in MATLAB and verified using various speech signals. The experimental results show that the proposed technique is more efficient at separating multiple voices from a single-channel speech signal.
Copyright © 2013 Praise Worthy Prize - All rights reserved.
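
As an illustration of two steps named in the abstract, the following is a minimal sketch and not the authors' MATLAB implementation, which is not reproduced on this page. It assumes a plain autocorrelation function (ACF) peak picker for pitch (simpler than the YIN algorithm listed in the keywords) and a Pearson correlation between channel envelopes as a stand-in for the cross-channel correlation. The function names, the 80 to 400 Hz search range, and the synthetic test signals are all hypothetical choices made for this example.

```python
import math

def acf_pitch(frame, fs, fmin=80.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of one frame from the peak
    of its (unnormalised) autocorrelation inside an assumed pitch range."""
    n = len(frame)
    lag_min = max(1, int(fs / fmax))       # shortest period considered
    lag_max = min(int(fs / fmin), n - 1)   # longest period considered
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(n - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag

def pearson(x, y):
    """Pearson correlation between two equal-length envelope sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

# Synthetic voiced frame: 200 Hz fundamental plus two harmonics (assumed data).
fs = 8000
frame = [sum(math.sin(2 * math.pi * 200 * k * i / fs) / k for k in (1, 2, 3))
         for i in range(400)]
print("estimated pitch (Hz):", acf_pitch(frame, fs))

# Synthetic channel envelopes: two modulated at 100 Hz, one at 160 Hz.
t = [i / fs for i in range(400)]
env_a = [1.0 + 0.5 * math.cos(2 * math.pi * 100 * s) for s in t]
env_b = [0.3 * (1.0 + 0.8 * math.cos(2 * math.pi * 100 * s)) for s in t]
env_c = [1.0 + 0.5 * math.cos(2 * math.pi * 160 * s) for s in t]
print("same modulation:", pearson(env_a, env_b))
print("different modulation:", pearson(env_a, env_c))
```

In this sketch, channels whose envelopes share a modulation rate correlate strongly and would be grouped with the same speaker, while channels modulated at a different rate do not. A preset threshold on the correlation value, as mentioned in the abstract, would decide the grouping; the threshold itself is not specified in the source.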


Speech Separation; Frequency Modulation; Short-time Fourier Analysis (STFT); Cross Channel Correlation; YIN Algorithm; Autocorrelation Function (ACF)



