
Optimization of Subspace Decomposition Applied to Speech Dereverberation




DOI: https://doi.org/10.15866/irecap.v6i1.7509

Abstract


Reverberation degrades speech signals, which is a problem in applications such as telecommunication and hands-free terminals operating at arm's length from the talker's lips. In this work, we propose a method for dereverberating a monaural speech signal. The method consists of two steps. The first step applies subspace decomposition analysis, which decomposes the speech signal into two subspaces. The second step optimizes the sparse component in order to recover the dereverberated structure. The results demonstrate the effectiveness of our method, which outperforms other approaches.
Copyright © 2016 Praise Worthy Prize - All rights reserved.
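
The abstract does not specify the decomposition algorithm, so the following is only an illustrative sketch of the first step under stated assumptions. It assumes the two subspaces are a low-rank component and a sparse component, and that they are obtained from a non-negative time-frequency matrix (for example, a magnitude spectrogram) by principal component pursuit solved with an inexact augmented Lagrange multiplier iteration. The function names, the default weight `lam`, and the step-size schedule are choices made for this sketch, not details taken from the paper, and the paper's optimization of the sparse component is not reproduced here.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise shrinkage: move every entry toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Shrink the singular values of x by tau (proximal step for the nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def low_rank_plus_sparse(m, lam=None, tol=1e-7, max_iter=200):
    """Split m into low + sparse with an inexact augmented Lagrangian loop.

    Assumption for this sketch: m is a real 2-D array such as a
    magnitude spectrogram (frequency bins x time frames).
    """
    m = np.asarray(m, dtype=float)
    norm_fro = np.linalg.norm(m, "fro")
    if norm_fro == 0.0:
        return np.zeros_like(m), np.zeros_like(m)
    if lam is None:
        # Common default weight for the sparse term (an assumption here).
        lam = 1.0 / np.sqrt(max(m.shape))
    norm_two = np.linalg.norm(m, 2)
    mu = 1.25 / norm_two  # initial penalty parameter
    y = m / max(norm_two, np.abs(m).max() / lam)  # scaled dual variable
    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    for _ in range(max_iter):
        low = svd_threshold(m - sparse + y / mu, 1.0 / mu)
        sparse = soft_threshold(m - low + y / mu, lam / mu)
        residual = m - low - sparse
        y = y + mu * residual
        mu *= 1.5  # increase the penalty each iteration
        if np.linalg.norm(residual, "fro") <= tol * norm_fro:
            break
    return low, sparse


if __name__ == "__main__":
    # Synthetic stand-in for a spectrogram: rank-2 background plus sparse peaks.
    rng = np.random.default_rng(0)
    background = np.abs(rng.standard_normal((64, 2))) @ np.abs(rng.standard_normal((2, 100)))
    peaks = np.zeros((64, 100))
    peaks.flat[rng.choice(peaks.size, size=200, replace=False)] = 5.0
    low, sparse = low_rank_plus_sparse(background + peaks)
    print("rank of low-rank part:", np.linalg.matrix_rank(low, tol=1e-6))
    print("nonzeros in sparse part:", int(np.count_nonzero(np.abs(sparse) > 1e-6)))
```

Which of the two components carries the direct speech and which carries the reverberant tail depends on the signal representation and on the weight `lam`; the sketch returns both and leaves that interpretation open.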

Keywords


Reverberant Speech; Subspace Decomposition; Optimization of Sparse Components




