Enhancement of Speech Signals Using Weighted Mask and Neuro-Fuzzy Classifier

Judith Justin; Ila Vennila

doi:10.15866/irecos.v8i11.3588




Abstract

In this paper, we present a noise suppression technique for enhancing speech signals using a weighted mask and a Neuro-Fuzzy classifier. First, the noisy speech signal is decomposed into time-frequency (TF) units, and features are extracted by computing the Modified Amplitude Magnitude Spectrogram (MAMS). The signal is then classified into classes by a Neuro-Fuzzy classifier, based on the ratio between the estimated spectrum value and the original spectrum value. In the enhancement stage, the filtered waveforms are windowed, weighted by the mask value, and summed to obtain the enhanced signal. The proposed technique is evaluated using PESQ (Perceptual Evaluation of Speech Quality), the IS (Itakura–Saito) distance, and MARS-based composite measures, on sound samples taken under various conditions from two databases. We compare the proposed technique, using different Neuro-Fuzzy classifiers, with the previous technique, which uses a Bayesian classifier, and find that the proposed technique achieves good results. Averaged over all noise categories at -5 dB, the proposed technique obtained a PESQ score of 0.9738, an IS score of 17.558, and a MARS-based composite measure of 3.728. The highest values obtained were 1.76 for PESQ, 96.83 for IS, and 4.147 for the MARS-based composite measure.
Copyright © 2013 Praise Worthy Prize - All rights reserved.
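
The enhancement stage described in the abstract (window the filtered waveforms, weight them by the mask value, and sum) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the band-filtered waveforms and the per-unit mask weights are already available; the function name `weighted_mask_resynthesis`, the frame length, the hop size, and the Hann window are all assumptions chosen for illustration.

```python
import numpy as np


def weighted_mask_resynthesis(channels, mask, frame_len=320, hop=160):
    """Overlap-add resynthesis from mask-weighted channel frames.

    channels: (C, N) array, one band-filtered waveform per row (assumed given).
    mask:     (C, T) array of weights, one per time-frequency unit (assumed given).
    frame_len, hop: illustrative values, not taken from the paper.
    """
    channels = np.asarray(channels, dtype=float)
    mask = np.asarray(mask, dtype=float)
    n_samples = channels.shape[1]
    n_frames = mask.shape[1]

    # Periodic Hann window: with hop == frame_len / 2 the shifted copies sum to 1,
    # so an all-ones mask returns the plain sum of the channels in the interior.
    n = np.arange(frame_len)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len)

    out = np.zeros(n_samples)
    for t in range(n_frames):
        start = t * hop
        if start >= n_samples:
            break
        stop = min(start + frame_len, n_samples)
        # Window every channel's frame, then weight by the mask and sum over channels.
        frames = channels[:, start:stop] * window[: stop - start]
        out[start:stop] += mask[:, t] @ frames
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo_channels = rng.standard_normal((4, 1600))  # stand-in for filterbank outputs
    demo_mask = rng.uniform(0.0, 1.0, size=(4, 9))  # stand-in for classifier-derived weights
    enhanced = weighted_mask_resynthesis(demo_channels, demo_mask)
    print(enhanced.shape)
```

With a half-overlapping Hann window, a unit whose mask weight is 1 passes unchanged and a unit whose weight is 0 is removed, so intermediate weights attenuate a unit smoothly rather than switching it on or off.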

Keywords

Noisy Speech Signals; Feature Extraction; Modified AMS; Neuro-Fuzzy; Enhancement of Speech Signal


