An Effective Tamil Speech Word Recognition Technique with Aid of MFCC and HMM (Hidden Markov Model)

S. Rojathai; M. Venkatesulu

doi:10.15866/irecos.v8i2.3127

Abstract

Speech recognition is the task of transcribing speech without necessarily understanding the meaning of the utterance. Recognition and understanding can be combined, but the task addressed here is recognition only. This paper proposes a new HMM-based method for recognizing spoken Tamil words. In the proposed method, the input speech signals are first preprocessed to reduce noise. MFCC feature vectors are then extracted from the preprocessed signals and passed to a Hidden Markov Model (HMM). The HMM is a natural and highly efficient statistical technique for automatic speech recognition, and it has been tested and proven in a wide variety of applications. Its model parameters describe the behavior of the utterance. The HMM is a dominant tool in speech recognition and yields high accuracy in spoken word recognition. Based on the input word features, the HMM recognizes the input words more precisely. A set of input words is used in the recognition experiments, and the results indicate the robustness of the proposed technique. The implementation results show that the proposed method recognizes the input speech words effectively and improves recognition accuracy.
Copyright © 2013 Praise Worthy Prize - All rights reserved.
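
The recognition step outlined in the abstract (feature sequence in, most likely word out) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one discrete-observation HMM per word, and it assumes the MFCC vectors have already been quantized to codebook indices (0, 1, 2); the abstract specifies neither. The word labels `word_a` and `word_b` and all probabilities are hypothetical toy values.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    """Natural log that maps a zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_likelihood(obs, start, trans, emit):
    """log P(obs | model) via the forward algorithm in the log domain.

    obs   : list of symbol indices (assumed: quantized MFCC frames)
    start : start[s]    = P(first state is s)
    trans : trans[p][s] = P(next state is s | current state is p)
    emit  : emit[s][o]  = P(symbol o | state s)
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)

def recognize(obs, models):
    """Return the word whose HMM assigns the highest likelihood to obs."""
    return max(models, key=lambda w: forward_log_likelihood(obs, *models[w]))

# Hypothetical two-state left-to-right models (toy parameters, not from the paper).
LEFT_TO_RIGHT = [[0.7, 0.3], [0.0, 1.0]]
MODELS = {
    "word_a": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "word_b": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.1, 0.8], [0.1, 0.8, 0.1]]),
}

if __name__ == "__main__":
    print(recognize([0, 0, 1, 1], MODELS))
    print(recognize([2, 2, 1, 1], MODELS))
```

Scoring each word model separately and taking the maximum is the usual arrangement for isolated-word recognition. A system working directly on continuous MFCC vectors would typically replace the discrete emission table with Gaussian or Gaussian-mixture densities.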

Keywords

Speech Recognition; Filtering; Gaussian Filter; MFCC Features; HMM
