
Conventional Acoustic Features Based Gammachirp Filterbank for Text Independent Speaker Identification System in Noisy Environments




In this paper we study the performance of a novel feature extraction method that computes both MFCC and PLP features with a Gammachirp filterbank, within a state-of-the-art text-independent speaker identification (SID) system. The method is based on an auditory periphery model that mimics characteristics of the human auditory system: the Gammachirp filterbank imitates the frequency resolution of the cochlea, which is nonlinear and follows the equivalent rectangular bandwidth (ERB) scale. Our evaluations show that the proposed features perform considerably better than conventional acoustic features. We further demonstrate that the proposed features yield promising recognition performance in various noisy environments.
Copyright © 2015 Praise Worthy Prize - All rights reserved.


Speaker Identification; Gammachirp; MFCC; PLP; Feature Extraction; HMM/GMM
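
As a rough illustration of the filterbank idea described in the abstract, the sketch below spaces center frequencies uniformly on the ERB-rate scale and builds a gammachirp impulse response for each channel. It is not the authors' implementation. It assumes the widely used Glasberg and Moore ERB formula and the standard gammachirp form t^(n-1) exp(-2 pi b ERB(fc) t) cos(2 pi fc t + c ln t + phase). The function names and the values of the order, b, c, frequency range, and channel count are illustrative assumptions and are not taken from the paper.

```python
import math


def erb_bandwidth(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def erb_rate(f_hz):
    """Map a frequency in Hz to the ERB-rate scale."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def inverse_erb_rate(e):
    """Map an ERB-rate value back to a frequency in Hz."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37


def erb_center_frequencies(f_low_hz, f_high_hz, n_channels):
    """Center frequencies equally spaced on the ERB-rate scale (n_channels >= 2)."""
    lo, hi = erb_rate(f_low_hz), erb_rate(f_high_hz)
    step = (hi - lo) / (n_channels - 1)
    return [inverse_erb_rate(lo + i * step) for i in range(n_channels)]


def gammachirp_ir(fc_hz, fs_hz, duration_s=0.025, order=4, b=1.019, c=-2.0, phase=0.0):
    """Peak-normalized gammachirp impulse response.

    order, b, and c are illustrative defaults (assumptions), not the paper's values.
    """
    n_samples = int(round(duration_s * fs_hz))
    bw = erb_bandwidth(fc_hz)
    g = []
    # Start at t = 1/fs rather than 0 so that ln(t) is defined.
    for k in range(1, n_samples + 1):
        t = k / fs_hz
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * bw * t)
        carrier = math.cos(2.0 * math.pi * fc_hz * t + c * math.log(t) + phase)
        g.append(envelope * carrier)
    peak = max(abs(x) for x in g)
    return [x / peak for x in g]


if __name__ == "__main__":
    # Hypothetical setup: 8 channels between 100 Hz and 4 kHz at a 16 kHz sampling rate.
    centers = erb_center_frequencies(100.0, 4000.0, 8)
    bank = [gammachirp_ir(fc, 16000.0) for fc in centers]
    for fc, ir in zip(centers, bank):
        print(f"fc = {fc:8.1f} Hz, ERB = {erb_bandwidth(fc):6.1f} Hz, taps = {len(ir)}")
```

In a front end of this kind, each frame of speech would be filtered by every channel (or the channel magnitude responses applied to the frame's power spectrum), and the resulting channel energies would replace the mel or Bark filterbank outputs in the usual MFCC or PLP pipeline.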





