Edema and Nodule Pathological Voice Identification by SVM Classifier on Speech Signal

Asma Belhaj; Aicha Bouzid; Noureddine Ellouze

doi:10.15866/irecos.v10i5.6061

Corresponding author: Asma Belhaj

Abstract

This paper introduces two voicing parameters to describe the speech signal and studies their effect on the classification of disordered voices. The parameters are the fundamental frequency and the open quotient. The fundamental frequency is obtained from the pitch period of voiced speech, and the open quotient is defined as the ratio of the open-phase duration to the pitch period. The open phase and the pitch period are determined from the glottal closure instants (GCIs) and glottal opening instants (GOIs) detected by applying the multi-scale product method (MPM) to the speech signal. Classification is performed on two pathological voice databases, MAPACI and MEEI, using a multi-class one-against-all SVM classifier. We consider a three-category classification into edema, nodule and normal voices for female speakers. The effect of the voicing parameters is studied when they are added to the MFCC coefficients, the MFCC derivatives, and the energy.
Copyright © 2015 Praise Worthy Prize - All rights reserved.
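
To make the two voicing parameters concrete, the following minimal Python sketch computes a per-cycle fundamental frequency and open quotient from lists of GCIs and GOIs. It is an illustration only, not the authors' implementation: the function name `voicing_parameters` and the example instants are hypothetical, the GCI/GOI detection by the multi-scale product method is not shown, and the sketch assumes that each pitch period runs from one GCI to the next and that the open phase runs from the GOI inside that period to the closing GCI.

```python
from typing import List, Tuple


def voicing_parameters(gci: List[float], goi: List[float]) -> List[Tuple[float, float]]:
    """Return (f0_hz, open_quotient) for each glottal cycle.

    gci: sorted glottal closure instants, in seconds (assumed already detected)
    goi: sorted glottal opening instants, in seconds (assumed already detected)
    """
    results = []
    # Each pair of consecutive GCIs delimits one pitch period.
    for start, end in zip(gci[:-1], gci[1:]):
        period = end - start
        if period <= 0:
            continue  # ignore malformed (unsorted or duplicate) instants
        # Assumption: the first GOI inside the period marks the start of the open phase.
        openings = [t for t in goi if start < t < end]
        if not openings:
            continue  # no opening detected in this cycle, so no open quotient
        open_phase = end - openings[0]
        # Fundamental frequency is the inverse of the pitch period;
        # open quotient is the open-phase duration divided by the pitch period.
        results.append((1.0 / period, open_phase / period))
    return results


# Hypothetical instants for two cycles of a 5 ms pitch period.
cycles = voicing_parameters(gci=[0.000, 0.005, 0.010], goi=[0.002, 0.007])
for f0_hz, open_quotient in cycles:
    print(f"F0 = {f0_hz:.1f} Hz, OQ = {open_quotient:.2f}")
```

In a classification setting such as the one described in the abstract, these per-cycle values would typically be summarized per frame or per recording before being appended to the MFCC-based feature vector; how the authors perform that step is not stated here.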

Keywords

Pathological Voices; SVM; MFCC; Open Quotient
