Algorithm for Cepstral Analysis and Homomorphic Filtering for Glottal Feature Estimation in Speech Signals

Leonardo A. Gongora; Diego A. Rojas; Olga Lucia Ramos

doi:10.15866/iree.v10i4.6961

Corresponding author: Olga Lucia Ramos

Abstract

Cepstral analysis and homomorphic filtering are widely used tools for speech analysis. This paper presents an algorithm for the cepstral analysis of speech signals. Homomorphic speech processing and cepstrum estimation are performed using the fast Fourier transform (FFT), the logarithm of the spectral magnitude, and the inverse FFT (IFFT), which together yield the cepstral parameters. From these parameters, glottal speech features are calculated, with homomorphic filtering used to separate the source (glottal excitation) and filter (vocal tract response) components of the signal. The final part of the paper presents the signals obtained at each processing stage and analyzes the results produced by the algorithm.
Copyright © 2015 Praise Worthy Prize - All rights reserved.
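
The processing chain summarized in the abstract (FFT, logarithmic magnitude, IFFT, then homomorphic filtering) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses the real cepstrum rather than the complex cepstrum named in the keywords, and the function names `real_cepstrum`, `lifter`, and `estimate_pitch`, the lifter cutoff, and the 60 to 400 Hz pitch search range are all assumptions chosen for illustration.

```python
import numpy as np


def real_cepstrum(frame):
    """Real cepstrum of one frame: IFFT of the log magnitude of its FFT."""
    spectrum = np.fft.fft(frame)
    # A small floor (an assumption of this sketch) avoids log(0).
    log_magnitude = np.log(np.abs(spectrum) + 1e-12)
    return np.fft.ifft(log_magnitude).real


def lifter(cepstrum, cutoff):
    """Split a cepstrum into low- and high-quefrency parts.

    The low-quefrency part approximates the filter (vocal tract) and the
    high-quefrency part approximates the source (excitation).
    """
    n = len(cepstrum)
    low_window = np.zeros(n)
    low_window[:cutoff] = 1.0
    # The real cepstrum is symmetric, so keep the mirrored samples too.
    low_window[n - cutoff + 1:] = 1.0
    return cepstrum * low_window, cepstrum * (1.0 - low_window)


def estimate_pitch(cepstrum, fs, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency from the largest cepstral peak.

    The search is limited to quefrencies corresponding to fmin..fmax
    (hypothetical defaults, not values taken from the paper).
    """
    qmin = int(fs / fmax)
    qmax = int(fs / fmin)
    peak = qmin + np.argmax(cepstrum[qmin:qmax])
    return fs / peak
```

Given a windowed voiced frame `frame` sampled at `fs`, one would call `c = real_cepstrum(frame)`, then `low, high = lifter(c, 20)` and `estimate_pitch(c, fs)`. Applying `np.fft.fft(low).real` gives a smoothed log-magnitude envelope, which is one common way to inspect the filter component.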

Keywords

Complex Cepstrum; Homomorphic Speech Processing; Deconvolution; Glottal Features
