Hindi Syllable Segmentation Using ZCR and Dual Band Energy Ratio

Rubeena A. Khan; J. S. Chitode

doi:10.15866/irecap.v7i7.13614

Rubeena A. Khan^(1*), J. S. Chitode^(2)

^(*) Corresponding author

Abstract

This paper proposes a technique for segmenting Hindi words into syllables. Syllable boundaries are first computed from the zero-crossing rate (ZCR) of the speech signal, for words comprising syllables that end in consonants and in vowels. Because segmentation based on ZCR alone leaves room for improvement, the ZCR-computed boundaries are then refined: the signal is decomposed into low- and high-frequency components by wavelet decomposition, and the ratio of the high-frequency to the low-frequency energy of the decomposed signal is used together with the ZCR function to locate the syllable boundaries more accurately. The resulting segmentation accuracy is 96.02% for syllables ending with stop consonants and vowels.
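The two frame-level measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-level Haar wavelet, the frame length, the hop size, and all function names are assumptions chosen for illustration, and the rule that turns these features into boundaries is omitted because the abstract does not state it.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def haar_split(frame):
    """One-level Haar wavelet decomposition (assumed wavelet).

    Returns (low, high): approximation and detail coefficients,
    i.e. the low- and high-frequency halves of the band.
    A trailing odd sample is dropped.
    """
    s = math.sqrt(2.0)
    low, high = [], []
    for a, b in zip(frame[0::2], frame[1::2]):
        low.append((a + b) / s)
        high.append((a - b) / s)
    return low, high


def band_energy_ratio(frame, eps=1e-12):
    """Ratio of high-band energy to low-band energy for one frame."""
    low, high = haar_split(frame)
    e_low = sum(c * c for c in low)
    e_high = sum(c * c for c in high)
    return e_high / (e_low + eps)  # eps avoids division by zero


def frame_features(signal, frame_len=256, hop=128):
    """List of (start_index, zcr, high_to_low_ratio) per frame.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        feats.append(
            (start, zero_crossing_rate(frame), band_energy_ratio(frame))
        )
    return feats
```

On a vowel-like (slowly varying) frame both measures stay low, while on a noisy, fricative-like frame both rise, which is why a change in either contour is a plausible cue for a syllable boundary.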
Copyright © 2017 Praise Worthy Prize - All rights reserved.

Keywords

Speech Synthesis; Syllable Segmentation; STE; TTS; ZCR
