A Discrete Fractional Cosine Transform Based Speech Enhancement System Through Adaptive Kalman Filter Combined with Perceptual Weighting Filter with Pitch Synchronous Analysis

V. R. Balaji(1*), S. Subramanian(2)

(1) Assistant Professor, Department of ECE, Sri Krishna College of Engineering and Technology, Coimbatore, India
(2) Advisor, Coimbatore Institute of Engineering and Technology, Coimbatore, India
(*) Corresponding author


Abstract


Speech enhancement plays a vital role in noisy environments, where it is commonly used to improve speech recognition in mobile phones and car navigation systems. Recognition performance degrades when noise is present in the surroundings. The objective is to increase the perceived quality of the speech and to improve its clarity. Signal representation and enhancement in the cosine transform domain have been observed to give significant results. As an alternative to the DCT, this work uses a combination of the conventional Discrete Cosine Transform (DCT) and the Discrete Fourier Transform (DFT), which forms another transform called the Discrete Fractional Cosine Transform (DFrCT). The DFrCT has one free parameter, its fraction. To deal with frame-to-frame variations of the cosine transform, the DFrCT is integrated with Pitch Synchronous Analysis (PSA). The Pitch Synchronous OverLap and Add (PSOLA) method is also used to enhance the performance of PSA. Moreover, to improve the noise reduction of the system, an Improved Iterative Wiener Filtering approach, called the Adaptive Kalman Filter Combined with Perceptual Weighting Filter, is used. This filter eliminates matrix operations, which reduces both computation time and complexity. Thus, a novel DFrCT-based speech enhancement system using an improved iterative filtering algorithm integrated with PSA is proposed.
Copyright © 2013 Praise Worthy Prize - All rights reserved.
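
To make the role of the DFrCT's free parameter concrete, the following Python sketch builds one simple fractional cosine transform. It is an illustration under stated assumptions, not the construction used in the paper: it takes the orthonormal DCT-I matrix C (symmetric, with C·C = I, so its eigenvalues are +1 and -1) and defines the fractional power of order a as P_plus + exp(-i·pi·a)·P_minus, where P_plus = (I + C)/2 and P_minus = (I - C)/2. The function names dct1_matrix and fractional_dct1, and the choice of DCT-I as the base transform, are assumptions made for this example.

```python
import numpy as np


def dct1_matrix(n):
    """Orthonormal DCT-I matrix of size n x n (requires n >= 2).

    Assumption for this sketch: DCT-I is chosen because its orthonormal
    form is symmetric and is its own inverse, which makes a fractional
    power easy to write down.
    """
    m = n - 1
    k = np.arange(n)
    g = np.ones(n)
    g[[0, -1]] = 1.0 / np.sqrt(2.0)  # endpoint scaling for orthonormality
    return np.sqrt(2.0 / m) * np.outer(g, g) * np.cos(np.pi * np.outer(k, k) / m)


def fractional_dct1(n, a):
    """Fractional power of order a of the orthonormal DCT-I matrix.

    Hypothetical helper: C has eigenvalues +1 and -1, so C**a is built
    from the two spectral projectors, taking (-1)**a = exp(-1j*pi*a).
    """
    c = dct1_matrix(n)
    eye = np.eye(n)
    p_plus = (eye + c) / 2.0   # projector onto the +1 eigenspace
    p_minus = (eye - c) / 2.0  # projector onto the -1 eigenspace
    return p_plus + np.exp(-1j * np.pi * a) * p_minus


# Example: transform a short frame with fraction a = 0.5.
frame = np.array([1.0, 2.0, 0.5, -1.0, 0.0, 3.0, -2.0, 1.5])
coeffs = fractional_dct1(len(frame), 0.5) @ frame
```

With a = 0 the transform is the identity and with a = 1 it is the ordinary DCT-I; intermediate values of a interpolate between the two. Orders add, so applying a = 0.3 and then a = 0.4 gives the same result as a = 0.7, and the matrix is unitary for every real a.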

Keywords


Improved Iterative Wiener Filtering; Advanced Discrete Cosine Transform; Pitch Synchronous Analysis; Perceptual Evaluation of Speech Quality


