Optimization the Accuracy of FFNN Based Speaker Recognition System Using PSO Algorithm

Ashty Mahde Aaref; Zuhair Shakor Mahmood

doi:10.15866/irecap.v11i4.19883

Ashty Mahde Aaref^(1*), Zuhair Shakor Mahmood^(2)

^(*) Corresponding author

Abstract

Speaker recognition systems build a model of a speaker's voice by taking an audio recording as input and processing it. A speech signal is a time-varying signal whose frequencies change continuously. Because speech has many uncertain attributes, traditional speech recognition techniques such as zero crossings and the Fourier Transform are not up to the task. This work has two aims. The first is to address speaker identification technology that is resistant to noise. While most prior solutions have relied on modified mel frequency cepstrum coefficients (MFCC) together with a fundamental frequency feature coefficient, this proposal integrates both of these modifications with a new cepstrum component. To construct the feature matrix, feature extraction techniques are applied to two hundred and fifty speech samples fed to the system. The matrix is used to train each algorithm, which is then evaluated on a subset of the data (thirty percent of the total data in the feature matrix). Speaker recognition models with improved accuracy are developed by studying the algorithms intensively. Two metrics are computed for each algorithm: recognition accuracy and the time required to achieve that accuracy. Compared with previous research, the findings show that the Feed Forward Neural Network (FFNN) optimized with Particle Swarm Optimization (PSO) performs better. This model correctly identifies 96% of the inputs with less processing time. According to the findings, the PSO-based optimization is most likely responsible for the higher accuracy seen in speaker identification.
Copyright © 2021 Praise Worthy Prize - All rights reserved.

Keywords

PSO; FFNN; MSE; MFCC
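
The pipeline summarized in the abstract (a 250-sample feature matrix, thirty percent held out for evaluation, an FFNN whose weights are tuned by PSO, and accuracy plus time as the two metrics) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic Gaussian features stand in for the MFCC-based features, the network has one hidden layer of five units, the PSO hyperparameters (`inertia`, `c1`, `c2`, swarm size, iteration count) are arbitrary, and MSE is used as the fitness only because it appears among the keywords.

```python
import math
import random
import time

def make_synthetic_features(n_samples=250, n_features=4, n_speakers=2, seed=0):
    """Hypothetical stand-in for the feature matrix: one Gaussian cluster per speaker."""
    rng = random.Random(seed)
    data = []
    for i in range(n_samples):
        label = i % n_speakers
        row = [rng.gauss(2.0 * label, 1.0) for _ in range(n_features)]
        data.append((row, label))
    rng.shuffle(data)
    return data

def forward(weights, x, n_hidden, n_out):
    """One-hidden-layer feed-forward pass; weights is a flat list (bias last per unit)."""
    n_in, idx, hidden, out = len(x), 0, [], []
    for _ in range(n_hidden):
        s = weights[idx + n_in] + sum(weights[idx + j] * x[j] for j in range(n_in))
        hidden.append(math.tanh(s))
        idx += n_in + 1
    for _ in range(n_out):
        s = weights[idx + n_hidden] + sum(weights[idx + j] * hidden[j] for j in range(n_hidden))
        out.append(0.5 * (1.0 + math.tanh(0.5 * s)))  # logistic sigmoid, overflow-safe form
        idx += n_hidden + 1
    return out

def mse(weights, data, n_hidden, n_out):
    """Mean squared error against one-hot targets; used as the PSO fitness (assumption)."""
    total = 0.0
    for x, label in data:
        out = forward(weights, x, n_hidden, n_out)
        total += sum((out[k] - (1.0 if k == label else 0.0)) ** 2 for k in range(n_out))
    return total / (len(data) * n_out)

def accuracy(weights, data, n_hidden, n_out):
    """Fraction of samples whose largest output matches the true speaker label."""
    hits = 0
    for x, label in data:
        out = forward(weights, x, n_hidden, n_out)
        hits += out.index(max(out)) == label
    return hits / len(data)

def pso_train(train, n_in, n_hidden, n_out, n_particles=12, n_iter=30,
              inertia=0.7, c1=1.5, c2=1.5, seed=1):
    """Global-best PSO over the flat weight vector; returns best weights and cost history."""
    rng = random.Random(seed)
    dim = n_hidden * (n_in + 1) + n_out * (n_hidden + 1)
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [mse(p, train, n_hidden, n_out) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            cost = mse(pos[i], train, n_hidden, n_out)
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
        history.append(gbest_cost)
    return gbest, history

data = make_synthetic_features()
n_test = len(data) * 30 // 100            # thirty percent held out for evaluation
train, test = data[n_test:], data[:n_test]
start = time.perf_counter()
best_weights, history = pso_train(train, n_in=4, n_hidden=5, n_out=2)
elapsed = time.perf_counter() - start     # second metric: time to reach the accuracy
test_acc = accuracy(best_weights, test, n_hidden=5, n_out=2)
print(f"test accuracy: {test_acc:.2%}, training time: {elapsed:.2f} s")
```

The sketch reports the same two quantities the abstract names, recognition accuracy on the held-out data and the time spent reaching it. The numbers it prints describe the synthetic data only and say nothing about the 96% figure reported for the actual system.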
