Voice Detection Using Convolutional Neural Network

U. A. Vishniakou; B. H. Shaya

doi:10.35596/1729-7648-2023-21-2-114-120

Abstract

The article presents an approach, a methodology, and a software system based on machine learning technologies that uses a convolutional neural network for voice (cough) recognition. The aims of the article are to obtain and evaluate a deep learning voice detection system, built with a convolutional neural network and the Python language, for patients with cough. The convolutional neural network was developed, trained and tested using various datasets and Python libraries. Unlike existing work in this area, the proposed system was evaluated on a real set of environmental sound data, not only on filtered or separated voice audio tracks. Several volunteers recorded their voice sounds using the microphones of their smartphones, and the recordings were made in public places so that background noise was present; these were supplemented with audio files that had been uploaded online. The final compiled model achieved an average recognition accuracy of 85.37 %, with a minimum of 78.8 % and a maximum of 91.9 %. Thus, the system is able to detect the sound of a voice in a crowded public place without the sound separation pre-processing phase that other modern systems require.
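
The abstract does not give the network architecture, so the following is only a minimal sketch of the kind of forward pass a convolutional detector performs: convolution, ReLU, global max pooling, a dense layer and a sigmoid output. It is not the authors' model. The function names, the 1D waveform input, the kernel sizes, the random (untrained) parameters and the synthetic noise signal are all assumptions made for illustration; only the Python standard library is used.

```python
import math
import random


def conv1d_valid(signal, kernel):
    """'Valid' 1D cross-correlation, the core operation of a convolutional layer."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def sigmoid(x):
    """Map a real-valued logit to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def cough_probability(signal, kernels, weights, bias):
    """conv -> ReLU -> global max pooling -> dense layer -> sigmoid."""
    pooled = [max(relu(conv1d_valid(signal, k))) for k in kernels]
    logit = sum(w * p for w, p in zip(weights, pooled)) + bias
    return sigmoid(logit)


# Hypothetical, untrained parameters and synthetic noise stand in for
# a trained model and a real smartphone recording.
rng = random.Random(0)
signal = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
kernels = [[rng.gauss(0.0, 0.1) for _ in range(9)] for _ in range(4)]
weights = [rng.gauss(0.0, 1.0) for _ in range(4)]
probability = cough_probability(signal, kernels, weights, bias=0.0)
print(f"P(cough) = {probability:.3f}")
```

In a trained system the kernels and weights would be learned from labelled recordings, and the input would more commonly be a spectrogram processed by 2D convolutions; the 1D form is used here only to keep the sketch short.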

Keywords

voice detection, convolution neural network, machine learning-based dataset, audio files

About the Authors

U. A. Vishniakou

Belarusian State University of Informatics and Radioelectronics
Belarus

Vishniakou Uladzimir Anatolievich, Dr. of Sci. (Eng.), Professor at the Department of Infocommunication Technologies

220013, Minsk, P. Brovki St., 6

Tel.: +375 44 486-71-82

B. H. Shaya

Belarusian State University of Informatics and Radioelectronics
Belarus

Shaya Bahaa H., Postgraduate at the Department of Infocommunication Technologies

Minsk

For citations:

Vishniakou U.A., Shaya B.H. Voice Detection Using Convolutional Neural Network. Doklady BGUIR. 2023;21(2):114-120. https://doi.org/10.35596/1729-7648-2023-21-2-114-120

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 1729-7648 (Print)
ISSN 2708-0382 (Online)
