Doklady BGUIR

Environmental sound classification system

Abstract

This paper presents an environmental sound classification system and compares its performance on the ESC-10 dataset. The feature extraction method includes cochlea and auditory nerve models. The classification model uses classic convolutional neural network architectures. Experiments were conducted with different convolutional neural network architectures and the proposed feature extraction method. The model outperforms baseline implementations and achieves results comparable to other state-of-the-art approaches.

About the Author

I. N. Zhuk
Belarusian State University of Informatics and Radioelectronics
Belarus



For citations:


Zhuk I.N. Environmental sound classification system. Doklady BGUIR. 2018;(3):54-58. (In Russ.)

This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1729-7648 (Print)
ISSN 2708-0382 (Online)