Journal Article (Open Access)

Automatic Environmental Sound Recognition (AESR) Using Convolutional Neural Network

Authors

Md. Rayhan Ahmed, Towhidul Islam Robin, Ashfaq Ali Shafin

Author Affiliations

Stamford University Bangladesh

Published In

International Journal of Modern Education and Computer Science

Year

2020

Citations

40

DOI

10.5815/ijmecs.2020.05.04

Abstract

Automatic Environmental Sound Recognition (AESR) is an important research topic in pattern recognition. A short audio clip of a sound event can be converted into a spectrogram image and fed to a Convolutional Neural Network (CNN). Features extracted from that image are then used to classify environmental sound events such as sea waves, crackling fire, barking dogs, lightning, and rain. We used log-mel spectrogram features to train our six-layer stacked CNN model. We evaluated the model on three publicly available datasets and achieved an accuracy of 92.9% on the UrbanSound8K dataset, 91.7% on the ESC-10…
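
The audio-to-image step described in the abstract can be illustrated with a minimal NumPy sketch of a log-mel spectrogram. This is not the authors' code: the sample rate, FFT size, hop length, number of mel bands, and the function names are all assumptions chosen for illustration, and the abstract does not state the values used in the paper.

```python
import numpy as np

def hz_to_mel(hz):
    # Common (HTK-style) mel formula; other variants exist.
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale.

    Returns an array of shape (n_mels, n_fft // 2 + 1).
    """
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        left, center, right = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr=22050, n_fft=1024, hop=512, n_mels=60):
    """Return a (n_mels, n_frames) log-mel spectrogram in decibels.

    All default parameter values are illustrative assumptions.
    The signal must contain at least n_fft samples.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft//2 + 1)
    mel_power = power @ mel_filterbank(sr, n_fft, n_mels).T
    log_mel = 10.0 * np.log10(np.maximum(mel_power, 1e-10))  # floor avoids log(0)
    return log_mel.T

# Usage: one second of a synthetic 440 Hz tone stands in for a real recording.
sr = 22050
t = np.arange(sr) / sr
spec = log_mel_spectrogram(np.sin(2 * np.pi * 440.0 * t), sr=sr)
print(spec.shape)  # (60, 42)
```

The resulting 2-D array can be treated as a single-channel image, which is the form a CNN classifier would take as input. In practice a library such as librosa is often used for this step instead of a hand-written filterbank.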

Fields & Keywords

Physical Sciences, Computer Science, Signal Processing, Music and Audio Processing, Speech and Audio Processing, Animal Vocal Communication and Behavior, Speech recognition, Artificial intelligence, Linguistics