Journal Article
Unknown
A Deep Learning Approach to Speech Emotion Recognition in Bengali Code-Mixed Speech
Author Affiliations
Shahjalal University of Science and Technology, Bangladesh Air Force Shaheen College
Year
2025
Abstract
The expansion of multilingual and code-switched communication in South Asia necessitates Speech Emotion Recognition (SER) systems capable of handling complex acoustic-linguistic diversity. This study collected 2,500 real-world recordings, which serve as the foundation for a carefully developed SER framework for Bangla-English mixed speech. The audio data underwent extensive preprocessing, including standardization, vocal separation, noise reduction, spectrogram synthesis, and consistent temporal segmentation, which produced 14,562 samples suitable for computational modeling. Data augmentation was used to improve class-wise balance, and an expert-driven annotation methodology with a majority-agreement strategy was used to assign emotion labels across five categories. Using MFCC representations, a hybrid CNN-BiLSTM architecture was created to simultaneously capture long-range temporal relationships and fine-grained…
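The majority-agreement labeling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_label`, the strict "more than half" threshold, the handling of ties, and the example label strings are all assumptions made for illustration, since the abstract does not name the five emotion categories or specify the agreement rule.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of annotators.

    Hypothetical helper: returns None when no label is chosen by more
    than half of the annotators (an assumed rule, not from the paper).
    """
    if not votes:
        return None
    # most_common(1) gives the single most frequent (label, count) pair.
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


# Placeholder label strings; the actual category names are not given
# in the abstract.
clip_votes = ["happy", "happy", "sad"]
final_label = majority_label(clip_votes)
```

Under this assumed rule, a clip on which annotators split evenly gets no label and would need re-annotation or exclusion; how the study handled such cases is not stated in the abstract.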