ENHANCED MULTIMODAL EMOTION RECOGNITION SYSTEM WITH DEEP LEARNING AND HYBRID FUSION
| dc.contributor.author | Baidussenov, Alikhan | |
| dc.contributor.author | Khamiyev, Assylzhan | |
| dc.contributor.author | Sarsengaliyev, Damir | |
| dc.contributor.author | Rymkan, Alisher | |
| dc.contributor.author | Mukhametzhanov, Meiram | |
| dc.date.accessioned | 2025-06-13T10:08:52Z | |
| dc.date.available | 2025-06-13T10:08:52Z | |
| dc.date.issued | 2025-05-25 | |
| dc.description.abstract | This paper presents a multimodal emotion recognition (MER) system that combines deep learning and hybrid transformer fusion to analyze emotional expressions in raw user-uploaded videos. The system processes and fuses information from three modalities (video, audio, and text) to identify six primary emotions: happiness, sadness, anger, frustration, excited, and neutral. Leveraging transformers and pre-trained encoders such as RoBERTa and VGG, our architecture captures intra- and inter-modal dependencies to improve classification performance (an illustrative sketch of such a fusion head follows the file listing below). The MER model was integrated into a user-friendly web application, enabling real-time emotion inference via a React-based frontend and a Flask backend. Evaluated on the IEMOCAP dataset, the system achieved a validation weighted F1-score of 0.5128, with particularly strong recognition of expressive emotions such as sadness and anger. This work demonstrates the potential of multimodal deep learning approaches in emotion-aware systems and offers a scalable foundation for future applications in mental health monitoring, human-computer interaction, and affective computing. | |
| dc.identifier.citation | Rymkan, A., Sarsengaliyev, D., Khamiyev, A., Baidussenov, A., & Mukhametzhanov, M. (2024). Enhanced multimodal emotion recognition system with deep learning and hybrid fusion. Nazarbayev University School of Engineering and Digital Sciences. | |
| dc.identifier.uri | https://nur.nu.edu.kz/handle/123456789/8967 | |
| dc.language.iso | en | |
| dc.publisher | Nazarbayev University School of Engineering and Digital Sciences | |
| dc.rights | Attribution 3.0 United States | en |
| dc.rights.uri | http://creativecommons.org/licenses/by/3.0/us/ | |
| dc.subject | Multimodal Emotion Recognition | |
| dc.subject | Deep Learning | |
| dc.subject | Transformer Fusion | |
| dc.subject | RoBERTa | |
| dc.subject | VGG | |
| dc.subject | IEMOCAP Dataset | |
| dc.subject | PyTorch | |
| dc.subject | Whisper | |
| dc.subject | React.js | |
| dc.subject | Emotion Classification | |
| dc.subject | type of access: embargo | |
| dc.title | ENHANCED MULTIMODAL EMOTION RECOGNITION SYSTEM WITH DEEP LEARNING AND HYBRID FUSION | |
| dc.type | Bachelor's Capstone project | |
Files
Original bundle
- Name: Senior_Project_Report_nur.pdf
- Size: 2.54 MB
- Format: Adobe Portable Document Format
- Description: Bachelor's Capstone project
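
The abstract describes a hybrid transformer fusion of text, audio, and video features drawn from pre-trained encoders (RoBERTa for text, VGG for video frames). The PyTorch snippet below is a minimal sketch of what such a fusion head could look like, not the authors' implementation: the `HybridFusionClassifier` name, the feature dimensions, the shared projection size, and the pooling strategy are illustrative assumptions, and each modality encoder is assumed to have already produced a pooled feature vector per clip.

```python
# Illustrative sketch only: dimensions, module names, and fusion layout
# are assumptions, not the architecture from the report.
import torch
import torch.nn as nn

EMOTIONS = ["happiness", "sadness", "anger", "frustration", "excited", "neutral"]

class HybridFusionClassifier(nn.Module):
    """Fuses per-modality embeddings (text, audio, video) with a small
    transformer encoder, then classifies into six emotion categories."""

    def __init__(self, text_dim=768, audio_dim=512, video_dim=4096,
                 d_model=256, n_heads=4, n_layers=2, n_classes=len(EMOTIONS)):
        super().__init__()
        # Project each modality into a shared embedding space.
        self.proj = nn.ModuleDict({
            "text": nn.Linear(text_dim, d_model),    # e.g. RoBERTa pooled features
            "audio": nn.Linear(audio_dim, d_model),  # e.g. audio-CNN features
            "video": nn.Linear(video_dim, d_model),  # e.g. VGG frame features
        })
        # Learned embeddings marking which fused token belongs to which modality.
        self.modality_emb = nn.Parameter(torch.randn(3, d_model) * 0.02)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, text_feat, audio_feat, video_feat):
        # Each input: (batch, modality_dim) pooled features from its encoder.
        tokens = torch.stack([
            self.proj["text"](text_feat),
            self.proj["audio"](audio_feat),
            self.proj["video"](video_feat),
        ], dim=1) + self.modality_emb          # (batch, 3, d_model)
        fused = self.fusion(tokens)            # cross-modal self-attention
        pooled = fused.mean(dim=1)             # average over modality tokens
        return self.classifier(pooled)         # (batch, 6) emotion logits

if __name__ == "__main__":
    model = HybridFusionClassifier()
    logits = model(torch.randn(2, 768), torch.randn(2, 512), torch.randn(2, 4096))
    print(logits.shape)  # torch.Size([2, 6])
```

Treating each modality as one token lets the transformer's self-attention capture inter-modal dependencies, while the per-modality projections handle intra-modal adaptation. In the reported system, such a classifier would sit behind the Flask backend, with Whisper (listed among the keywords) presumably supplying the transcripts that feed the text branch.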