DSpace Repository

MULTIMODAL EMOTION RECOGNITION WITH DEEP LEARNING AND FUSION MECHANISM


dc.contributor.author Myrzakhmet, Ayan
dc.contributor.author Kanafin, Ali
dc.contributor.author Kuanysh, Zhaksylyk
dc.date.accessioned 2024-06-15T06:27:15Z
dc.date.available 2024-06-15T06:27:15Z
dc.date.issued 2024-04-21
dc.identifier.citation Myrzakhmet, A., Kanafin, A. & Kuanysh, Z. (2024). Multimodal Emotion Recognition with Deep Learning and Fusion Mechanism. Nazarbayev University School of Engineering and Digital Sciences en_US
dc.identifier.uri http://nur.nu.edu.kz/handle/123456789/7872
dc.description.abstract The increasing interest in multimodal techniques for emotion recognition has led to the incorporation of both physiological and non-physiological data. This study aims to develop an advanced emotion recognition model that combines electroencephalography (EEG), video, and audio data using a hybrid fusion strategy to enhance system performance. We investigated various fusion methods, including diverse algorithmic strategies and intermediate attention, which significantly boosted emotion recognition accuracy. Our comprehensive audio-video-EEG model, equipped with intermediate attention and hybrid multimodal fusion techniques, was successfully tested on the RAVDESS dataset and further tailored to a new dataset collected at Nazarbayev University's Multimedia Lab, which includes video, audio speech, and EEG data. This thorough integration effectively tackled multimodal signal fusion challenges, improving both model sophistication and accuracy. Experimental findings indicate that integrating physiological and non-physiological data substantially enhances the accuracy of emotion detection systems. On our tailored dataset, the top-performing model achieved an accuracy of 68% for 21-second inputs and 57% for 3.6-second inputs across five categories. Despite challenges such as variable temporal frames and data segmentation, the results highlight the effectiveness of multimodal fusion in advancing emotion recognition. This research not only paves the way for future advancements but also emphasizes the need for ongoing optimization and further exploration of signal integration techniques. en_US
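The abstract does not specify the fusion architecture in detail; purely as an illustration of the attention-weighted fusion idea it describes, the sketch below combines three equal-length modality embeddings (EEG, audio, video) by softmax-weighted summation. All names are hypothetical, and the query is taken as the mean embedding for simplicity, whereas in a trained model it would be a learned parameter.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fusion(modality_feats):
    """Fuse equal-length per-modality feature vectors (e.g. EEG, audio,
    video embeddings) by attention-weighted summation.

    Assumes each modality has already been projected to a common
    dimension by its own encoder (not shown here)."""
    dim = len(modality_feats[0])
    # Query vector: mean of the modality embeddings (a simplification).
    query = [sum(f[i] for f in modality_feats) / len(modality_feats)
             for i in range(dim)]
    # Score each modality by its dot product with the query.
    scores = [sum(f[i] * query[i] for i in range(dim))
              for f in modality_feats]
    weights = softmax(scores)
    # Fused representation: weighted sum of the modality embeddings.
    fused = [sum(w * f[i] for w, f in zip(weights, modality_feats))
             for i in range(dim)]
    return fused, weights

# Toy example with 4-dimensional embeddings for three modalities.
eeg   = [0.2, -0.1, 0.5, 0.3]
audio = [0.4,  0.0, 0.1, -0.2]
video = [-0.3, 0.6, 0.2, 0.1]
fused, weights = attention_fusion([eeg, audio, video])
```

The fused vector would then feed a downstream classifier head; the attention weights indicate how much each modality contributes to a given prediction.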
dc.language.iso en en_US
dc.publisher Nazarbayev University School of Engineering and Digital Sciences en_US
dc.rights Attribution-NonCommercial-ShareAlike 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/us/ *
dc.subject Type of access: Open access en_US
dc.title MULTIMODAL EMOTION RECOGNITION WITH DEEP LEARNING AND FUSION MECHANISM en_US
dc.type Bachelor's thesis en_US
workflow.import.source science

