MULTIMODAL MACHINE LEARNING FOR EMOTION RECOGNITION
Publisher
Nazarbayev University School of Engineering and Digital Sciences
Abstract
Emotion recognition has become a popular research area in recent years due to its abundance of useful applications. The technology has been used in a variety of areas, including social media, crowd monitoring, live streaming, and human-robot interaction. Recent approaches to emotion recognition have used neural networks such as transformers, LSTMs, and convolutional neural networks, often in multimodal classification setups. This research has been facilitated by publicly available datasets containing videos of people labeled with the dominant emotion of each scene. In this work, a multimodal technique is used to classify such scenes by emotional expression, based on extracted video frames, audio, and transcribed text.
We investigated ways to improve performance and efficiency at each stage of the classification process, focusing on developing and refining the preprocessing stage for each input modality. This approach achieves 89% accuracy on a commonly used dataset using a combination of video, audio, and text.
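The abstract describes a late-fusion pipeline over three modalities but the thesis text itself is not reproduced here, so the following is only a minimal sketch of that kind of design: each modality is encoded separately, the embeddings are concatenated, and a small classifier head predicts the emotion label. The encoder placeholders, feature dimensions, and seven-class label set are illustrative assumptions, not details taken from the work.

# Minimal multimodal late-fusion sketch (illustrative only, not the thesis implementation).
# Real encoders would be, e.g., a CNN over video frames, an audio/spectrogram network,
# and a text transformer over the transcript; here they are simple placeholder layers.
import torch
import torch.nn as nn

class MultimodalEmotionClassifier(nn.Module):
    def __init__(self, num_emotions=7, dim=128):
        super().__init__()
        # Placeholder per-modality encoders mapping each input to a fixed-size embedding.
        self.video_encoder = nn.Sequential(nn.Flatten(), nn.LazyLinear(dim), nn.ReLU())
        self.audio_encoder = nn.Sequential(nn.Flatten(), nn.LazyLinear(dim), nn.ReLU())
        self.text_encoder = nn.Sequential(nn.Flatten(), nn.LazyLinear(dim), nn.ReLU())
        # Fusion head over the concatenated modality embeddings.
        self.classifier = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, num_emotions)
        )

    def forward(self, video, audio, text):
        # Late fusion: concatenate the three modality embeddings, then classify.
        fused = torch.cat(
            [self.video_encoder(video), self.audio_encoder(audio), self.text_encoder(text)],
            dim=-1,
        )
        return self.classifier(fused)  # unnormalized emotion-class scores (logits)

# Example with random stand-in features for a batch of 4 clips.
model = MultimodalEmotionClassifier()
logits = model(
    video=torch.randn(4, 16, 512),  # e.g. 16 frame embeddings per clip
    audio=torch.randn(4, 256),      # e.g. pooled audio features
    text=torch.randn(4, 768),       # e.g. sentence embedding of the transcript
)
print(logits.argmax(dim=-1))        # predicted emotion index per clip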
Citation
Kazikhan, M. (2025). Multimodal Machine Learning for Emotion Recognition. Nazarbayev University School of Engineering and Digital Sciences.
Creative Commons license
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 3.0 United States.
