SPEECH EMOTION RECOGNITION USING DEEP NEURAL NETWORKS

Mustazhapov, Raiymbek

dc.contributor.author	Mustazhapov, Raiymbek
dc.date.accessioned	2021-07-28T10:13:26Z
dc.date.available	2021-07-28T10:13:26Z
dc.date.issued	2021-07
dc.identifier.citation	Mustazhapov, R. (2021). Speech Emotion Recognition using Deep Neural Networks (Unpublished master's thesis). Nazarbayev University, Nur-Sultan, Kazakhstan	en_US
dc.identifier.uri	http://nur.nu.edu.kz/handle/123456789/5616
dc.description.abstract	Interest in speech emotion recognition (SER), a particular case of the broader problem of multimedia pattern recognition, is growing. SER is expected to make communication between humans and artificial intelligence more efficient by giving the machine emotional context. The field has developed quickly as deep learning techniques have emerged and become more accessible. This potential benefit and the new techniques have drawn the attention of many specialists and produced a large number of research papers proposing diverse, intricate methods. One such approach, which involves various data augmentation techniques, has demonstrated high performance in this field. This work analyzes several simple augmentation methods in an attempt to improve existing models. In particular, it focuses on state-of-the-art CNN models for the RAVDESS, EMO-DB, and IEMOCAP datasets, and uses temporal, spatial, and spectral transformations of sound as the underlying augmentation method. With these simple augmentations, we achieved an increase in performance for the IEMOCAP model and, for the other datasets, positive effects comparable to those of more complex augmentation methods.	en_US
dc.language.iso	en	en_US
dc.publisher	Nazarbayev University School of Engineering and Digital Sciences	en_US
dc.rights	Attribution-NonCommercial-ShareAlike 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/3.0/us/	*
dc.subject	SER	en_US
dc.subject	speech emotion recognition	en_US
dc.subject	Deep Neural Networks	en_US
dc.subject	DNN	en_US
dc.subject	AI	en_US
dc.subject	artificial intelligence	en_US
dc.subject	Type of access: Open Access	en_US
dc.title	SPEECH EMOTION RECOGNITION USING DEEP NEURAL NETWORKS	en_US
dc.type	Master's thesis	en_US
workflow.import.source	science