SPEECH EMOTION RECOGNITION USING DEEP NEURAL NETWORKS

dc.contributor.author: Mustazhapov, Raiymbek
dc.date.accessioned: 2021-07-28T10:13:26Z
dc.date.available: 2021-07-28T10:13:26Z
dc.date.issued: 2021-07
dc.description.abstract: There is growing interest in speech emotion recognition (SER), a particular case of the broader problem of multimedia pattern recognition. SER is considered capable of enhancing the efficiency of communication between humans and artificial intelligence by providing an emotional context to the machine. The field has developed quickly with the recent emergence and increasing accessibility of deep learning techniques. This potentially critical benefit and these novel techniques have drawn the attention of many specialists in the field and generated a large number of research papers proposing diverse, intricate methods. One such approach, involving various data augmentation techniques, has demonstrated high performance in this field. This paper analyzes various simple augmentation methods in an attempt to improve existing models. In particular, this research focuses on state-of-the-art CNN models for the RAVDESS, EMO-DB, and IEMOCAP datasets, and exploits temporal, spatial, and spectral transformations of sound as the underlying method for augmentation. As a result of exploiting simple augmentations, we achieved an increase in performance for the IEMOCAP model and positive effects comparable to those of their complex counterparts for the other datasets. (en_US)
dc.identifier.citation: Mustazhapov, R. (2021). Speech Emotion Recognition using Deep Neural Networks (Unpublished master's thesis). Nazarbayev University, Nur-Sultan, Kazakhstan (en_US)
dc.identifier.uri: http://nur.nu.edu.kz/handle/123456789/5616
dc.language.iso: en (en_US)
dc.publisher: Nazarbayev University School of Engineering and Digital Sciences (en_US)
dc.rights: Attribution-NonCommercial-ShareAlike 3.0 United States (*)
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/3.0/us/ (*)
dc.subject: SER (en_US)
dc.subject: speech emotion recognition (en_US)
dc.subject: Deep Neural Networks (en_US)
dc.subject: DNN (en_US)
dc.subject: AI (en_US)
dc.subject: artificial intelligence (en_US)
dc.subject: Type of access: Open Access (en_US)
dc.title: SPEECH EMOTION RECOGNITION USING DEEP NEURAL NETWORKS (en_US)
dc.type: Master's thesis (en_US)
workflow.import.source: science

Files

Original bundle

Name: Thesis - Raiymbek Mustazhapov.pdf
Size: 2.7 MB
Format: Adobe Portable Document Format
Description: Thesis

Name: Presentation - Raiymbek Mustazhapov.pptx
Size: 3.14 MB
Format: Microsoft Powerpoint XML
Description: Presentation