DSpace Repository

REAL-TIME SPEECH EMOTION RECOGNITION (RSER) IN WIRELESS MULTIMEDIA SENSOR NETWORKS


Show simple item record

dc.contributor.author Zhilibayev, Serik
dc.date.accessioned 2021-08-02T10:17:48Z
dc.date.available 2021-08-02T10:17:48Z
dc.date.issued 2021-08
dc.identifier.citation "Zhilibayev, S. (2021). Real-time Speech Emotion Recognition (RSER) in Wireless Multimedia Sensor Networks (Unpublished master's thesis). Nazarbayev University, Nur-Sultan, Kazakhstan" en_US
dc.identifier.uri http://nur.nu.edu.kz/handle/123456789/5642
dc.description.abstract Recently, Wireless Multimedia Sensor Networks (WMSN) have been used extensively, and huge amounts of data are generated on a daily basis. Many processes have to be monitored in real time, so raw data must be preprocessed, analyzed quickly, and stored on the edge. Edge computation decentralizes the environment, making it highly responsive, low-cost, scalable, and secure. WMSN and edge computing are important in areas such as healthcare, where the subject has to be monitored and analyzed continuously. In this work, we propose a healthcare system for monitoring human emotion using speech in real time (RSER). Firstly, this project aims to analyze state-of-the-art SER approaches with respect to time and the ability to work on constrained devices. Secondly, a new approach based on time analysis will be provided. Exploratory Data Analysis will be performed on the multiple datasets used for training: the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Berlin (EMO-DB), and IEMOCAP. Vocal-tract spectrum features and low-level acoustic features (pitch and energy) will be extracted. Models will be trained and evaluated using deep learning and machine learning algorithms, and the algorithms will be prioritized by their time, energy, and accuracy metrics. The experiment will then be tested and evaluated on an embedded device (Raspberry Pi). Finally, a modified model based on the algorithm analysis will be tested in three scenarios: processing on the edge, processing at the sink, and streaming. en_US
dc.language.iso en en_US
dc.publisher Nazarbayev University School of Engineering and Digital Sciences en_US
dc.rights Attribution-NonCommercial-ShareAlike 3.0 United States
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/us/
dc.subject RSER en_US
dc.subject Wireless Multimedia Sensor Networks en_US
dc.subject WMSN en_US
dc.subject Ryerson Audio-Visual Database of Emotional Speech and Song en_US
dc.subject Real-time Speech Emotion Recognition en_US
dc.subject Type of access: Open Access en_US
dc.subject Research Subject Categories::TECHNOLOGY en_US
dc.title REAL-TIME SPEECH EMOTION RECOGNITION (RSER) IN WIRELESS MULTIMEDIA SENSOR NETWORKS en_US
dc.type Master's thesis en_US
workflow.import.source science
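
The abstract above describes extracting vocal-tract spectrum features together with low-level acoustic features (pitch and energy). As an illustrative sketch only, not code from the thesis, the Python below shows one common way to compute such utterance-level features with librosa; the file name, sample rate, pitch range, and MFCC count are all assumptions.

    import numpy as np
    import librosa

    def extract_features(wav_path: str, sr: int = 16000) -> np.ndarray:
        """Return one fixed-length feature vector per utterance."""
        y, sr = librosa.load(wav_path, sr=sr)

        # Vocal-tract spectrum features: 13 MFCCs per frame.
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

        # Low-level acoustic features: fundamental frequency (pitch)
        # via the YIN estimator, and frame-wise RMS energy.
        f0 = librosa.yin(y, fmin=65, fmax=500, sr=sr)
        energy = librosa.feature.rms(y=y)[0]

        # Summarize frame-level features with mean and standard deviation,
        # a common baseline representation in speech emotion recognition.
        return np.concatenate([
            mfcc.mean(axis=1), mfcc.std(axis=1),
            [f0.mean(), f0.std()],
            [energy.mean(), energy.std()],
        ])

    features = extract_features("sample_utterance.wav")  # hypothetical file
    print(features.shape)  # (30,) = 13+13 MFCC stats + 2 pitch + 2 energy

Summarizing frame-level features with simple statistics keeps the feature vector small, which matters for the constrained edge devices (e.g. a Raspberry Pi) that the abstract targets.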



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 3.0 United States.