Abstract:
Recently, Wireless Multimedia Sensor Networks (WMSNs) have been used extensively, and huge
amounts of data are generated daily. Many processes must be monitored in real
time, so raw data needs to be preprocessed, analyzed quickly, and stored on the
edge. Edge computation decentralizes the environment, which makes it highly
responsive, low-cost, scalable, and secure. WMSNs and edge computing are important
in areas such as healthcare, where the subject has to be monitored and analyzed
continuously. In this work, we propose a healthcare system for monitoring human
emotion through real-time speech emotion recognition (RSER).
Firstly, this project aims to analyze state-of-the-art SER approaches with respect
to execution time and their ability to run on constrained devices. Secondly, a new
approach based on this timing analysis will be proposed. Exploratory data analysis
will be performed on multiple training datasets, including the Ryerson Audio-Visual
Database of Emotional Speech and Song (RAVDESS), the Berlin Database of Emotional
Speech (EMO-DB), and IEMOCAP. Vocal-tract spectrum features and low-level acoustic
features (pitch and energy) will be extracted. Models will be trained and evaluated
with deep learning and machine learning algorithms, and the algorithms will be
ranked by their time, energy, and accuracy metrics. The experiment will then be
tested and evaluated on an embedded device (Raspberry Pi). Finally, a modified
model based on the algorithm analysis will be tested in three scenarios: processing
on the edge, processing at the sink, and streaming.