POSE2ACT: TRANSFORMER-BASED 3D POSE ESTIMATION AND GRAPH CONVOLUTION NETWORKS FOR HUMAN ACTIVITY RECOGNITION

dc.contributor.author: Aimyshev, Dias
dc.date.accessioned: 2023-05-24T10:04:07Z
dc.date.available: 2023-05-24T10:04:07Z
dc.date.issued: 2023
dc.description.abstract: The rise of deep learning has brought significant attention to two tasks in computer vision: pose estimation and human activity recognition. While human activity recognition has various applications in IoT systems, pose estimation is critical for motion tracking and prediction in virtual and augmented realities, robotics, and other fields. Despite being distinct tasks, they are closely linked, and this study focuses on merging pose estimation, which generates body joint coordinates, and skeleton-based activity recognition, which operates on the given joints. The study uses a visual transformer for 3D pose estimation, viewing joints as spatial features and neighboring frames as temporal features. Meanwhile, graph convolution networks are used for activity recognition based on a 3D skeleton, an approach that has produced state-of-the-art results. However, these outcomes are based on 3D coordinates generated by motion capture systems and have limitations in their applicability and robustness. To overcome these limitations, the two models are merged into a single End2End network. The proposed approach is enhanced by applying various data transformations, modifications, pre-training, and fine-tuning of different architecture components. The research achieves a 90.3% activity recognition cross-subject accuracy score on the NTU RGB+D test dataset, comparable to the state-of-the-art using generated 3D input, and outperforms other models using 2D input by predicting 3D coordinates in the process.
dc.identifier.citation: Aimyshev, D. (2023). Pose2Act: Transformer-based 3D Pose Estimation and Graph Convolution Networks for Human Activity Recognition. School of Engineering and Digital Sciences
dc.identifier.uri: http://nur.nu.edu.kz/handle/123456789/7069
dc.language.iso: en
dc.publisher: School of Engineering and Digital Sciences
dc.rights: Attribution-NonCommercial-ShareAlike 3.0 United States
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/3.0/us/
dc.subject: Type of access: Embargo
dc.subject: pose estimation
dc.subject: human activity recognition
dc.subject: 3D skeleton
dc.title: POSE2ACT: TRANSFORMER-BASED 3D POSE ESTIMATION AND GRAPH CONVOLUTION NETWORKS FOR HUMAN ACTIVITY RECOGNITION
dc.type: Master's thesis
workflow.import.source: science
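The abstract above outlines the Pose2Act pipeline: a transformer lifts 2D keypoint sequences to 3D joint coordinates, and a graph convolution network classifies actions from the predicted skeleton, with the two models trained as a single End2End network. The following is a minimal illustrative sketch of such a pipeline, not the thesis implementation: it assumes PyTorch, uses a uniform placeholder adjacency matrix instead of the NTU RGB+D body-joint graph, and picks arbitrary layer sizes; only the joint count (25) and class count (60) follow the NTU RGB+D dataset.

# Minimal sketch (not the thesis code): a 2D-to-3D pose lifter chained with a
# skeleton GCN classifier. Layer sizes and the uniform adjacency are assumptions.
import torch
import torch.nn as nn


class PoseLifter(nn.Module):
    """Transformer that lifts 2D joint sequences to 3D coordinates."""

    def __init__(self, num_joints=25, dim=128, depth=4, heads=4):
        super().__init__()
        self.embed = nn.Linear(num_joints * 2, dim)          # per-frame spatial embedding
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 2, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, depth)  # attention over frames
        self.head = nn.Linear(dim, num_joints * 3)           # regress 3D joints

    def forward(self, x):                 # x: (B, T, J, 2) 2D keypoints
        b, t, j, _ = x.shape
        h = self.embed(x.reshape(b, t, j * 2))
        h = self.temporal(h)
        return self.head(h).reshape(b, t, j, 3)              # (B, T, J, 3)


class SkeletonGCN(nn.Module):
    """Graph-convolution classifier over predicted 3D skeletons."""

    def __init__(self, num_joints=25, num_classes=60, dim=64):
        super().__init__()
        # Placeholder adjacency; a real model would use the body-joint graph.
        self.register_buffer("adj", torch.ones(num_joints, num_joints) / num_joints)
        self.gc1 = nn.Linear(3, dim)
        self.gc2 = nn.Linear(dim, dim)
        self.cls = nn.Linear(dim, num_classes)

    def forward(self, x):                 # x: (B, T, J, 3)
        h = torch.relu(self.gc1(torch.matmul(self.adj, x)))  # neighborhood mixing
        h = torch.relu(self.gc2(torch.matmul(self.adj, h)))
        return self.cls(h.mean(dim=(1, 2)))                  # pool over frames and joints


class Pose2Act(nn.Module):
    """End-to-end: 2D keypoints -> 3D pose -> action logits."""

    def __init__(self):
        super().__init__()
        self.lifter = PoseLifter()
        self.recognizer = SkeletonGCN()

    def forward(self, keypoints_2d):
        pose_3d = self.lifter(keypoints_2d)
        return self.recognizer(pose_3d), pose_3d


if __name__ == "__main__":
    model = Pose2Act()
    logits, pose = model(torch.randn(2, 30, 25, 2))  # 2 clips, 30 frames, 25 joints
    print(logits.shape, pose.shape)                  # (2, 60) and (2, 30, 25, 3)

Chaining the two modules lets the recognition loss propagate through the pose lifter, which is what distinguishes this setup from pipelines that train the GCN on motion-capture 3D coordinates alone.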

Files

Original bundle

Name: Dias Aimyshev Computer Science Thesis Manuscript.pdf - Dias Aimyshev.pdf
Size: 1.89 MB
Format: Adobe Portable Document Format
Description: thesis

Name: Dias Aimyshev Computer Science Thesis Presentation - Dias Aimyshev.pptx
Size: 10.55 MB
Format: Microsoft PowerPoint XML
Description: presentation
License bundle

Name: license.txt
Size: 6.28 KB
Format: Item-specific license agreed upon to submission