AN EXPLORATION OF VIDEO TRANSFORMERS FOR FEW-SHOT ACTION RECOGNITION

dc.contributor.author: Aikyn, Nartay
dc.date.accessioned: 2024-05-24T11:46:16Z
dc.date.available: 2024-05-24T11:46:16Z
dc.date.issued: 2024-04-29
dc.description.abstract: Action recognition is an essential task in computer vision with many applications in various fields. However, recognizing actions in videos using few examples, often referred to as Few-Shot Learning (FSL), is a challenging problem due to the high dimensionality and temporal complexity of video data. This work aims to address this problem by proposing a novel meta-learning framework that integrates Video Transformer as the feature backbone. Video Transformer can capture long-range dependencies and model temporal relationships effectively, thus enriching the global representation. Extensive experiments on benchmark datasets demonstrate that our approach achieves remarkable performance, surpassing baseline models and obtaining competitive results compared to state-of-the-art models. Additionally, we investigate the impact of supervised and self-supervised learning on video representation and evaluate the transferability of the learned representations in cross-domain scenarios. Our approach suggests a promising direction for exploring the combination of meta-learning with Video Transformer in the context of few-shot learning tasks, potentially contributing to the field of action recognition in various domains.
dc.identifier.citation: Aikyn, N. (2024). An Exploration of Video Transformers for Few-Shot Action Recognition (thesis). Nazarbayev University School of Engineering and Digital Sciences
dc.identifier.uri: http://nur.nu.edu.kz/handle/123456789/7713
dc.language.iso: en
dc.publisher: Nazarbayev University School of Engineering and Digital Sciences
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 United States
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/us/
dc.subject: type of access: restricted access
dc.subject: deep learning
dc.subject: human action recognition
dc.subject: few-shot learning
dc.subject: video transformer
dc.subject: self-supervised learning
dc.subject: cross-domain experiment
dc.title: AN EXPLORATION OF VIDEO TRANSFORMERS FOR FEW-SHOT ACTION RECOGNITION
dc.type: Master's thesis
workflow.import.source: science

Files

Original bundle

Name: Thesis_Nartay_Aikyn.pdf
Size: 1.71 MB
Format: Adobe Portable Document Format
Description: Thesis main article