DSpace Repository

MALWARE CLASSIFICATION OF DECOMPILED WINDOWS EXECUTABLES USING TRANSFER LEARNING TECHNIQUES

dc.contributor.author Dyussekeyev, Askar
dc.date.accessioned 2022-06-10T04:47:51Z
dc.date.available 2022-06-10T04:47:51Z
dc.date.issued 2022-05
dc.identifier.citation Dyussekeyev, A. (2022). Malware Classification of Decompiled Windows Executables using Transfer Learning Techniques (Unpublished master's thesis). Nazarbayev University, Nur-Sultan, Kazakhstan en_US
dc.identifier.uri http://nur.nu.edu.kz/handle/123456789/6202
dc.description.abstract Malicious software is recognized as a threat at both the individual and national levels because it endangers critical infrastructure such as energy systems and communication networks, which are increasingly subject to probes and attacks. The high variability and creativity of malware strategies and anti-detection techniques significantly complicate detection by traditional methods. Machine learning has previously proven to be an effective method for malware classification. We propose using transfer learning to employ promising algorithms from other problem domains and test their efficacy on the problem of malware detection. In this paper, we consider two approaches that take pre-trained models from the domains of Computer Vision (CV) and Natural Language Processing (NLP) and apply them to a malware classification problem. The industry-leading decompiler IDA Pro is used to convert binary samples to decompiled code. Then, the text classification model CodeBERT is applied to classify various malware families. The primary difference between this design and previous papers is the introduction of the decompilation stage, which allows the application of techniques from the text processing domain. For the Computer Vision method, each word of the decompiled code is encoded into a 3-byte form, and the resulting byte sequence is transformed into an RGB image. Then, we apply fine-tuned variations of the state-of-the-art model for Computer Vision. The main contributions of this paper are introducing the decompilation stage for Windows binaries and assessing current state-of-the-art models from the NLP and CV domains for malware classification. The last section compares the baseline with the techniques implemented during this research. Moreover, the opportunities and limitations of the presented approaches are discussed, along with implications and proposals for further research. en_US
dc.language.iso en en_US
dc.publisher Nazarbayev University School of Engineering and Digital Sciences en_US
dc.rights Attribution-NonCommercial-ShareAlike 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/us/ *
dc.subject Type of access: Gated Access en_US
dc.subject Natural Language Processing en_US
dc.subject Computer Vision en_US
dc.subject Research Subject Categories::TECHNOLOGY en_US
dc.subject CV en_US
dc.subject CodeBERT en_US
dc.subject IDA Pro en_US
dc.subject Transfer Learning Techniques en_US
dc.subject Malware en_US
dc.title MALWARE CLASSIFICATION OF DECOMPILED WINDOWS EXECUTABLES USING TRANSFER LEARNING TECHNIQUES en_US
dc.type Master's thesis en_US
workflow.import.source science
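
The abstract describes encoding each word of decompiled code into a 3-byte form and turning the byte sequence into an RGB image, but the record does not say how the 3 bytes are derived. The sketch below illustrates the general idea only, under assumptions that are not from the thesis: the tokenizer regex, the use of the first three bytes of a SHA-256 digest as the per-word code, the near-square image layout, and the black padding pixels are all hypothetical choices, as are the function names `token_to_rgb` and `code_to_rgb_image`.

```python
import hashlib
import math
import re
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def token_to_rgb(token: str) -> Pixel:
    """Map one word of decompiled code to 3 bytes (one RGB pixel).

    Assumption: the first 3 bytes of the token's SHA-256 digest stand in
    for the thesis's unspecified 3-byte encoding.
    """
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return (digest[0], digest[1], digest[2])


def code_to_rgb_image(code: str) -> List[List[Pixel]]:
    """Turn decompiled source text into a near-square grid of RGB pixels."""
    # Assumption: identifiers, integer literals, and single punctuation
    # characters each count as one "word".
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", code)
    pixels = [token_to_rgb(t) for t in tokens]
    if not pixels:
        return []
    # Width is ceil(sqrt(n)), so the image is roughly square.
    width = math.isqrt(len(pixels) - 1) + 1
    # Pad the last row with black pixels so every row has the same length.
    padding = (-len(pixels)) % width
    pixels.extend([(0, 0, 0)] * padding)
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


if __name__ == "__main__":
    image = code_to_rgb_image("int main() { return 0; }")
    print(len(image), "rows x", len(image[0]), "columns")
```

The nested list has height x width x 3 layout, so it could be converted to an array or image object before being passed to a pre-trained vision model; that step is left out here because the record does not name the model or the input size used.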


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 3.0 United States