HACKING NEURAL NETWORKS


Date

2021-10

Authors

Sharipov, Rollan

Publisher

Nazarbayev University School of Engineering and Digital Sciences

Abstract

Today the number of applications that use neural networks grows every day. These applications span fields such as medicine, economics, and education. Their main purpose is to correctly predict or classify an input into a set of labels, for example selecting the right treatment for a patient or forecasting tomorrow's stock prices. Our reliance on such results requires that the application be safe from manipulation: if someone can alter the AI model used in an application so that it produces different results, the consequences can be serious. In addition, verification of neural network classifiers can be costly. This work studies how the accuracy of a neural network, such as a CNN, is affected when noise is inserted into it. The noise represents disruptive information that a potential attacker could add to the network in order to control its output. Using the changes in accuracy, we determine the correlation between classification mistakes and the magnitude of the noise. We used the LeNet architecture with three convolutional layers. To add noise, we applied a mask to each filter and added random normal noise to 10, 20, and 30 percent of the filter coefficients. The classification accuracy of the CNN with the added noise was computed for each noise level, and per-class accuracy was examined using a confusion heatmap. Finally, we implemented linear SVM, MLP, Random Forest, and Gradient Boosting classifiers to determine how accurately we can predict which images will and will not be misclassified.
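
For illustration, the noise-injection step described in the abstract could be implemented roughly as follows. This is a minimal sketch assuming a PyTorch LeNet-style model; the layer layout, the noise standard deviation, and the helper name add_filter_noise are assumptions made for this example, not details taken from the thesis.

    import torch
    import torch.nn as nn

    def add_filter_noise(conv: nn.Conv2d, fraction: float, std: float = 0.1) -> None:
        """Perturb a random `fraction` of a conv layer's filter
        coefficients with random normal (Gaussian) noise, in place."""
        with torch.no_grad():
            w = conv.weight
            # Binary mask selecting e.g. 10/20/30 percent of the coefficients.
            mask = (torch.rand_like(w) < fraction).float()
            w.add_(mask * torch.randn_like(w) * std)

    # Usage: perturb all three convolutional layers of a LeNet-like stack
    # at the 20 percent noise level, then re-evaluate classification accuracy.
    model = nn.Sequential(
        nn.Conv2d(1, 6, 5), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(6, 16, 5), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 120, 5), nn.ReLU(),
    )
    for layer in model:
        if isinstance(layer, nn.Conv2d):
            add_filter_noise(layer, fraction=0.20)

Repeating this for fraction values of 0.10, 0.20, and 0.30 and recomputing test accuracy at each level would yield the accuracy-versus-noise relationship the thesis studies.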

Keywords

AI, artificial intelligence, Type of access: Open Access, LeNet model, Dataset

Citation

Sharipov, R. (2021). Hacking Neural Networks (Unpublished master's thesis). Nazarbayev University, Nur-Sultan, Kazakhstan