HACKING NEURAL NETWORKS
Date
2021-10
Authors
Sharipov, Rollan
Publisher
Nazarbayev University School of Engineering and Digital Sciences
Abstract
The number of applications that use neural networks is increasing every day,
and their scope spans domains such as medicine, economics, and education. The
main purpose of such applications is to correctly predict or classify an input
into a set of labels, for example recommending the correct treatment for a
patient or forecasting appropriate values in tomorrow's stock market. Our
reliance on such results requires that the application be safe from
manipulation: if someone can change the AI model used in an application so
that it produces different results, the consequences can be serious. In
addition, verification of neural network classifiers can be costly. This work
studies how the accuracy of a neural network, such as a CNN, is affected when
noise is inserted into it. The noise represents disruptive information that a
potential attacker could add to the network in order to control its output.
Using the changes in accuracy, we determine the correlation between
classification mistakes and the magnitude of the noise. We used the LeNet
architecture with three convolutional layers. To add noise, we applied a mask
to each filter and added random normal noise to 10, 20, and 30 percent of the
filter coefficients. The classification accuracy of the CNN with the added
noise was computed for each noise level, and the accuracy of each output class
was examined using a confusion heatmap. Finally, we implemented linear SVM,
MLP, Random Forest, and Gradient Boosting classifiers to determine how
accurately we can predict which images will or will not be misclassified.
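A minimal sketch of the masked noise-injection step described above, assuming
NumPy; the function name, the noise scale sigma, and the LeNet-style filter
shape are illustrative, since the abstract does not specify them:

    import numpy as np

    def add_masked_noise(weights, fraction, sigma=0.1, rng=None):
        """Perturb a random subset of filter coefficients with Gaussian noise.

        fraction -- share of coefficients to corrupt (e.g. 0.1, 0.2, 0.3)
        sigma    -- noise standard deviation (assumed; not given in the abstract)
        """
        rng = np.random.default_rng() if rng is None else rng
        mask = rng.random(weights.shape) < fraction   # Bernoulli mask over coefficients
        noise = rng.normal(0.0, sigma, size=weights.shape)
        return weights + mask * noise

    # Example: corrupt 20 percent of a LeNet-style first conv layer (6 filters, 5x5).
    filters = np.random.randn(6, 1, 5, 5)
    noisy_filters = add_masked_noise(filters, fraction=0.2)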
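A minimal sketch of the misclassification-prediction step, assuming
scikit-learn; the per-image features and labels below are placeholders, since
the abstract does not state which features were used:

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import LinearSVC

    # X: one feature vector per image (placeholder pixel features);
    # y: 1 if the noisy CNN misclassified the image, 0 otherwise.
    X = np.random.randn(1000, 28 * 28)
    y = np.random.randint(0, 2, size=1000)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    models = {
        "linear SVM": LinearSVC(dual=False),
        "MLP": MLPClassifier(max_iter=300),
        "Random Forest": RandomForestClassifier(),
        "Gradient Boosting": GradientBoostingClassifier(),
    }
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        print(f"{name}: accuracy = {model.score(X_te, y_te):.3f}")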
Keywords
AI, artificial intelligence, LeNet model, Dataset
Type of access: Open Access
Citation
Sharipov, R. (2021). Hacking Neural Networks (Unpublished master's thesis). Nazarbayev University, Nur-Sultan, Kazakhstan.