PERFORMANCE OPTIMIZATION OF NEUROEVOLUTION FOR IMPROVED PROGNOSIS OF THE BREAST CANCER
Date
2020
Authors
Abdikenov, Beibit
Publisher
Nazarbayev University School of Engineering and Digital Sciences
Abstract
Cancer is the second largest cause of mortality, responsible for one in every six deaths globally. It has a significant socio-economic impact, and its global cost is estimated to be close to $150 billion. Breast cancer is the most common female cancer, and its high incidence places it among Kazakhstan’s most challenging public health problems. Advances in computing and sensing technologies and increased storage availability mean that vast quantities of data are now available. While the data should help practitioners understand what causes breast cancer and which treatment approaches work best, the number of oncologists who understand how to use it is limited. Accurate and reliable prognoses are increasingly difficult because of the enormous amounts of data about breast cancer and the low survival rates. The heterogeneity of the available data adds to the challenges for data analytics posed by sheer data volume. Moreover, categorical variables in the heterogeneous dataset require accurate pre-processing if improved interpretation is to support progress towards prognosis. This thesis also introduces advanced research on estimating missing values in databases.
This research developed a novel entity embedding scheme, based on neural networks, that effectively addresses the encoding of categorical variables with high cardinality. With the proposed scheme, categorical variables can be represented as real values in a high-dimensional space, which greatly improves their interpretation.
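To illustrate the general idea of entity embedding, the following is a minimal sketch and not the scheme developed in the thesis. It assumes each category is mapped to an integer index and then to a row of a table of real-valued vectors. In a real model those vectors would be learned by a neural network, while here they are only randomly initialised. The function names (`build_vocab`, `init_embeddings`, `embed`) and the example column values are hypothetical.

```python
import random

def build_vocab(values):
    """Map each distinct category to an integer index, in first-seen order."""
    vocab = {}
    for v in values:
        if v not in vocab:
            vocab[v] = len(vocab)
    return vocab

def init_embeddings(n_categories, dim, seed=0):
    """Create one small random vector per category.

    Assumption: in a trained model these rows would be updated by
    backpropagation; this sketch only initialises them.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(n_categories)]

def embed(values, vocab, table):
    """Replace each category with its real-valued vector (a row lookup)."""
    return [table[vocab[v]] for v in values]

# Hypothetical categorical column; the values are illustrative only.
column = ["ductal", "lobular", "ductal", "mucinous"]
vocab = build_vocab(column)
table = init_embeddings(len(vocab), dim=3)
vectors = embed(column, vocab, table)
```

Identical categories always map to the same vector, so a high-cardinality column needs only one table row per distinct value rather than one indicator column per value.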
Neuroevolution, a metaheuristic approach, is proposed in this work as a robust way of modelling prognosis from the breast cancer database. It also yields multiple equally viable DNN (Deep Neural Network) solutions, giving users many options to choose from. Neuroevolution performance was optimized using the EAs (Evolutionary Algorithms) MOEA/D, NSGA-III, and SPEA2. This research revealed a number of limitations in existing EAs, so the thesis proposes an improved EA, FIEA (Fuzzy Inspired Evolutionary Algorithm), which uses a fuzzy analytical approach to perform multi-criteria optimization and is also instrumental in selecting a final DNN model from the Pareto optimal set. This approach also provides insight into how the hyper-parameters control accuracy, sensitivity, F1 and other performance metrics, in contrast to traditional approaches, which apply DNNs as a black box. The improved interpretability can be used to advance or adjust the DNNs’ behaviour, and there is evidence that FIEA-optimized DNNs perform better than other algorithms described in the literature.
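The notion of a Pareto optimal set of candidate models can be sketched as follows. This is a generic dominance filter under stated assumptions, not FIEA or any of the EAs named above. It assumes every metric is to be maximised, and the model names and (accuracy, sensitivity) scores are hypothetical values chosen for illustration, not results from the thesis.

```python
def dominates(a, b):
    """True if a is at least as good as b on every metric and strictly better on one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(candidates):
    """Return the candidates whose scores no other candidate dominates."""
    return {
        name: scores
        for name, scores in candidates.items()
        if not any(dominates(other, scores)
                   for other_name, other in candidates.items()
                   if other_name != name)
    }

# Hypothetical (accuracy, sensitivity) scores for four candidate DNNs.
candidates = {
    "dnn_a": (0.90, 0.70),
    "dnn_b": (0.85, 0.80),
    "dnn_c": (0.84, 0.75),  # worse than dnn_b on both metrics
    "dnn_d": (0.80, 0.85),
}
front = pareto_front(candidates)
```

In this example `dnn_c` is dropped because `dnn_b` beats it on both metrics, while the remaining three models each trade accuracy against sensitivity. Choosing one model from such a set requires a further selection step, which is the role the abstract assigns to the fuzzy analytical approach.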
Keywords
Breast cancer, public health problems, sensing technologies, computing technologies, survival rates, heterogeneous dataset, Research Subject Categories::TECHNOLOGY, Research Subject Categories::MEDICINE