MACHINE UNLEARNING IN SEMI-SUPERVISED DECISION TREE BASED MODELS

Publisher

Nazarbayev University School of Engineering and Digital Sciences

Abstract

Under strict data privacy regulations such as the General Data Protection Regulation, it is important not only to delete data but also to mitigate its impact on machine learning models. This is challenging because machine learning models can memorize their training data. Retraining a model from scratch ensures that the data's impact is removed, but the computational cost of doing so underscores the need for more efficient removal techniques. Existing methods that modify the dataset or target simpler models are less effective for models with hierarchical structures, where removing even a single data point can change split decisions and lead to complete retraining. Moreover, existing methods are not well suited to the semi-supervised setting, in which both labeled and unlabeled data shape the model architecture. This thesis addresses these gaps by proposing a novel unlearning method for semi-supervised tree-based models. It first analyzes supervised unlearning methods, then adapts them to semi-supervised settings, and finally evaluates unlearning performance across multiple datasets. The goal is to enable efficient data removal that avoids full model retraining while preserving the model's predictive performance.
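
The abstract's point that removing a single data point can change a split decision can be illustrated with a minimal sketch. It is not the thesis's method: the Gini criterion, the exhaustive CART-style threshold search, the helper names `gini` and `best_split`, and the five-point toy dataset are all assumptions made for illustration only.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(xs, ys):
    """Return the threshold on one feature that minimizes weighted Gini impurity.

    Candidate thresholds are midpoints between consecutive distinct feature
    values. Hypothetical helper, assumed here for illustration.
    """
    n = len(xs)
    values = sorted(set(xs))
    best_threshold, best_score = None, float("inf")
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold


# Toy data (made up): one feature, binary labels.
xs = [1, 2, 3, 4, 5]
ys = [0, 1, 0, 1, 1]
before = best_split(xs, ys)

# "Unlearn" the single point (x=3, y=0) and search for the split again.
keep = [i for i in range(len(xs)) if i != 2]
after = best_split([xs[i] for i in keep], [ys[i] for i in keep])

print(before, after)
```

On this toy data the chosen threshold moves from 3.5 to 1.5 after one deletion, so the point at x=2 is routed to a different child. In a deeper tree, a changed root split of this kind would invalidate every subtree below it, which is the retraining cost the abstract refers to.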

Citation

Nurmaganbet, N. (2025). Machine Unlearning in Semi-Supervised Decision Tree Based Models. Nazarbayev University School of Engineering and Digital Sciences.

Creative Commons license

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 3.0 United States