CA-zkFL: Class-Aware Zero-Knowledge Federated Learning with Blockchain for In-Hospital Mortality Prediction
Publisher
Nazarbayev University School of Engineering and Digital Sciences
Abstract
Healthcare digitization has produced an immense volume of patient data, stored across many health centers. This data holds great promise for developing predictive models, particularly for in-hospital mortality prediction in cardiac intensive care units. However, patient information is strictly protected by privacy laws such as HIPAA and GDPR, so hospitals cannot share raw patient data to train models collaboratively. This is a major obstacle to developing accurate, reliable, efficient, and generalizable clinical prediction models. Federated learning has emerged as a privacy-preserving approach that allows many hospitals to collaboratively train predictive models without sharing raw sensitive information. Even so, conventional federated approaches have three significant limitations in healthcare settings: they apply uniform aggregation mechanisms that ignore class imbalance across hospitals, they provide no cryptographic guarantees of the integrity of model updates, and they rely on a single trusted central server with no transparent, auditable history of the aggregation process.
This study proposes a Class-Aware Zero-Knowledge Federated Learning framework (CA-zkFL) that addresses all three limitations in one system. The framework introduces class-aware aggregation, which assigns larger aggregation weights to hospitals with a higher local mortality prevalence in order to improve performance on the small minority class. It also uses Pedersen zero-knowledge proofs to cryptographically authenticate each hospital's model update before aggregation without revealing any local data. Finally, it incorporates a blockchain-based audit ledger that records all verified updates and the global model state in a decentralized, tamper-proof manner. The framework is evaluated on a cardiac ICU cohort drawn from the MIMIC-IV v2.2 clinical database. The cohort comprises 27,008 patient records with 29 clinical features and an in-hospital mortality rate of 9.32 percent, divided among four simulated hospital clients with a non-IID distribution. Experiments run over 20 federated communication rounds and are compared against five baselines: a single-hospital logistic regression model, FedAvg, FedProx, FedNova, and q-FedAvg.
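The abstract does not give the exact weighting formula, so the following is only a minimal sketch of what class-aware aggregation could look like. It assumes each hospital's weight is its sample count scaled by (1 + alpha × local mortality prevalence); the function names, the `alpha` parameter, and the example numbers are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def class_aware_weights(sizes: Sequence[int],
                        prevalences: Sequence[float],
                        alpha: float = 1.0) -> List[float]:
    """Return normalized aggregation weights, one per hospital.

    Assumed rule (not from the thesis): weight_k is proportional to
    n_k * (1 + alpha * p_k), where n_k is the local sample count and
    p_k the local mortality prevalence. alpha = 0 reduces to the
    sample-size weighting used by FedAvg.
    """
    raw = [n * (1.0 + alpha * p) for n, p in zip(sizes, prevalences)]
    total = sum(raw)
    return [r / total for r in raw]


def aggregate(updates: Sequence[Sequence[float]],
              weights: Sequence[float]) -> List[float]:
    """Weighted average of per-hospital parameter vectors."""
    dim = len(updates[0])
    return [sum(w * u[i] for w, u in zip(weights, updates))
            for i in range(dim)]


# Hypothetical example: four hospitals with unequal mortality prevalence.
sizes = [8000, 7000, 6000, 6008]
prevalences = [0.05, 0.08, 0.12, 0.14]
weights = class_aware_weights(sizes, prevalences, alpha=5.0)
global_update = aggregate([[0.1, 0.2], [0.0, 0.4], [0.3, 0.1], [0.2, 0.2]],
                          weights)
```

With `alpha` set to 0 the weights depend on sample counts alone; larger values shift more influence toward hospitals that see more of the minority class.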
CA-zkFL achieves the best performance on all clinically critical metrics: an accuracy of 0.9263, a precision of 0.5782, a recall of 0.7714, an F1-score of 0.6610, and a ROC-AUC of 0.9436, outperforming all baselines. The blockchain and zero-knowledge components are computationally light: proof generation takes 0.0067 seconds (6.6751 milliseconds) on average per client, and block mining adds about 0.01836 seconds (18.36 milliseconds) per round. These results show that class-aware aggregation, cryptographic verification, and decentralized auditing can be combined in a federated learning pipeline without impairing predictive performance. The proposed framework offers a realistic and reliable basis for privacy-sensitive mortality prediction in multi-hospital healthcare settings.
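The abstract does not describe how the Pedersen-based proofs are constructed. As a rough illustration of the underlying primitive only, the sketch below implements a Pedersen commitment over a toy group. The parameters `P`, `Q`, `G`, and `H` are assumptions chosen for readability and are far too small to be secure, and how a model update would be encoded as the committed integer (for example a hash or a quantized value) is likewise left as an assumption.

```python
import secrets

# Toy parameters, for illustration only (not from the thesis, not secure).
P = 2039   # safe prime, P = 2 * Q + 1
Q = 1019   # prime order of the subgroup of squares modulo P
G = 4      # generator of that subgroup (2^2 mod P)
H = 9      # second generator (3^2 mod P); in practice nobody may know log_G(H)


def commit(message, blinding=None):
    """Return (commitment, blinding) with commitment = G^m * H^r mod P."""
    if blinding is None:
        blinding = secrets.randbelow(Q)  # fresh random blinding factor
    c = (pow(G, message % Q, P) * pow(H, blinding % Q, P)) % P
    return c, blinding


def verify(commitment, message, blinding):
    """Check that (message, blinding) opens the given commitment."""
    return commitment == commit(message, blinding)[0]


# A client commits to an integer encoding of its update, then opens it later.
c, r = commit(42)
opened_ok = verify(c, 42, r)        # True: correct opening
tampered_ok = verify(c, 43, r)      # False: a changed message is rejected
```

Commitments of this form are additively homomorphic: multiplying two commitments gives a commitment to the sum of the messages under the sum of the blinding factors, which is one reason they are a common building block for verifiable aggregation.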
Citation
Rahim, A. (2026). CA-zkFL: Class-aware zero-knowledge federated learning with blockchain for in-hospital mortality prediction. Nazarbayev University School of Engineering and Digital Sciences
Creative Commons license
Except where otherwise noted, this item's license is described as Attribution-ShareAlike 3.0 United States
