EVALUATION OF ML INFERENCE WORKLOADS ON REDUCED REFRESH RATE DRAM
Publisher
Nazarbayev University School of Engineering and Digital Sciences
Abstract
This thesis investigates the potential benefits of reducing DRAM refresh rates to improve the performance of machine learning (ML) and neural network (NN) inference workloads. The increasing integration of ML and NN models across industries makes it necessary to optimize these models to use computing resources efficiently, particularly on devices with limited capabilities. To maintain data integrity, DRAM requires periodic refresh cycles, which have a substantial impact on power consumption and system efficiency. Lowering the refresh rate can therefore improve performance, at the cost of possible data loss. Unlike previously proposed approaches, which incur hardware or software overhead, this study adds no components to the memory controller. Preliminary findings indicate that NNs have a remarkable tolerance to the data loss caused by reduced refresh rates. Results show that DRAM refresh rates can be reduced by a factor of 15 to 150 relative to the usual refresh rate without significant impact on NN accuracy. Additionally, NNs showed 2.7% faster inference and consumed 5.6% less power at a refresh interval of 1 second.
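To make the idea of data loss in NN weights concrete, the following is a minimal illustrative sketch, not the method used in the thesis. It emulates retention errors by flipping bits of float32-encoded weights at a chosen bit error rate. The function names, the symmetric bit-flip model (real DRAM cells typically decay in one direction), and the 64 ms nominal refresh window (a common value for DDR-class DRAM) are all assumptions introduced here for illustration.

```python
import random
import struct


def flip_random_bits(weights, bit_error_rate, seed=0):
    """Return a copy of `weights` with each bit of its float32 encoding
    flipped independently with probability `bit_error_rate`.

    Hypothetical helper: a simplified, symmetric error model that stands in
    for the retention errors a reduced refresh rate might introduce.
    """
    rng = random.Random(seed)
    corrupted = []
    for w in weights:
        # Reinterpret the float32 value as a 32-bit unsigned integer.
        (bits,) = struct.unpack("<I", struct.pack("<f", w))
        for position in range(32):
            if rng.random() < bit_error_rate:
                bits ^= 1 << position
        # Reinterpret the (possibly corrupted) bits as a float32 again.
        (value,) = struct.unpack("<f", struct.pack("<I", bits))
        corrupted.append(value)
    return corrupted


def refresh_interval_ms(nominal_ms, reduction_factor):
    """Refresh interval after lowering the refresh rate by `reduction_factor`.

    `nominal_ms` is an assumed baseline (for example 64 ms), not a value
    taken from the abstract.
    """
    return nominal_ms * reduction_factor


if __name__ == "__main__":
    weights = [0.5, -1.25, 0.03125]
    print(flip_random_bits(weights, bit_error_rate=1e-3))
    print(refresh_interval_ms(64, 15), "ms")
```

In an experiment of this kind, the corrupted weights would be loaded back into a model and its accuracy compared against the uncorrupted baseline. Under the assumed 64 ms baseline, a 15-fold reduction corresponds to a 960 ms interval.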
Citation
Zhakiyev, Daniyar (2024). Evaluation of ML Inference Workloads on Reduced Refresh Rate DRAM. Nazarbayev University School of Engineering and Digital Sciences.
Creative Commons license
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States
