Energy-Efficient GPU Frequency Scaling Characterization for SLM Fine-Tuning on Embedded Platforms
Publisher
Nazarbayev University School of Engineering and Digital Sciences
Abstract
While embedded GPU dynamic voltage and frequency scaling (DVFS) is well-studied for inference workloads, fine-tuning exhibits different memory access patterns and runs 100–1000× longer, making inference-derived policies inappropriate. We present the first per-frequency characterization of transformer fine-tuning across three model scales (BERT-tiny 14M, BERT-base 110M, DeBERTa-xlarge 900M) on the NVIDIA Jetson AGX Orin, sweeping GPU frequencies from 306 to 1300 MHz on SST-2 and QNLI benchmarks. Across 77 experiments, optimal frequencies fall consistently in the 612–1020 MHz range, with production-scale models achieving 22–32% energy savings over the default governor. We develop a GPU-utilization-guided frequency selection algorithm requiring only 30 profiling steps that achieves a 1.5% average gap from the true optimum across 13 validation workloads, versus 21% energy waste for the default governor.
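The energy-savings and gap-from-optimum figures quoted in the abstract can be read as simple ratios over a per-frequency sweep. The sketch below is a minimal illustration of that arithmetic, not the author's implementation: it assumes energy is average power times runtime, it treats the highest frequency as a stand-in baseline (the behaviour of the default governor is not specified here), and every number in `SWEEP` is a hypothetical placeholder rather than a measurement from the paper. Only the 306 and 1300 MHz endpoints come from the abstract. The function names are likewise illustrative.

```python
# Illustrative sweep: frequency (MHz) -> (average power in W, runtime in s).
# All values are hypothetical placeholders, NOT results from the paper.
SWEEP = {
    306: (10.0, 300.0),
    612: (14.0, 160.0),
    816: (18.0, 120.0),
    1020: (24.0, 100.0),
    1300: (35.0, 90.0),
}


def pick_optimal(sweep):
    """Return the energy-minimal frequency and the per-frequency energy in joules."""
    # Assumption: energy = average power * runtime for each frequency setting.
    energies = {freq: power * runtime for freq, (power, runtime) in sweep.items()}
    best = min(energies, key=energies.get)
    return best, energies


def savings_pct(e_baseline, e_candidate):
    """Energy saved by the candidate setting, as a percentage of the baseline."""
    return 100.0 * (e_baseline - e_candidate) / e_baseline


def gap_pct(e_selected, e_optimal):
    """Extra energy of a selected setting, as a percentage of the true optimum."""
    return 100.0 * (e_selected - e_optimal) / e_optimal


if __name__ == "__main__":
    best, energies = pick_optimal(SWEEP)
    # Assumption: the 1300 MHz point stands in for the baseline governor.
    print(best, savings_pct(energies[1300], energies[best]))
    # Gap incurred if a selector had chosen 612 MHz instead of the optimum.
    print(gap_pct(energies[612], energies[best]))
```

With real measurements in place of the placeholders, `savings_pct` corresponds to the "energy savings over the default governor" style of figure, and `gap_pct` to the "gap from the true optimum" style of figure.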
Citation
Aidar, A. (2026). Energy-Efficient GPU Frequency Scaling Characterization for SLM Fine-Tuning on Embedded Platforms. Nazarbayev University School of Engineering and Digital Sciences.
Creative Commons license
Except where otherwise noted, this item's license is described as Attribution 3.0 United States
