Energy-Efficient GPU Frequency Scaling Characterization for SLM Fine-Tuning on Embedded Platforms



Publisher

Nazarbayev University School of Engineering and Digital Sciences

Abstract

While embedded GPU dynamic voltage and frequency scaling (DVFS) is well-studied for inference workloads, fine-tuning exhibits different memory access patterns and runs 100–1000× longer, making inference-derived policies inappropriate. We present the first per-frequency characterization of transformer fine-tuning across three model scales (BERT-tiny 14M, BERT-base 110M, DeBERTa-xlarge 900M) on the NVIDIA Jetson AGX Orin, sweeping GPU frequencies from 306 to 1300 MHz on SST-2 and QNLI benchmarks. Across 77 experiments, optimal frequencies fall consistently in the 612–1020 MHz range, with production-scale models achieving 22–32% energy savings over the default governor. We develop a GPU-utilization-guided frequency selection algorithm requiring only 30 profiling steps that achieves a 1.5% average gap from the true optimum across 13 validation workloads, versus 21% energy waste for the default governor.
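The energy comparisons in the abstract (savings over the default governor, gap from the true optimum) can be illustrated with a minimal sketch. It assumes each sweep point records a mean power draw and a runtime, and that energy is their product. The names `SweepPoint`, `optimal_point`, `energy_gap_pct`, and `savings_pct` are hypothetical, and the sample values are synthetic placeholders, not measurements from this work. The sketch does not reproduce the thesis's GPU-utilization-guided selection algorithm, whose details are not given here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SweepPoint:
    """One fine-tuning run at a fixed GPU frequency (hypothetical record layout)."""
    freq_mhz: int
    avg_power_w: float  # mean power over the run, in watts
    runtime_s: float    # wall-clock duration of the run, in seconds

    @property
    def energy_j(self) -> float:
        # Energy is approximated as mean power times runtime.
        return self.avg_power_w * self.runtime_s


def optimal_point(points: Sequence[SweepPoint]) -> SweepPoint:
    """Return the sweep point with the lowest total energy."""
    return min(points, key=lambda p: p.energy_j)


def energy_gap_pct(chosen: SweepPoint, points: Sequence[SweepPoint]) -> float:
    """Extra energy of `chosen` relative to the sweep optimum, in percent."""
    best = optimal_point(points).energy_j
    return 100.0 * (chosen.energy_j - best) / best


def savings_pct(baseline: SweepPoint, chosen: SweepPoint) -> float:
    """Energy saved by `chosen` relative to `baseline`, in percent."""
    return 100.0 * (baseline.energy_j - chosen.energy_j) / baseline.energy_j


# Synthetic placeholder values for illustration only; NOT data from the thesis.
SYNTHETIC_SWEEP = [
    SweepPoint(306, 10.0, 300.0),
    SweepPoint(612, 14.0, 160.0),
    SweepPoint(816, 17.0, 125.0),
    SweepPoint(1020, 22.0, 105.0),
    SweepPoint(1300, 30.0, 95.0),
]

if __name__ == "__main__":
    best = optimal_point(SYNTHETIC_SWEEP)
    # The highest-frequency point stands in for a baseline run here (an assumption).
    baseline = SYNTHETIC_SWEEP[-1]
    print(best.freq_mhz, round(savings_pct(baseline, best), 1))
```

Lowering the frequency reduces power but lengthens the run, so the energy minimum typically sits at an intermediate frequency rather than at either end of the sweep. That is the kind of trade-off the per-frequency characterization quantifies.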


Citation

Aidar, A. (2026). Energy-Efficient GPU Frequency Scaling Characterization for SLM Fine-Tuning on Embedded Platforms. Nazarbayev University School of Engineering and Digital Sciences.


Creative Commons license

Except where otherwise noted, this item's license is described as Attribution 3.0 United States