Patch and Model Size Characterization for On-device Efficient-ViTs on Small Datasets using 12 Quantitative Metrics
| dc.contributor.author | Jurn-Gyu Park | |
| dc.contributor.author | Aidar Amangeldi | |
| dc.contributor.author | Nail Fakhrutdinov | |
| dc.contributor.author | Meruyert Karzhaubayeva | |
| dc.contributor.author | Dimitrios Zorbas | |
| dc.date.accessioned | 2025-08-26T11:30:50Z | |
| dc.date.available | 2025-08-26T11:30:50Z | |
| dc.date.issued | 2025-01-01 | |
| dc.description.abstract | Vision transformers (ViTs) have emerged as a successful alternative to convolutional neural networks (CNNs) in deep learning (DL) applications for computer vision (CV), particularly excelling in accuracy on large-scale datasets within high-performance computing (HPC) or cloud domains. However, for resource-constrained mobile and edge AI devices, systematic and comprehensive investigations into the challenging multi-objective optimization of both device-agnostic (e.g., accuracy and model size) and device-related (e.g., latency, memory usage, and power/energy consumption) objectives are lacking. To address this gap, we first introduce five device-agnostic (DA) and seven device-related (DR) quantitative metrics, use them to thoroughly characterize the effects of ViT hyper-parameters, specifically patch size and model size, on small datasets, and then propose a simple yet effective optimization technique, the hierarchical and local (HelLo) tuning method, for efficient ViTs. The results show that our method achieves improvements of up to 85% in MACs, 67.2% in inference latency, 77.7% in training latency/time, 63.3% in GPU memory, 73.8% in energy consumption, and 263.0% in FoM (figure of merit), with minimal accuracy degradation (at most 2%). | en |
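The record itself does not define the paper's 12 metrics or its FoM. As an illustration only, the sketch below shows how a few of the abstract's device-agnostic (parameter count) and device-related (inference latency, peak GPU memory) metrics might be collected for ViT variants that differ in patch size and model size. The torchvision model set, the measurement loop, and the toy FoM ratio are assumptions, not the paper's actual formulation.

```python
# Illustrative sketch only: collects a few device-agnostic (parameter count)
# and device-related (inference latency, peak GPU memory) metrics for ViT
# variants differing in patch size (16 vs. 32) and model size (Base vs. Large).
# The model set, batch size, and the toy figure of merit are assumptions,
# not the metrics or FoM defined in the paper.
import time
import torch
from torchvision import models

VARIANTS = {
    "ViT-B/16": models.vit_b_16,
    "ViT-B/32": models.vit_b_32,
    "ViT-L/16": models.vit_l_16,
    "ViT-L/32": models.vit_l_32,
}

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

def characterize(builder, runs=20, batch=1):
    model = builder(weights=None).eval().to(DEVICE)
    x = torch.randn(batch, 3, 224, 224, device=DEVICE)
    params_m = sum(p.numel() for p in model.parameters()) / 1e6  # device-agnostic
    if DEVICE == "cuda":
        torch.cuda.reset_peak_memory_stats()
    with torch.no_grad():
        for _ in range(3):          # warm-up iterations
            model(x)
        if DEVICE == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        if DEVICE == "cuda":
            torch.cuda.synchronize()
    latency_ms = (time.perf_counter() - start) / runs * 1e3      # device-related
    mem_mb = torch.cuda.max_memory_allocated() / 2**20 if DEVICE == "cuda" else float("nan")
    return params_m, latency_ms, mem_mb

for name, builder in VARIANTS.items():
    p, lat, mem = characterize(builder)
    # Toy figure of merit (assumption): rewards smaller, faster models.
    fom = 1.0 / (p * lat)
    print(f"{name}: {p:.1f}M params, {lat:.1f} ms/batch, {mem:.0f} MiB peak, FoM={fom:.5f}")
```

In a characterization like the one the abstract describes, such per-variant measurements would be swept across patch sizes and model sizes and combined with accuracy and energy readings; the single ratio above stands in for whatever weighted FoM the paper actually uses.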
| dc.identifier.citation | Park Jurn-Gyu, Amangeldi Aidar, Fakhrutdinov Nail, Karzhaubayeva Meruyert, Zorbas Dimitrios. (2025). Patch and Model Size Characterization for On-Device Efficient-ViTs on Small Datasets Using 12 Quantitative Metrics. IEEE Access. https://doi.org/10.1109/access.2025.3536471 | en |
| dc.identifier.doi | 10.1109/access.2025.3536471 | |
| dc.identifier.uri | https://doi.org/10.1109/access.2025.3536471 | |
| dc.identifier.uri | https://nur.nu.edu.kz/handle/123456789/10371 | |
| dc.language.iso | en | |
| dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | |
| dc.rights | Open access | en |
| dc.source | IEEE Access (2025) | en |
| dc.subject | Computer science | en |
| dc.subject | Characterization (materials science) | en |
| dc.subject | Data mining | en |
| dc.subject | Materials science | en |
| dc.subject | Nanotechnology | en |
| dc.title | Patch and Model Size Characterization for On-device Efficient-ViTs on Small Datasets using 12 Quantitative Metrics | en |
| dc.type | article | en |
Files
Original bundle
- Name: Patch_and_Model_Size_Characterization_for_On-Device_Efficient-ViTs_on_Small_Datasets_Using_12_Quantitative_Metrics__5166c527.pdf
- Size: 5.61 MB
- Format: Adobe Portable Document Format