Energy vs. Performance of Static Speed Settings: An Argument for a Fine-grain GPU Power Measurement Methodology Why a Fine-grain GPU Power Measurement Methodology is Useful and How to Do It Why your Power Measurement Results may be Misleading


Ricardo Portillo,
pteller

Abstract

Graphics Processing Units (GPUs) offer significant computational power and per-core power efficiency in a compact form factor. However, as a whole, discrete-card GPUs are still relatively power-hungry devices, which complicates their use in energy-constrained high-performance computing (HPC) environments. As a result, power and energy optimization of GPUs for HPC is an important area of research. Unfortunately, although most GPUs offer onboard power-measurement capabilities that can assist such research, they offer only coarse-grain monitoring of runtime power usage. This impedes the identification of fine-grain power behavior that could help further refine GPU energy efficiency. In addition, for proprietary reasons, the implementation and accuracy of these onboard power-measurement capabilities are often obscure. In lieu of fine-grain and transparent power monitoring on the GPU board, we developed an external setup and methodology to capture GPU power at finer timescales: currently at the microsecond level, versus the millisecond scale of onboard measurement. As this paper describes, our custom setup focuses on accurate and automated monitoring, as well as flexible post-analysis, of black-box GPU application power behavior. We demonstrate this by analyzing the performance vs. energy tradeoffs of GPU speed settings for a wide array of GPU benchmarks. This analysis shows that the accuracy of energy calculations is significantly diminished by using coarser-grain measurements.
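To illustrate why sampling granularity can affect an energy estimate, the following minimal sketch integrates a power trace over time at two sampling rates. It is not the measurement setup described in this paper: the trace is synthetic, and every name and value in it (`synthetic_trace`, the 50 W baseline, the 150 W bursts, the 10 µs and 1 ms sampling intervals) is an assumption chosen only for illustration. The coarse trace is produced by simple point subsampling, whereas a real onboard sensor may average or filter its readings in ways that are not documented.

```python
def energy_joules(times_s, power_w):
    """Trapezoidal integration of a power trace (watts) over time (seconds)."""
    if len(times_s) != len(power_w):
        raise ValueError("times_s and power_w must have the same length")
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def downsample(times_s, power_w, step):
    """Keep every step-th sample, mimicking a slower point-sampling sensor."""
    return times_s[::step], power_w[::step]


def synthetic_trace(n_samples=10001, dt_s=1e-5, base_w=50.0, burst_w=150.0,
                    period_samples=400, burst_samples=50):
    """Hypothetical trace: a 0.5 ms burst every 4 ms on a flat baseline."""
    times = [i * dt_s for i in range(n_samples)]
    power = [burst_w if (i % period_samples) < burst_samples else base_w
             for i in range(n_samples)]
    return times, power


# Fine-grain trace: one sample every 10 microseconds over 0.1 s.
fine_t, fine_p = synthetic_trace()
# Coarse-grain trace: every 100th sample, i.e. one sample per millisecond.
coarse_t, coarse_p = downsample(fine_t, fine_p, step=100)

fine_energy = energy_joules(fine_t, fine_p)
coarse_energy = energy_joules(coarse_t, coarse_p)
relative_error = abs(coarse_energy - fine_energy) / fine_energy

print(f"fine-grain energy:   {fine_energy:.2f} J")
print(f"coarse-grain energy: {coarse_energy:.2f} J")
print(f"relative error:      {relative_error:.1%}")
```

In this synthetic case the 1 ms subsampling happens to land on a burst once every four samples, so it reports about 7.5 J against about 6.25 J for the fine-grain trace. The size and sign of the error depend entirely on how the sample points align with the bursts, which is why the figure says nothing about real hardware.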