Workload-Aware Opportunistic Energy Efficiency in Multi-FPGA Platforms
The continuous growth of big data applications with high computational and scalability demands has resulted in increasing popularity of cloud computing. Optimizing the performance and power consumption of cloud resources is therefore crucial to relieve the costs of data centers. In recent years, multi-FPGA platforms have gained traction in data centers as low-cost yet high-performance solutions particularly as acceleration engines, thanks to the high degree of parallelism they provide. Nonetheless, the size of data centers workloads varies during service time, leading to significant underutilization of computing resources while consuming a large amount of power, which turns out as a key factor of data center inefficiency, regardless of the underlying hardware structure. In this paper, we propose an efficient framework to throttle the power consumption of multi-FPGA platforms by dynamically scaling the voltage and hereby frequency during runtime according to prediction of, and adjustment to the workload level, while maintaining the desired Quality of Service (QoS). This is in contrast to, and more efficient than, conventional approaches that merely scale (i.e., power-gate) the computing nodes or frequency. The proposed framework carefully exploits a pre-characterized library of delay-voltage, and power-voltage information of FPGA resources, which we show is indispensable to obtain the efficient operating point due to the different sensitivity of resources w.r.t. voltage scaling, particularly considering multiple power rails residing in these devices. Our evaluations by implementing state-of-the-art deep neural network accelerators revealed that, providing an average power reduction of 4.0X, the proposed framework surpasses the previous works by 33.6
READ FULL TEXT