Studying Inter-Warp Divergence Aware Execution on GPUs

This letter quantitatively studies the benefits of inter-warp divergence aware execution on GPUs. To that end, it first proposes a novel approach to quantifying inter-warp divergence by measuring the temporal similarity in the execution progress of concurrent warps, which we call Warp Progression Similarity (WPS). Based on the WPS metric, the letter then proposes a WPS-aware Scheduler (WPSaS) to optimize GPU throughput. The aim is to manage inter-warp divergence so as to hide memory access latency and minimize resource conflicts and temporal under-utilization in compute units, allowing GPUs to achieve their peak throughput. The results demonstrate that WPSaS improves throughput by 10%, with a pronounced reduction in resource conflicts and temporal under-utilization.

Appeared in:
Computer Architecture Letters
Related Research: Computer Vision on GPU
