Optimizing GPU Architectures for General-Purpose Applications

The trend toward using GPUs for a diverse range of applications (e.g., vision and scientific computing) has raised new challenges. One of the main challenges is inter-warp conflicts in shared resources, including the instruction cache (I$), the data cache (D$), and the compute units. These conflicts are mainly caused by inter-warp divergence, which is uneven execution progress across concurrent warps. Excessive inter-warp divergence may prevent GPUs from achieving their peak throughput. This motivates the need for approaches that manage inter-warp divergence, avoiding I$ conflicts, for divergence-sensitive benchmarks.

Function-Level Processor (FLP)

The success of an FLP depends on the selection of FBs and their potential composition, which makes this selection a crucial aspect in defining flexibility and usability. New research is needed that shifts from optimizing individual applications to identifying common functions that are present in many applications of a market. The challenge is to define a small enough set of sufficiently composable functions that provide meaningful computation for a given market. Overall, a minimal but sufficiently contiguous set of FBs is desired.

Embedded Vision on Zynq

While vision computing is already a strong research focus, embedded deployment of vision algorithms is still at a fairly early stage. Vision computing demands extremely high performance, coupled with very low power and low cost. Embedded vision platforms must offer extreme compute performance while consuming very little power (often less than 1 W) and still be sufficiently programmable. These conflicting goals (high performance, low power) impose massive challenges in architecting embedded vision platforms.
