
A GPU-based Algorithm-specific Optimization for High-performance Background Subtraction

Background subtraction is an essential first stage in many vision applications, separating foreground pixels from the background scene, with Mixture of Gaussians (MoG) being a widely used implementation choice. MoG's high computation demand renders a real-time single-threaded realization infeasible. With its pixel-level parallelism, deploying MoG on parallel architectures such as a Graphics Processing Unit (GPU) is promising. However, MoG poses challenges for a GPU: it contains significant control flow (which can reduce GPU efficiency) as well as a significant memory bandwidth demand.
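To make the per-pixel parallelism concrete, the following is a minimal single-pixel sketch of a Stauffer–Grimson-style MoG update, not the paper's implementation. The constants (K = 3 Gaussians, learning rate `ALPHA`, the 2.5-sigma match test, the simplified constant `rho`, and the weight-based background test) are all illustrative assumptions; every pixel runs this independently, which is exactly what maps well onto a GPU thread per pixel.

```c
#define K 3               /* Gaussians per pixel (assumed) */
#define ALPHA 0.01f       /* learning rate (assumed) */
#define MATCH_SIGMAS 2.5f /* match if within 2.5 standard deviations */
#define BG_THRESHOLD 0.7f /* portion of weight considered background */

typedef struct {
    float weight, mean, var;
} Gaussian;

/* Update one pixel's mixture with a new grayscale sample x; returns 1 if
   the sample is classified as foreground, 0 if background. */
int mog_update(Gaussian g[K], float x)
{
    int matched = -1;
    for (int k = 0; k < K; ++k) {
        float d = x - g[k].mean;
        if (matched < 0 && d * d < MATCH_SIGMAS * MATCH_SIGMAS * g[k].var) {
            matched = k;
            /* simplified: the full model uses rho = alpha * N(x|mu,var) */
            float rho = ALPHA;
            g[k].mean   += rho * d;
            g[k].var    += rho * (d * d - g[k].var);
            g[k].weight += ALPHA * (1.0f - g[k].weight);
        } else {
            g[k].weight *= (1.0f - ALPHA); /* decay unmatched weights */
        }
    }
    if (matched < 0) {
        /* no match: replace the least-weighted Gaussian with one centered on x */
        int w = 0;
        for (int k = 1; k < K; ++k)
            if (g[k].weight < g[w].weight) w = k;
        g[w].mean = x;
        g[w].var = 900.0f;   /* large initial variance (assumed) */
        g[w].weight = 0.05f; /* small initial weight (assumed) */
        matched = w;
    }
    /* renormalize weights to sum to 1 */
    float sum = 0.0f;
    for (int k = 0; k < K; ++k) sum += g[k].weight;
    for (int k = 0; k < K; ++k) g[k].weight /= sum;
    /* simplified background test: a weak matched Gaussian means foreground */
    return g[matched].weight < (1.0f - BG_THRESHOLD);
}
```

Note the data-dependent branching (matched vs. unmatched, replacement of the weakest Gaussian): this is precisely the control flow that the abstract identifies as a GPU efficiency hazard.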

In this paper, we propose a GPU implementation of Mixture of Gaussians (MoG) that surpasses real-time processing for full HD video (1080p at 60 Hz). The paper describes step-wise optimizations, starting from general GPU optimizations (such as memory coalescing and overlapping computation with communication), through algorithm-specific optimizations including control-flow reduction and register-usage optimization, to a windowed optimization utilizing shared memory. For each optimization, the paper evaluates the performance potential and identifies architectural bottlenecks. Our CUDA-based implementation improves performance over a sequential implementation by 57x, 97x, and 101x through the general, algorithm-specific, and windowed optimizations respectively, without any impact on output quality.
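The control-flow reduction mentioned above typically means replacing divergent branches with predicated arithmetic, so that all threads in a warp execute the same straight-line instructions regardless of per-pixel data. A hypothetical sketch of the idea on the MoG weight update (not the paper's actual kernel; function names and the constant learning rate are illustrative):

```c
/* Branchy form: threads in a warp diverge when `match` differs per pixel. */
float update_branchy(float w, int match, float alpha)
{
    if (match)
        return w + alpha * (1.0f - w);   /* grow the matched weight */
    else
        return (1.0f - alpha) * w;       /* decay an unmatched weight */
}

/* Predicated form: algebraically identical result, single code path.
   Matched:   (1-a)w + a  ==  w + a(1-w)
   Unmatched: (1-a)w + 0                  */
float update_predicated(float w, int match, float alpha)
{
    float m = (float)match;              /* predicate as 0.0f or 1.0f */
    return (1.0f - alpha) * w + m * alpha;
}
```

Both forms compute the same value; the predicated one trades a branch for a multiply-add, which is usually the better bargain on SIMT hardware.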

Appeared in:
International Conference on Parallel Processing
Presentation Place:
Minneapolis, MN
Related Research:  Computer Vision on GPU
