The detection and tracking of people in meeting contexts using image processing

Algethami, Nahlah
This thesis presents image processing techniques for object detection and tracking using a top-view camera. Finding reliable object features is challenging for any image processing-based detection and tracking method, especially in environments where distinctive object features are unavailable or change significantly (e.g., cluttered backgrounds, objects of different sizes). Motion-based detection methods offer one of the most robust approaches for detecting moving objects. However, varying object speeds and the requirement for a background model of the scene are the main challenges for motion-based detection approaches. We have developed a novel motion detection algorithm, Adaptive Accumulated Frame Differencing (AAFD), which detects objects moving at different speeds using adaptive temporal window sizes and needs no background model. In quantitative and qualitative evaluations, our method achieved higher accuracy (F-measure) than the state-of-the-art mixture of Gaussians (MOG) method. We also propose a novel track-by-detection algorithm that combines AAFD motion detection with Shi-Tomasi corner detection to track people robustly, particularly when their distinctive features are not available. This approach allowed robust blunder recovery and reduced tracking drift. We show that our approach achieves excellent performance in terms of the multiple object tracking accuracy (MOTA) metric, and that it is particularly robust to initialisation differences when compared with baseline and state-of-the-art trackers. Using videos from the general-purpose Online Tracking Benchmark (OTB), we also demonstrate that our tracker performs strongly in the presence of background clutter, deformation and illumination variation.
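The abstract does not give the rule AAFD uses to adapt its temporal window, so the following is only a minimal sketch of plain accumulated frame differencing with a fixed window, the idea AAFD builds on. The function names, the `window` parameter and the `diff_threshold` value are illustrative assumptions, not the thesis's implementation.

```python
from typing import List, Sequence

Frame = List[List[int]]  # greyscale image as rows of intensities (0-255)


def accumulated_frame_difference(
    frames: Sequence[Frame], window: int, diff_threshold: int = 15
) -> Frame:
    """Count, per pixel, how many of the last `window` consecutive-frame
    differences exceed `diff_threshold`. No background model is needed.

    NOTE: `window` is fixed here; AAFD adapts it, by a rule not stated
    in the abstract.
    """
    if window < 1 or len(frames) < window + 1:
        raise ValueError("need at least window + 1 frames")
    recent = frames[-(window + 1):]
    height, width = len(recent[0]), len(recent[0][0])
    acc = [[0] * width for _ in range(height)]
    for prev, curr in zip(recent, recent[1:]):
        for y in range(height):
            for x in range(width):
                if abs(curr[y][x] - prev[y][x]) > diff_threshold:
                    acc[y][x] += 1
    return acc


def motion_mask(acc: Frame, min_hits: int = 1) -> List[List[bool]]:
    """Mark pixels that changed in at least `min_hits` frame pairs."""
    return [[count >= min_hits for count in row] for row in acc]


# A bright pixel moving one step per frame along a 1x4 image.
frames = [
    [[200, 0, 0, 0]],
    [[0, 200, 0, 0]],
    [[0, 0, 200, 0]],
    [[0, 0, 0, 200]],
]
short = accumulated_frame_difference(frames, window=1)
long = accumulated_frame_difference(frames, window=3)
```

In this toy sequence the one-frame window only sees the latest step of the motion, while the three-frame window accumulates evidence along the whole path, which suggests why the window length matters when objects move at different speeds.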
The high-level knowledge derived from our coarse-to-fine tracker is then used to build a robust background model. The model is particularly capable of handling foreground objects that are present from the start and in most of the video frames, without erroneously incorporating slow or intermittently moving objects into the background. These are key issues discussed in the background subtraction literature. We achieve this by improving the traditional running average method: high-level knowledge of object motion serves as an analysis step that distinguishes foreground from background pixels, and the result is fed back into the per-pixel model. Experimental results show that our method outperforms the state-of-the-art mixture of Gaussians and running average background subtraction methods.
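The feedback idea can be illustrated with a selective running average, in which the per-pixel update is skipped wherever a tracker reports foreground. This is a sketch under assumptions: the abstract does not state the exact update rule, and the binary `foreground` mask, the function name and the learning rate `alpha` are hypothetical choices made for illustration.

```python
from typing import List

Image = List[List[float]]  # greyscale image as rows of intensities
Mask = List[List[bool]]    # True where a tracked object covers the pixel


def update_background(
    background: Image, frame: Image, foreground: Mask, alpha: float = 0.05
) -> Image:
    """Running-average update B <- (1 - alpha) * B + alpha * I, applied
    only to pixels the mask marks as background. Foreground pixels keep
    their previous background value, so a slow or stationary tracked
    object is not blended into the model.
    """
    return [
        [
            b if fg else (1.0 - alpha) * b + alpha * f
            for b, f, fg in zip(b_row, f_row, fg_row)
        ]
        for b_row, f_row, fg_row in zip(background, frame, foreground)
    ]


# The left pixel is covered by a tracked person and stays frozen;
# the right pixel is background and drifts towards the new frame.
model = update_background(
    background=[[100.0, 100.0]],
    frame=[[200.0, 200.0]],
    foreground=[[True, False]],
    alpha=0.5,
)
```

With an unconditional running average, both pixels would move towards 200, which is how a person who sits still for long enough ends up absorbed into the background.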
NUI Galway
Attribution-NonCommercial-NoDerivs 3.0 Ireland