Dynamic Object Tracking and 3-D Visualization from Big Visual Data
We propose an automatic system that dynamically tracks video objects (humans and vehicles) and creates their 3-D visualization from big visual data. Here, big visual data means that the videos are collected from either static or mobile surveillance cameras. Our goal is to track a video object within such a surveillance camera network, which requires carefully handling several tracking scenarios. In this work, we focus on tracking under a single static camera, tracking under a single moving camera, and tracking across multiple moving cameras. For tracking under a single static camera, our approach is mainly based on a constrained multiple-kernel (CMK) tracking framework. For human tracking, the system adopts a Kalman filter to predict and refine the tracking results, and a pre-trained human detector is further applied to resolve initial merging issues. For human tracking across multiple static cameras, we propose a self-organized and scalable multiple-camera tracking system that tracks humans across cameras with nonoverlapping views. For vehicle tracking, our approach regards each patch of a 3-D vehicle model as a kernel and tracks the kernels under constraints derived from the 3-D geometry of the vehicle model; meanwhile, a kernel density estimator is designed to fit the 3-D vehicle model during tracking. By applying CMK tracking facilitated with the 3-D vehicle model, vehicles can be tracked efficiently and located precisely. For tracking under a single moving camera, we propose a robust moving-platform-based object tracking system and apply it to human tracking. Our work effectively integrates Visual Simultaneous Localization and Mapping (V-SLAM), pedestrian detection, ground plane estimation, and kernel-based tracking techniques.
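The Kalman-filter predict/refine step mentioned above can be sketched as follows. This is a minimal illustration with a constant-velocity motion model; the state layout, noise covariances, and time step are assumptions for the example, not the dissertation's actual parameters.

```python
import numpy as np

# State x = [px, py, vx, vy]: 2-D target position and velocity.
dt = 1.0
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)   # constant-velocity motion model (assumed)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # only position is observed
Q = np.eye(4) * 1e-2                        # process noise (illustrative value)
R = np.eye(2) * 1.0                         # measurement noise (illustrative value)

def predict(x, P):
    """Propagate state and covariance one frame ahead."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """Refine the prediction with the measured position z."""
    y = z - H @ x                    # innovation
    S = H @ P @ H.T + R              # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)   # Kalman gain
    return x + K @ y, (np.eye(4) - K @ H) @ P

# One frame: predict, then correct with a noisy detection at (10.4, 5.2).
x, P = np.array([10.0, 5.0, 1.0, 0.0]), np.eye(4)
x, P = predict(x, P)                 # prediction moves to (11.0, 5.0)
x, P = update(x, P, np.array([10.4, 5.2]))
```

The refined position lands between the motion-model prediction and the detector's measurement, weighted by the Kalman gain; this is how the prediction "refines" noisy per-frame tracking output.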
The proposed system detects pedestrians in recorded video frames and tracks them in the V-SLAM-inferred 3-D space via a tracking-by-detection scheme. To efficiently associate the detected pedestrians frame by frame, we propose a novel tracking framework that combines CMK tracking with the estimated 3-D (depth) information to globally optimize the data association between consecutive frames. By taking advantage of both the appearance model and the 3-D information, the proposed system not only achieves high effectiveness but also efficiently handles occlusion during tracking. Building on the results of tracking under a single moving camera, we propose a new framework to track on-road pedestrians across multiple driving recorders. Specifically, we treat the problem as a multi-label classification task, determining whether a specific pedestrian belongs to one or several cameras' fields of view based on the association likelihood of the tracked pedestrians. The likelihood is calculated from the pedestrians' motion cues and appearance features, which are transformed via brightness transfer functions obtained from available spatially overlapping views to compensate for the diversity of the cameras. When a pedestrian leaves a camera's field of view, the proposed framework predicts and interpolates possible moving trajectories with the help of an open map service that provides routing information. Moreover, based on the GPS locations, we can reconstruct a 3-D visualization in a virtual real-world environment to show the dynamic scenes of the recorded videos.
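A brightness transfer function (BTF) of the kind used above for cross-camera appearance compensation is commonly estimated by matching the cumulative brightness histograms of the same object seen by two cameras. The sketch below shows this cumulative-histogram approach; the function name, bin count, and synthetic data are assumptions for illustration, not the dissertation's exact procedure.

```python
import numpy as np

def brightness_transfer_function(src_values, dst_values, bins=256):
    """Estimate a BTF mapping source-camera brightness levels to
    destination-camera levels by matching the two cumulative
    histograms (histogram-specification style)."""
    src_hist, _ = np.histogram(src_values, bins=bins, range=(0, 256))
    dst_hist, _ = np.histogram(dst_values, bins=bins, range=(0, 256))
    src_cdf = np.cumsum(src_hist) / max(src_hist.sum(), 1)
    dst_cdf = np.cumsum(dst_hist) / max(dst_hist.sum(), 1)
    # For each source level, pick the destination level whose
    # cumulative probability first reaches the source's.
    btf = np.searchsorted(dst_cdf, src_cdf, side="left")
    return btf.clip(0, bins - 1)  # btf[v] = compensated level for source level v

# Synthetic example: camera B renders the same pedestrian ~30 levels brighter.
rng = np.random.default_rng(0)
cam_a = rng.integers(40, 120, size=5000)   # brightness samples from camera A
cam_b = cam_a + 30                         # same samples as seen by camera B
btf = brightness_transfer_function(cam_a, cam_b)
```

After estimation, `btf` is applied to camera-A appearance features before comparing them with camera-B features, so the association likelihood is not distorted by per-camera photometric differences.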
- Electrical engineering