Robust Real-Time Image Processing Through Dynamic Mode Decomposition

Grosek, Jacob

Files

Grosek_washington_0250E_12090.pdf (30.23 MB)

Date

2013-11-14

Abstract

In many areas of research, robust and efficient data mining and data-driven modeling have become essential to advancing our understanding of the increasingly complex, nonlinear relations often embedded in cluttered or corrupted data. Applications in the industry and technology sectors demand high-performance algorithms that can tease out important information in real time and help humans interact better with computers. Researchers are starting to explore sparse modeling techniques that sample sparsely and project onto low-dimensional bases, where the essential information that drives the system can be extracted without having to build and run costly models of the full dynamics of the entire system. This work focuses on the development of data analysis techniques that can be applied to the gesture recognition problem, with the goal of improving performance. One of the most exciting results of the research presented here is the use of dynamic mode decomposition (DMD) as a viable and effective method for subtracting the background from videos of moving objects. The method is introduced and compared against the current standard method in the background subtraction field, namely robust principal component analysis (RPCA). The computational speed and accuracy of the DMD separation method offer a comfortable margin for real-time, online data processing. The fact that DMD approximates RPCA's low-rank/sparse separation of data matrices opens up possible applications in the time-scale separation of dynamics, and even in the data compression and sparse sensing fields. It will be shown that through enhanced pre-processing techniques, robust, accurate, and real-time gesture recognition can be achieved, even on heavily down-sampled images where gestures are nearly indistinguishable to the human eye.
This contrasts with the current trend in the computer vision field, which advocates more complicated and computationally expensive feature selection and statistical learning routines that become problem-specific and often still suffer from gesture irregularity in the data. Gesture recognition can be further improved by selecting the most appropriate gestures for the given task, accounting for ergonomic, vernacular, and algorithmic considerations. A new method, which combines these subjective and objective constraints into a single measure that indicates which gestures are best for the given application, is developed and tested on a real hand gesture recognition problem. Insights from the way that gestures render themselves in feature space help guarantee that this best-lexicon methodology will be useful and reliable for realistic recognition problems. In the appendix of this work, the theoretical background and practical implementation techniques needed to model a high-power Raman fiber laser/amplifier system that includes stimulated Brillouin scattering (SBS) and four-wave mixing (FWM) effects are delineated, and the strengths and weaknesses of the computer model are discussed.
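
To make the DMD-based low-rank/sparse separation mentioned in the abstract more concrete, the following is a minimal sketch of the textbook exact-DMD procedure applied to background subtraction. It is not taken from the thesis: the function name `dmd_background`, the parameters `rank`, `dt`, and `tol`, and the synthetic test data are all assumptions made for illustration. The idea is that DMD modes whose continuous-time frequency is close to zero barely change between frames and can be treated as background, while the remainder is treated as foreground.

```python
import numpy as np

def dmd_background(frames, rank=5, dt=1.0, tol=1e-2):
    """Split a video into slow (background) and fast (foreground) parts.

    frames: (n_pixels, n_frames) array, one flattened frame per column.
    rank, dt, tol: illustrative parameters (assumptions, not from the thesis).
    """
    X1, X2 = frames[:, :-1], frames[:, 1:]           # time-shifted snapshot pairs
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    r = min(rank, int(np.sum(s > 1e-10 * s[0])))     # drop numerically zero directions
    U, s, Vh = U[:, :r], s[:r], Vh[:r]
    V = Vh.conj().T

    # Reduced linear operator: A_tilde = U^* X2 V S^{-1}
    A_tilde = U.conj().T @ X2 @ V / s
    eigvals, W = np.linalg.eig(A_tilde)
    modes = (X2 @ V / s) @ W                         # exact DMD modes
    omega = np.log(eigvals.astype(complex)) / dt     # continuous-time frequencies

    # Mode amplitudes from a least-squares fit to the first frame
    amps = np.linalg.lstsq(modes, frames[:, 0].astype(complex), rcond=None)[0]

    slow = np.abs(omega) < tol                       # near-stationary modes
    t = np.arange(frames.shape[1]) * dt
    dynamics = amps[slow, None] * np.exp(np.outer(omega[slow], t))
    background = (modes[:, slow] @ dynamics).real
    foreground = frames - background
    return background, foreground

# Synthetic example: a static pattern plus a rotating (oscillating) component.
rng = np.random.default_rng(0)
static = rng.random(50)
f1, f2 = rng.random(50), rng.random(50)
k = np.arange(30)
frames = (static[:, None]
          + f1[:, None] * np.cos(0.5 * k)
          + f2[:, None] * np.sin(0.5 * k))

background, foreground = dmd_background(frames, rank=3)
```

On this synthetic data the static pattern ends up in `background` and the oscillating part in `foreground`. Real video would need additional care that this sketch leaves out, such as choosing the truncation rank and handling negative residual pixel values.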