Action Recognition and Prediction with Applications to Medical Diagnosis and Daily Living

Abstract

This research explores ways to improve people’s lives using information from static or egocentric (wearable) cameras. This information can be used diagnostically or preventively. In the diagnostic case, we consider medical applications that make the diagnosis procedure more objective while disturbing the person minimally. An example scenario is detecting hand tremors in people with Parkinson’s disease or similar illnesses. This can be done with a static camera, without having the patient wear motion sensors or other tools. Our research showed that such a system can be built reliably using only a static camera observing the patient.

With recent technological developments, egocentric cameras have become part of our lives. They are designed as tiny cameras that can be worn without disturbing the person. To investigate the possible information gain from egocentric cameras, we used a multi-camera setting that combines an egocentric camera with multiple static cameras. Our research showed that, when fused correctly, information from different types of cameras increases action recognition accuracy. The model we proposed for this task is also suitable for other multi-modal settings. To demonstrate its generality, we also tested it in a setting with multiple static cameras and obtained state-of-the-art results. Our model learns the importance of each camera in recognizing the actions, and it can also be used to direct scenes automatically. We created examples of automatically directed scenes to illustrate the concept.

We also addressed the problem of improving people’s lives in a preventive way using egocentric cameras. In our work, preventive refers to the general notion of reminders that can prevent people from making mistakes that cause problems. For example, a person who leaves a room while the stove is on might be reminded to turn it off. We proposed a notification decision mechanism that reasons about interdependencies between actions, checks at every time step whether there is a missing action that should be completed before the ongoing one ends, calculates a cost for missing it, and uses this cost to make a notification decision. Such a notification system requires recognizing past actions and predicting the ongoing action while segmenting the activity observed so far. For this purpose, we proposed a model that uses standard features and accomplishes these three tasks successfully. We showed promising results on the extremely challenging task of issuing correct and timely reminders on a new egocentric dataset.
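
The notification decision described in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual model: the `Dependency` rule, the `p_ending` estimate, the cost formula (probability of ending times an assumed miss cost), and the threshold are all hypothetical choices made only to show the flow of "find a missing prerequisite, compute a cost, decide whether to notify".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """Hypothetical rule: `required` should be completed before `trigger` ends."""
    required: str
    trigger: str
    miss_cost: float  # assumed cost of forgetting `required`


def notification_decision(completed, ongoing, p_ending, deps, threshold=0.5):
    """Return (missing_action, expected_cost) pairs that warrant a reminder.

    completed : set of action labels recognized so far
    ongoing   : label predicted for the action currently in progress
    p_ending  : estimated probability that the ongoing action is about to end
    """
    reminders = []
    for dep in deps:
        # Skip rules that do not apply or whose prerequisite is already done.
        if dep.trigger != ongoing or dep.required in completed:
            continue
        # Assumed cost model: the closer the ongoing action is to ending,
        # the more expensive it is to stay silent.
        expected_cost = p_ending * dep.miss_cost
        if expected_cost >= threshold:
            reminders.append((dep.required, expected_cost))
    return reminders


# Stove example from the abstract, evaluated at three hypothetical time steps.
deps = [Dependency("turn_off_stove", "leave_room", miss_cost=1.0)]
early = notification_decision({"turn_on_stove"}, "leave_room", 0.2, deps)
late = notification_decision({"turn_on_stove"}, "leave_room", 0.9, deps)
done = notification_decision({"turn_on_stove", "turn_off_stove"}, "leave_room", 0.9, deps)
```

In this sketch, no reminder is issued while the person has only started to leave (`early`), one is issued once leaving is nearly complete (`late`), and none is issued if the stove has already been turned off (`done`).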

Description

Thesis (Ph.D.)--University of Washington, 2015
