Deep Learning Methods for Video-Based Human Activity Recognition in Industrial Settings
| dc.contributor.advisor | Banerjee, Ashis G | |
| dc.contributor.author | Parsa, Behnoosh | |
| dc.date.accessioned | 2021-03-19T22:56:35Z | |
| dc.date.issued | 2021-03-19 | |
| dc.date.submitted | 2020 | |
| dc.description | Thesis (Ph.D.)--University of Washington, 2020 | |
| dc.description.abstract | With the rising interest in assistive robots and smart surveillance systems, we need powerful perception mechanisms capable of describing the events in a scene. However, building accurate perception models is not trivial, since even a single perception task admits an unlimited number of possible scenarios. Developing analytically derived models for such systems seems too optimistic; hence, supervised learning, a sub-field of function approximation, has become very popular in robotic perception. Supervised learning is the task of learning a function that maps an input to an output based on example input-output pairs. Scene understanding is even more involved when it comes to solving Human Action Recognition (HAR) problems. In HAR, the task is to classify human activities from an image or to determine the atomic actions composing an activity in a video. In video-based HAR, there are exponentially many ways in which humans can perform the same task. Moreover, the variety in posture and the speed at which people perform activities makes solving HAR tasks even more challenging. Therefore, models should be designed to learn the common underlying spatial and temporal properties of human activity in order to generalize. This thesis is dedicated to designing perception models for recognizing human actions and determining the ergonomic risk associated with them. Specifically, Part I focuses on solving the Human Activity Segmentation (HAS) problem in long videos, that is, the task of semantically segmenting long videos into distinct actions in an offline framework. In Part II, we present our designs for solving online HAR problems, which recognize human activities in an observed batch of frames. Since the performance of computer vision algorithms also depends on the quality and relevance of the training data, in Part I we introduce a new dataset for an indoor object manipulation task, called the University of Washington Indoor Object Manipulation (UW-IOM) dataset. | |
| dc.embargo.lift | 2022-03-19T22:56:35Z | |
| dc.embargo.terms | Delay release for 1 year -- then make Open Access | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.other | Parsa_washington_0250E_22404.pdf | |
| dc.identifier.uri | http://hdl.handle.net/1773/46847 | |
| dc.language.iso | en_US | |
| dc.rights | CC BY-NC | |
| dc.subject | Computer Vision | |
| dc.subject | Deep Learning | |
| dc.subject | Graph Convolutional Networks | |
| dc.subject | Human Activity Recognition | |
| dc.subject | Human Postural Assessment | |
| dc.subject | Video Semantic Segmentation | |
| dc.subject | Mechanical engineering | |
| dc.subject.other | Mechanical engineering | |
| dc.title | Deep Learning Methods for Video-Based Human Activity Recognition in Industrial Settings | |
| dc.type | Thesis |
Files
Original bundle (1 file)
- Parsa_washington_0250E_22404.pdf (16.87 MB, Adobe Portable Document Format)