Discovering and Segmenting Unseen Objects for Robot Perception

Author

Xie, Christopher

Abstract

Perception lies at the core of the ability of a robot to function in the real world. As robots become more ubiquitously deployed in unstructured environments such as homes and offices, it is inevitable that robots will encounter objects that they have not observed before. Thus, in order to interact effectively with such environments, building a robust object recognition module for unseen objects is valuable. Additionally, it can facilitate downstream tasks including grasping, re-arrangement, and sorting of unseen objects. This is a challenging perception task since the robot needs to learn the concept of “objects” and generalize it to unseen objects. In this thesis, we propose different methods for learning such perception systems by exploiting different visual cues and learning data without manual annotations. First, we investigate the use of motion cues for this problem. We develop a novel neural network architecture, PT-RNN, that leverages optical flow by casting the problem as object discovery via foreground motion clustering from videos. This network learns to produce pixel-trajectory embeddings such that clustering them results in segmenting the unseen objects into different instance masks. Next, we introduce UOIS-Net, which separately leverages RGB and depth for unseen object instance segmentation. UOIS-Net is able to learn from synthetic RGB-D data where the RGB is non-photorealistic, and provides state-of-the-art unseen object instance segmentation results in tabletop environments, which are common to robot manipulation. Lastly, we investigate the use of relational inductive biases in the form of graph neural networks in order to better segment unseen object instances. We introduce a novel framework, RICE, that refines a provided instance segmentation by utilizing a graph-based representation.
We conclude with a discussion of the proposed work and future directions, which includes a vision of future research that leverages the proposed work to bootstrap a lifelong learning mechanism that renders unseen objects as no longer unseen.

Description

Thesis (Ph.D.)--University of Washington, 2021
