Topological Representations for Visual Object Recognition in Unseen Indoor Environments

Samani, Ekta Umesh

dc.contributor.advisor	Banerjee, Ashis G
dc.contributor.author	Samani, Ekta Umesh
dc.date.accessioned	2023-09-27T17:21:03Z
dc.date.available	2023-09-27T17:21:03Z
dc.date.issued	2023-09-27
dc.date.submitted	2023
dc.description	Thesis (Ph.D.)--University of Washington, 2023
dc.description.abstract	Object recognition is an essential component of visual perception tasks that help robots build a semantic-level understanding of their environment. Although deep learning methods achieve extraordinary recognition performance in previously seen environments, they are insufficient for deployment in complex and continually changing environments due to their sensitivity to environmental variations. To realize the goal of long-term autonomy in robots, we need perception methods that go beyond statistical correlation. Therefore, this dissertation focuses on developing robust object recognition methods using topological methods and human-like reasoning mechanisms. We begin by using topologically persistent features, which capture the objects' 2D shape information for recognition in unseen environments. In particular, we present two kinds of representations, namely, sparse persistence image (PI) and amplitude, computed by applying persistent homology to multi-directional height function-based filtrations (nested sequences of cubical complexes) representing the objects' segmentation maps. Using a benchmark dataset, we demonstrate that sparse PI features show better recognition performance in unseen environments than the features from widely-used deep learning-based feature extractors. On a new dataset, the UW Indoor Scenes (UW-IS) dataset, designed to test object recognition performance in unseen environments, the performance of sparse PI features remains relatively unchanged in an unseen test environment, unlike a state-of-the-art domain-adaptive object detection method. Next, we propose a new descriptor, TOPS, to capture the 3D shape information of point clouds generated from depth images, and an accompanying recognition framework, THOR, inspired by human reasoning.
The descriptor employs a novel slicing-based approach to compute topological features from filtrations of simplicial complexes using persistent homology, and facilitates reasoning-based recognition using object unity. Apart from a benchmark dataset, we report performance on a new dataset, the UW Indoor Scenes (UW-IS) Occluded dataset, curated using commodity hardware to reflect real-world scenarios with different environmental conditions and degrees of object occlusion. THOR outperforms state-of-the-art methods on both datasets and achieves substantially higher recognition accuracy for all the scenarios of the UW-IS Occluded dataset. Subsequently, we extend the TOPS descriptor to incorporate object color information via color embeddings and obtain the TOPS2 descriptor. The color embeddings are computed by leveraging the similarity and connectivity between colors in a color network generated using the Mapper algorithm. The accompanying THOR2 framework, trained entirely on synthetic RGB-D images of unoccluded objects, shows considerable performance improvements over the shape-based THOR framework on both the OCID and UW-IS Occluded datasets. THOR2 also achieves substantially higher accuracy than a state-of-the-art vision transformer adapted for RGB-D object recognition on the OCID and UW-IS Occluded datasets, regardless of the camera orientation and environmental conditions, respectively. Therefore, the approaches presented in this work, which have also been successfully implemented on a low-cost robot, lay the foundation for achieving robust object recognition in unseen environments using computational topology tools.
dc.embargo.terms	Open Access
dc.format.mimetype	application/pdf
dc.identifier.other	Samani_washington_0250E_26118.pdf
dc.identifier.uri	http://hdl.handle.net/1773/50871
dc.language.iso	en_US
dc.rights	none
dc.subject	AI-enabled Robotics
dc.subject	Object Recognition
dc.subject	RGB-D Perception
dc.subject	Topological Data Analysis
dc.subject	Mechanical engineering
dc.subject	Robotics
dc.subject.other	Mechanical engineering
dc.title	Topological Representations for Visual Object Recognition in Unseen Indoor Environments
dc.type	Thesis

Files

Original bundle

Name: Samani_washington_0250E_26118.pdf
Size: 33.08 MB
Format: Adobe Portable Document Format

Collections

Mechanical engineering