Topological Representations for Visual Object Recognition in Unseen Indoor Environments

dc.contributor.advisor: Banerjee, Ashis G
dc.contributor.author: Samani, Ekta Umesh
dc.date.accessioned: 2023-09-27T17:21:03Z
dc.date.available: 2023-09-27T17:21:03Z
dc.date.issued: 2023-09-27
dc.date.submitted: 2023
dc.description: Thesis (Ph.D.)--University of Washington, 2023
dc.description.abstract: Object recognition is an essential component of visual perception tasks that help robots build a semantic-level understanding of their environment. Although deep learning methods achieve extraordinary recognition performance in previously seen environments, they are insufficient for deployment in complex and continually changing environments due to their sensitivity to environmental variations. To realize the goal of long-term autonomy in robots, we need perception methods that go beyond statistical correlation. Therefore, this dissertation focuses on developing robust object recognition methods using topological methods and human-like reasoning mechanisms. We begin by using topologically persistent features, which capture the objects' 2D shape information, for recognition in unseen environments. In particular, we present two kinds of representations, namely, sparse persistence image (PI) and amplitude, computed by applying persistent homology to multi-directional height function-based filtrations (nested sequences of cubical complexes) representing the objects' segmentation maps. Using a benchmark dataset, we demonstrate that sparse PI features show better recognition performance in unseen environments than the features from widely used deep learning-based feature extractors. On a new dataset, the UW Indoor Scenes (UW-IS) dataset, designed to test object recognition performance in unseen environments, the performance of sparse PI features remains relatively unchanged in an unseen test environment, unlike that of a state-of-the-art domain-adaptive object detection method. Next, we propose a new descriptor, TOPS, to capture the 3D shape information of point clouds generated from depth images, and an accompanying recognition framework, THOR, inspired by human reasoning. The descriptor employs a novel slicing-based approach to compute topological features from filtrations of simplicial complexes using persistent homology, and facilitates reasoning-based recognition using object unity. In addition to a benchmark dataset, we report performance on a new dataset, the UW Indoor Scenes (UW-IS) Occluded dataset, curated using commodity hardware to reflect real-world scenarios with different environmental conditions and degrees of object occlusion. THOR outperforms state-of-the-art methods on both datasets and achieves substantially higher recognition accuracy for all the scenarios of the UW-IS Occluded dataset. Subsequently, we extend the TOPS descriptor to incorporate object color information via color embeddings and obtain the TOPS2 descriptor. The color embeddings are computed by leveraging the similarity and connectivity between colors in a color network generated using the Mapper algorithm. The accompanying THOR2 framework, trained entirely on synthetic RGB-D images of unoccluded objects, shows considerable performance improvements over the shape-based THOR framework on both the OCID and UW-IS Occluded datasets. THOR2 also achieves substantially higher accuracy than a state-of-the-art vision transformer adapted for RGB-D object recognition on the OCID and UW-IS Occluded datasets, regardless of the camera orientation and environmental conditions, respectively. Therefore, the approaches presented in this work, which have also been successfully implemented on a low-cost robot, lay the foundation for achieving robust object recognition in unseen environments using computational topology tools.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Samani_washington_0250E_26118.pdf
dc.identifier.uri: http://hdl.handle.net/1773/50871
dc.language.iso: en_US
dc.rights: none
dc.subject: AI-enabled Robotics
dc.subject: Object Recognition
dc.subject: RGB-D Perception
dc.subject: Topological Data Analysis
dc.subject: Mechanical engineering
dc.subject: Robotics
dc.subject.other: Mechanical engineering
dc.title: Topological Representations for Visual Object Recognition in Unseen Indoor Environments
dc.type: Thesis

Files

Original bundle

Name: Samani_washington_0250E_26118.pdf
Size: 33.08 MB
Format: Adobe Portable Document Format