Intelligence Through the Lens of Interaction

Authors

Ehsani, Kiana

Abstract

In this thesis, I discuss the problem of acquiring visual intelligence from interaction, focusing on two aspects of visual understanding: (1) visual perception and (2) embodied intelligence. To address the first aspect, I designed experiments to learn visual representations by observing animals and humans interact with the visual world. Further, I investigated the idea of learning perception from hands-on interaction: acquiring generalizable physical understanding by predicting the forces applied in an observed video and trying to replicate the observed motion in simulation, with no additional supervision. To address the second aspect, I discuss our findings on training intelligent embodied agents using interaction from two perspectives. First, I designed a training paradigm that enables learning to learn from interactions, so that an agent can continue to learn from its interactions even at inference time. Second, I introduce a visually rich object manipulation framework, ManipulaTHOR, which makes it possible to directly train embodied agents to interact intelligently in a physically realistic environment via low-level object manipulation and navigation.

Description

Thesis (Ph.D.)--University of Washington, 2021