Visual Forecasting for Interactive Embodied Agent

dc.contributor.advisor: Farhadi, Ali
dc.contributor.advisor: Mottaghi, Roozbeh
dc.contributor.author: Zeng, Kuo-Hao
dc.date.accessioned: 2023-04-17T18:03:02Z
dc.date.available: 2023-04-17T18:03:02Z
dc.date.issued: 2023-04-17
dc.date.submitted: 2023
dc.description: Thesis (Ph.D.)--University of Washington, 2023
dc.description.abstract: A hallmark of human intelligence is the ability to plan by predicting the future. Equipping artificial agents with this capability is essential in many fields, especially when agents must interact with a dynamic, uncertain environment. This thesis explores how to integrate visual forecasting models into embodied agents. First, we show how to plan efficiently based on trajectory forecasting. Specifically, we develop a drone agent that plays catch in a simulated environment. The agent forecasts the trajectory of the object of interest and plans accordingly, using a model-predictive controller (MPC) with a learnable action sampler to make planning efficient. Second, we address interactive visual navigation, in which the agent must accomplish tasks by changing the environment. We propose a Neural Interaction Engine (NIE) that enables action-centric anticipation of object states. The agent bases its decisions on the NIE's one-step-ahead, action-dependent predictions of object state, which allows it to solve tasks more effectively. Finally, we use visual forecasting to adapt agents to unexpected action drifts. To this end, we introduce the Action Adaptive Policy (AAP), which enables agents to adapt to unseen drifts at inference. The key idea is to learn and leverage an action-impact embedding using a visual forecasting formulation. In this way, the agent learns to encode the action-impact embedding on the fly to adapt to unseen drifts. The experimental results show that our approach consistently performs better across unseen drifts and works well even when some actions are disabled at inference.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Zeng_washington_0250E_25247.pdf
dc.identifier.uri: http://hdl.handle.net/1773/49876
dc.language.iso: en_US
dc.rights: none
dc.subject: Embodied AI
dc.subject: Learning through Interaction
dc.subject: Reinforcement Learning
dc.subject: Visual Forecasting
dc.subject: Visual Reasoning
dc.subject: Artificial intelligence
dc.subject: Robotics
dc.subject: Computer science
dc.subject.other: Computer science and engineering
dc.title: Visual Forecasting for Interactive Embodied Agent
dc.type: Thesis

Files

Original bundle

Name: Zeng_washington_0250E_25247.pdf
Size: 12.11 MB
Format: Adobe Portable Document Format