Farhadi, Ali
Mottaghi, Roozbeh
Zeng, Kuo-Hao
2023-04-17
2023-04-17
2023-04-17
2023
Zeng_washington_0250E_25247.pdf
http://hdl.handle.net/1773/49876
Thesis (Ph.D.)--University of Washington, 2023
A hallmark of human intelligence is the ability to plan by predicting the future. Equipping artificial agents with this capability is essential in many fields, especially when the agents must interact with a dynamic, uncertain environment. This thesis explores how to integrate visual forecasting models into embodied agents. First, we show how to plan efficiently based on trajectory forecasting. Specifically, we develop a drone agent that plays catch in a simulated environment. The agent forecasts the trajectory of the object of interest and plans accordingly, and its policy achieves efficient planning by using a model-predictive controller (MPC) with a learnable action sampler. Second, we show how to achieve effective interactive visual navigation, in which the agent must accomplish tasks by changing the environment. We propose a Neural Interaction Engine (NIE) to enable action-centric object state anticipation. The agent makes decisions based on the one-step-ahead, action-dependent object state predicted by the NIE, which allows it to solve tasks more effectively. Finally, we use visual forecasting to adapt agents to unexpected action drifts. To this end, we introduce the Action Adaptive Policy (AAP), which enables agents to adapt to unseen drifts at inference. The key idea is to learn and leverage an action-impact embedding using a visual forecasting formulation. In this way, the agent learns to encode the action-impact embedding on the fly to adapt to unseen drifts. The experimental results show that our approach consistently performs better across unseen drifts and even works well when some actions are disabled at inference.
application/pdf
en-US
none
Embodied AI
Learning through Interaction
Reinforcement Learning
Visual Forecasting
Visual Reasoning
Artificial intelligence
Robotics
Computer science
Computer science and engineering
Visual Forecasting for Interactive Embodied Agent
Thesis