Talking to Robots: Learning to Ground Human Language in Perception and Execution
Advances in computation, sensing, and hardware are enabling robots to perform an increasing variety of tasks under progressively fewer constraints. It is now possible to imagine robots that can operate in traditionally human-centric environments. However, such robots need the flexibility to take instructions and learn about tasks from nonspecialists using language and other natural modalities. At the same time, physically grounded settings provide exciting opportunities for language learning. This thesis describes work on learning to acquire language for human-robot interaction in a physically grounded space. Two use cases are considered: learning to follow route directions through an indoor map, and learning about object attributes from people using unconstrained language and gesture. These problems are challenging because both language and real-world sensing tend to be noisy and ambiguous. The work addresses this challenge by reasoning and learning jointly about language and its physical context, parsing language into intermediate formal representations that robotic systems can interpret meaningfully. The resulting systems can learn how to follow natural language directions through a map and how to identify objects from human descriptions, even when the underlying concepts are novel to the system, with success rates comparable to or defining the state of the art. Evaluations show that this work takes important steps towards building a robust, flexible, and effective mechanism for bringing together language acquisition and sensing to learn about the world.