Programming by demonstration: a machine learning approach
Programming by demonstration (PBD) enables users to construct programs to automate repetitive tasks without writing a line of code. The key idea in PBD is to generalize from the user's demonstration of the program on a concrete example to a robust program that will work in new situations. Previous approaches to PBD have employed heuristic, domain-specific algorithms to generalize from a small number of examples. In this thesis, we formalize programming by demonstration as a machine learning problem: given the changes in the application state that result from the user's demonstrated actions, learn the sequence of instructions that map from one application state to the next. We propose a domain-independent machine learning approach to PBD that is capable of learning useful programs from a small number of examples. This approach addresses two difficult questions: (1) How do we construct the search space of possible program statements? (2) How do we search this large space efficiently?

Our solution is based on the concept of version space algebra. Mitchell formalized concept learning as a search through a version space of hypotheses consistent with the examples. Concept learning may be thought of as learning functions that map from an instance to a binary classification. In this work, we extend version spaces to apply to complex functions: functions that map from one complex object to another. We then present version space algebra, a means for combining several small spaces in order to construct complex version spaces. To illustrate the approach, we describe the SMARTedit programming by demonstration system for learning repetitive text-editing programs. SMARTedit is capable of learning useful programs from as little as a single training example. Finally, we generalize programming by demonstration to the broader problem of learning programs with loops and conditionals from traces of their execution behavior.
We demonstrate this generalization with SMARTpython, a system that learns such programs from traces of their execution.
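
As a rough illustration of the version space idea described above, the following Python sketch learns a cursor-movement function from demonstrations. It is not SMARTedit's implementation: the names `Hypothesis`, `VersionSpace`, `union`, and the three space builders are hypothetical, the hypothesis families are chosen only for illustration, and the version space is stored as an explicit list of hypotheses rather than in a more compact representation. Each hypothesis maps an application state (text and cursor position) to a new cursor position, each demonstration discards the hypotheses inconsistent with it, and a union operator combines small spaces into a larger one.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

State = Tuple[str, int]  # (text, cursor position); an assumed, simplified state


@dataclass(frozen=True)
class Hypothesis:
    """One candidate function from a state to a new cursor position."""
    name: str
    fn: Callable[[str, int], Optional[int]]  # None means "not applicable"


class VersionSpace:
    """The set of hypotheses consistent with every example seen so far."""

    def __init__(self, hypotheses: Iterable[Hypothesis]):
        self.hypotheses: List[Hypothesis] = list(hypotheses)

    def update(self, state: State, new_pos: int) -> None:
        """Keep only hypotheses that reproduce the demonstrated move."""
        text, pos = state
        self.hypotheses = [h for h in self.hypotheses if h.fn(text, pos) == new_pos]

    def execute(self, state: State) -> List[int]:
        """Return the distinct positions the surviving hypotheses predict."""
        text, pos = state
        results = {h.fn(text, pos) for h in self.hypotheses}
        return sorted(r for r in results if r is not None)


def union(*spaces: VersionSpace) -> VersionSpace:
    """Combine several small spaces into one larger space."""
    return VersionSpace(h for s in spaces for h in s.hypotheses)


def absolute_space(max_pos: int) -> VersionSpace:
    # "Move to column k", valid only if k lies within the text.
    return VersionSpace(
        Hypothesis(f"goto({k})", lambda t, p, k=k: k if k <= len(t) else None)
        for k in range(max_pos + 1)
    )


def relative_space(max_offset: int) -> VersionSpace:
    # "Move d characters from the current position".
    return VersionSpace(
        Hypothesis(
            f"move({d:+d})",
            lambda t, p, d=d: p + d if 0 <= p + d <= len(t) else None,
        )
        for d in range(-max_offset, max_offset + 1)
    )


def after_token_space(tokens: Iterable[str]) -> VersionSpace:
    # "Move to just after the next occurrence of a token".
    def make(tok: str) -> Hypothesis:
        def fn(t: str, p: int) -> Optional[int]:
            i = t.find(tok, p)
            return i + len(tok) if i >= 0 else None
        return Hypothesis(f"after({tok!r})", fn)
    return VersionSpace(make(tok) for tok in tokens)


vs = union(absolute_space(10), relative_space(10), after_token_space([": ", ","]))

# First demonstration: in "name: Alice" the user moves the cursor from 0 to 6.
vs.update(("name: Alice", 0), 6)
after_one = sorted(h.name for h in vs.hypotheses)

# Second demonstration: in "id: 42" the user moves the cursor from 0 to 4.
vs.update(("id: 42", 0), 4)
after_two = sorted(h.name for h in vs.hypotheses)

print(after_one)
print(after_two)
print(vs.execute(("city: Oslo", 0)))
```

In this toy setting, three hypotheses remain after the first demonstration (an absolute move, a relative move, and a token-based move), and the second demonstration leaves only the token-based one. Explicit enumeration is workable here only because the spaces are tiny; it is a simplification made for readability.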