Active Learning and Submodular Functions
Guillory, Andrew Russell
Active learning is a machine learning setting where the learning algorithm decides what data is labeled. Submodular functions are a class of set functions for which many optimization problems have efficient exact or approximate algorithms. We examine their connections.

1. We propose a new class of interactive submodular optimization problems which connect and generalize submodular optimization and active learning over a finite query set. We derive greedy algorithms with approximately optimal worst-case cost. These analyses apply to exact learning, approximate learning, learning in the presence of adversarial noise, and applications that mix learning and covering.

2. We consider active learning in a batch, transductive setting where the learning algorithm selects a set of examples to be labeled at once. In this setting we derive new error bounds which use symmetric submodular functions for regularization, and we give algorithms which approximately minimize these bounds.

3. We consider a repeated active learning setting where the learning algorithm solves a sequence of related learning problems. We propose an approach to this problem based on a new online prediction version of submodular set cover.

A common theme in these results is the use of tools from submodular optimization to extend the breadth and depth of learning theory with an emphasis on non-stochastic settings.
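As background for the greedy algorithms in the first contribution above, the following is a minimal sketch of the classical greedy rule for ordinary (non-interactive) submodular set cover. It is not the interactive algorithm developed in the thesis. The function name `greedy_submodular_cover`, the toy coverage objective, and the unit costs are all assumptions chosen for illustration.

```python
from typing import Callable, FrozenSet, Hashable, List, Sequence


def greedy_submodular_cover(
    ground_set: Sequence[Hashable],
    f: Callable[[FrozenSet[Hashable]], float],
    cost: Callable[[Hashable], float],
    target: float,
) -> List[Hashable]:
    """Classical greedy rule (illustrative): repeatedly add the element with
    the largest truncated marginal gain per unit cost until f(S) >= target.
    Assumes f is monotone submodular and every cost is positive."""
    chosen: List[Hashable] = []
    current: FrozenSet[Hashable] = frozenset()
    remaining = list(ground_set)  # list keeps tie-breaking deterministic
    while f(current) < target:
        base = min(f(current), target)
        best, best_ratio = None, 0.0
        for x in remaining:
            # Truncate at the target so gains beyond it are not rewarded.
            gain = min(f(current | {x}), target) - base
            ratio = gain / cost(x)
            if ratio > best_ratio:  # strict: first maximizer wins ties
                best, best_ratio = x, ratio
        if best is None:
            raise ValueError("target is not reachable with remaining elements")
        chosen.append(best)
        current = current | {best}
        remaining.remove(best)
    return chosen


# Toy example (hypothetical data): cover items 1..6 with named subsets.
subsets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}, "d": {1, 6}}


def coverage(selection: FrozenSet[Hashable]) -> float:
    """Number of items covered by the selected subsets (monotone submodular)."""
    covered = set()
    for name in selection:
        covered |= subsets[name]
    return float(len(covered))


picked = greedy_submodular_cover(list(subsets), coverage, lambda _: 1.0, 6.0)
print(picked)
```

In this toy instance the objective counts covered items, so the target of 6 corresponds to covering everything. In the interactive setting described above, the objective would additionally depend on responses received for each query, which this sketch does not model.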