Applications of Discrepancy Theory to Machine Learning
Abstract
In the combinatorial discrepancy theory problem, one is given a base set [n] and a collection of subsets S_1, ..., S_m ⊆ [n] and asked to color the elements of [n] so that each set S_i is as
balanced as possible. This simple set-system based question has spawned a multitude of
generalizations and found many recent applications in various areas of machine learning.
In this dissertation, we introduce the discrepancy problem and its geometric generalization,
the vector balancing problem, and then prove two sets of results about applications of the
discrepancy problem to machine learning. To conclude, we prove a more abstract result
about the vector balancing constant for zonotopes. The first application to machine learning, coresets for kernel density estimators, gives both improved bounds over existing results for a variety of applications of interest and a new chaining-based technique that allows for a more data-driven approach to the problem. The second, quantization of neural networks, is a new application of discrepancy theory that provides improvements over existing algorithmic approaches to the problem. Finally, our results on vector balancing for zonotopes address and nearly resolve an open conjecture, leaving only a log log log d gap.
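For concreteness, balance here can be quantified by the standard notion of discrepancy: for a coloring x : [n] → {-1, +1}, one takes

disc(S_1, ..., S_m) = min over colorings x of max_{i ∈ [m]} |Σ_{j ∈ S_i} x(j)|,

so a low-discrepancy coloring splits every set S_i into nearly equal halves.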
Description
Thesis (Ph.D.)--University of Washington, 2025
