Data-Constrained Model Compression

Horton, Maxwell Christian

dc.contributor.advisor	Farhadi, Ali
dc.contributor.advisor	Rastegari, Mohammad
dc.contributor.author	Horton, Maxwell Christian
dc.date.accessioned	2022-07-14T22:08:09Z
dc.date.available	2022-07-14T22:08:09Z
dc.date.issued	2022-07-14
dc.date.submitted	2022
dc.description	Thesis (Ph.D.)--University of Washington, 2022
dc.description.abstract	In recent years, strong progress has been made in compressing compute-heavy machine learning models to enable them to execute in real-time on edge devices. Typically, model compression techniques require retraining a model on the original dataset of interest. This is problematic if the original dataset is unavailable due to privacy or legal concerns, or if the model to be compressed was obtained from a third party. We explore the challenges associated with compressing a model in three different data-constrained scenarios. In the first scenario, labels are unavailable. We approach this problem through knowledge distillation, training a smaller model using predictions made by a larger model on unlabeled data. In the second scenario, both data and labels are unavailable. We approach this problem by separately compressing every layer of a pretrained model to obtain a compressed approximation of the original model. Our method is computationally efficient, achieving strong compression rates while maintaining accuracy. In the third scenario, we explore the problem of dynamic, real-time compression after model deployment. We demonstrate a training technique in which we condition a model to achieve high accuracy across a variety of compression levels, allowing for efficient, real-time model selection along the efficiency-accuracy trade-off curve after model deployment. We present these works to elucidate the challenges associated with data-constrained model compression, and to provide solutions for compressing models in these challenging scenarios.
dc.embargo.terms	Open Access
dc.format.mimetype	application/pdf
dc.identifier.other	Horton_washington_0250E_24106.pdf
dc.identifier.uri	http://hdl.handle.net/1773/48886
dc.language.iso	en_US
dc.rights	none
dc.subject	compression
dc.subject	deep learning
dc.subject	edge computing
dc.subject	machine learning
dc.subject	pruning
dc.subject	quantization
dc.subject	Computer science
dc.subject.other	Computer science and engineering
dc.title	Data-Constrained Model Compression
dc.type	Thesis

Files

Original bundle

Name: Horton_washington_0250E_24106.pdf
Size: 3.16 MB
Format: Adobe Portable Document Format

Collections

Computer science and engineering