On Fine-Tuning Submodular Functions for Data Subset Selection

dc.contributor.advisor: Bilmes, Jeffrey
dc.contributor.author: Bhalerao, Megh Manoj
dc.date.accessioned: 2024-09-09T23:08:18Z
dc.date.issued: 2024-09-09
dc.date.submitted: 2024
dc.description: Thesis (Master's)--University of Washington, 2024
dc.description.abstract: We demonstrate that submodular functions, with fine-tuned hyperparameters, serve as extremely effective data subset (i.e., summary) selectors, better than the current state-of-the-art, for training machine learning systems on data subsets. To search and reduce the hyperparameter space, we introduce meta-summarization, a technique designed to improve the computational efficiency of hyperparameter tuning. Starting from a large set of generated summary candidates, meta-summarization chooses a subset of summaries based on their inter-summary diversity. This significantly reduces the number of summaries that must be trained on, compared with training on all candidates. As a result, meta-summarization finds the best-performing hyperparameters for a submodular function faster than other hyperparameter search techniques, significantly reducing computation and time. We demonstrate that summaries generated using fine-tuned submodular functions outperform subset selection benchmarks such as DC-Bench (by ≈ 3% absolute) and DeepCore (by ≈ 2% absolute). Fine-tuned submodular functions also outperform random and state-of-the-art k-means-based subset selection for training a popular ViT-based (vision transformer) architecture, DaViT, on ImageNet, thus setting a new state-of-the-art for supervised subset selection.
dc.embargo.lift: 2025-09-09T23:08:18Z
dc.embargo.terms: Restrict to UW for 1 year -- then make Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Bhalerao_washington_0250O_26698.pdf
dc.identifier.uri: https://hdl.handle.net/1773/51966
dc.language.iso: en_US
dc.rights: none
dc.subject: Data Subset Selection
dc.subject: Deep Learning
dc.subject: Machine Learning
dc.subject: Submodular Functions
dc.subject: Computer science
dc.subject.other: Electrical and computer engineering
dc.title: On Fine-Tuning Submodular Functions for Data Subset Selection
dc.type: Thesis
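
The abstract describes selecting data subsets (summaries) by maximizing a submodular function. As a rough illustration only, the sketch below applies the standard greedy algorithm to a facility-location objective, f(S) = sum over i of max over j in S of sim(i, j). This record does not say which submodular functions or hyperparameters the thesis uses, so the choice of facility location, the Gaussian similarity, and every name here (`facility_location_greedy`, `gaussian_similarity`, `points`) are assumptions made for the example.

```python
import math


def gaussian_similarity(points):
    """Pairwise similarity exp(-squared distance); values lie in (0, 1]."""
    return [
        [math.exp(-sum((a - b) ** 2 for a, b in zip(p, q))) for q in points]
        for p in points
    ]


def facility_location_greedy(similarity, budget):
    """Greedily pick up to `budget` indices maximizing
    f(S) = sum_i max_{j in S} similarity[i][j].

    Assumes non-negative similarities (hypothetical helper, not the
    thesis's implementation).
    """
    n = len(similarity)
    selected = []
    # best_cover[i]: similarity of item i to its closest selected item so far
    best_cover = [0.0] * n
    for _ in range(min(budget, n)):
        best_gain, best_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain of adding j to the current selection
            gain = sum(
                max(similarity[i][j] - best_cover[i], 0.0) for i in range(n)
            )
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        best_cover = [
            max(best_cover[i], similarity[i][best_j]) for i in range(n)
        ]
    return selected


# Toy data: two well-separated clusters of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
selected = facility_location_greedy(gaussian_similarity(points), budget=2)
```

With a budget of two, the greedy pass picks one representative from each cluster, because a second pick from an already-covered cluster adds little marginal gain. The same diversity-seeking behavior is what the abstract relies on when it chooses a diverse subset of candidate summaries, although the thesis's actual procedure may differ from this sketch.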

Files

Original bundle

Name: Bhalerao_washington_0250O_26698.pdf
Size: 2.66 MB
Format: Adobe Portable Document Format