MRL-AdANNS: Matryoshka Representation Learning for Web-Scale Adaptive Semantic Search
| dc.contributor.advisor | Farhadi, Ali | |
| dc.contributor.advisor | Shapiro, Linda | |
| dc.contributor.author | Rege, Aniket | |
| dc.date.accessioned | 2023-08-14T17:04:29Z | |
| dc.date.available | 2023-08-14T17:04:29Z | |
| dc.date.issued | 2023-08-14 | |
| dc.date.submitted | 2023 | |
| dc.description | Thesis (Master's)--University of Washington, 2023 | |
| dc.description.abstract | Learned representations are essential in modern ML systems, but they often struggle to adapt to the varying capacity requirements of downstream tasks. In this thesis, we propose Matryoshka Representation Learning (MRL) [64] to address this challenge: MRL learns coarse-to-fine representations with minimal changes to existing representation learning frameworks and no additional training or inference cost. MRL matches the accuracy and robustness of independently trained low-dimensional representations, with benefits such as up to 14× smaller ImageNet-1K embeddings and 14× speed-ups for large-scale retrieval. It extends seamlessly to web-scale datasets (ImageNet, JFT) across vision (ResNet, ViT), language (BERT), and vision-and-language (ALIGN) modalities. In modern web-scale search systems, rigid high-dimensional representations are learned via a deep encoder and hooked into an approximate nearest neighbor search (ANNS) pipeline to retrieve similar data points. These rigid representations are computationally expensive to search over and inflexible in compute-constrained environments. To overcome this, we introduce the novel AdANNS framework [92], which leverages the flexibility of matryoshka representations at each stage of the ANNS pipeline to provide compute-aware elastic search. We demonstrate state-of-the-art accuracy-compute trade-offs with AdANNS-based versions of key ANNS building blocks: search data structures (AdANNS-IVF) [102] and quantization (AdANNS-OPQ) [29]. For example, on ImageNet retrieval, AdANNS-IVF is up to 1.5% more accurate than IVF built on rigid representations [102] at the same compute budget, and matches its accuracy while being up to 90× faster in wall-clock time. On Natural Questions, 32-byte AdANNS-OPQ matches the accuracy of the 64-byte OPQ baseline [29] constructed on rigid representations, i.e., the same accuracy at half the cost. We further show that the gains from AdANNS carry over to modern composite ANNS indices that combine search structures and quantization. Finally, we demonstrate that AdANNS enables inference-time adaptivity for compute-aware search on ANNS indices built non-adaptively on matryoshka representations. The code is open-sourced at https://github.com/RAIVNLab/MRL and https://github.com/RAIVNLab/AdANNS. | |
| dc.embargo.terms | Open Access | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.other | Rege_washington_0250O_25349.pdf | |
| dc.identifier.uri | http://hdl.handle.net/1773/50379 | |
| dc.language.iso | en_US | |
| dc.rights | CC BY | |
| dc.subject | Classification | |
| dc.subject | Computer Vision | |
| dc.subject | Deep Learning | |
| dc.subject | Representation Learning | |
| dc.subject | Retrieval | |
| dc.subject | Search | |
| dc.subject | Electrical engineering | |
| dc.subject | Computer science | |
| dc.subject | Computer engineering | |
| dc.subject.other | Electrical and computer engineering | |
| dc.title | MRL-AdANNS: Matryoshka Representation Learning for Web-Scale Adaptive Semantic Search | |
| dc.type | Thesis |
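The abstract's two central ideas can be sketched in a few lines of NumPy: (1) a matryoshka embedding's first d dimensions are themselves a usable embedding, so one trained vector serves many compute budgets; and (2) the AdANNS-IVF idea of building the coarse IVF clustering on a cheap low-dimensional prefix while ranking candidates with a higher-dimensional prefix. This is an illustrative toy under invented dimensions, cluster counts, and random data, not the thesis implementation; `prefix`, `ivf_search`, and all constants below are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
FULL_DIM, N, K = 64, 1000, 16      # embedding dim, database size, IVF clusters

def prefix(emb, d):
    """Truncate to the nested d-dim matryoshka prefix and re-normalize."""
    p = emb[..., :d]
    return p / np.linalg.norm(p, axis=-1, keepdims=True)

db = rng.normal(size=(N, FULL_DIM))
query = db[42] + 0.01 * rng.normal(size=FULL_DIM)   # near-duplicate of item 42

# Exhaustive cosine search on the full embedding, as an accuracy reference.
exact = int(np.argmax(prefix(db, FULL_DIM) @ prefix(query, FULL_DIM)))

# Build: cluster the database on a cheap 8-d prefix (the AdANNS-IVF idea).
D_COARSE, D_FINE = 8, FULL_DIM
centroids = prefix(db[rng.choice(N, K, replace=False)], D_COARSE)
assign = np.argmax(prefix(db, D_COARSE) @ centroids.T, axis=1)

# Search: route the query with the coarse prefix, rank with the fine prefix.
def ivf_search(q):
    bucket = int(np.argmax(centroids @ prefix(q, D_COARSE)))
    ids = np.flatnonzero(assign == bucket)            # candidates in one cluster
    scores = prefix(db[ids], D_FINE) @ prefix(q, D_FINE)
    return int(ids[np.argmax(scores)])

result = ivf_search(query)
```

Only one cluster is scanned at fine granularity, so per-query cost drops roughly by a factor of K relative to the exhaustive scan; the thesis's point is that the coarse routing can also run at a fraction of the full dimensionality without retraining a separate low-d encoder.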
Files
Original bundle
- Name: Rege_washington_0250O_25349.pdf
- Size: 7.07 MB
- Format: Adobe Portable Document Format
