New Algorithmic Tools for Distributed Similarity Search and Edge Estimation

dc.contributor.advisor: Beame, Paul
dc.contributor.author: Rashtchian, Cyrus
dc.date.accessioned: 2018-07-31T21:11:05Z
dc.date.available: 2018-07-31T21:11:05Z
dc.date.issued: 2018-07-31
dc.date.submitted: 2018
dc.description: Thesis (Ph.D.)--University of Washington, 2018
dc.description.abstract: We present several foundational results on computational questions related to similarity search, clustering, and parameter estimation. The problems center around the theme of improving algorithms by utilizing geometric or graphical structure. Some contributions include:
- Improved upper and lower bounds for computing a similarity join under Hamming distance in a simultaneous distributed model. The core of our analysis involves novel connections between similarity joins and extremal graph theory.
- An edge-isoperimetric inequality for powers of the binary hypercube. The insights here help us to develop new similarity join algorithms that are nearly-optimal for a theoretical MapReduce model.
- A distributed clustering algorithm for edit distance, with applications to DNA data storage. By using random structure found in real datasets, we achieve new hashing, embedding, and convergence results for an otherwise challenging clustering problem.
- The first polylogarithmic query algorithm for estimating the number of edges in a graph using a natural graph query. Our randomized, adaptive algorithm uses bipartite independent set queries to quickly learn an unknown graph.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Rashtchian_washington_0250E_18811.pdf
dc.identifier.uri: http://hdl.handle.net/1773/42267
dc.language.iso: en_US
dc.rights: CC BY-NC-ND
dc.subject: Clustering
dc.subject: DNA Data Storage
dc.subject: Edge-Isoperimetric
dc.subject: Independent Set
dc.subject: MapReduce
dc.subject: Similarity Search
dc.subject: Computer science
dc.subject: Mathematics
dc.subject.other: Computer science and engineering
dc.title: New Algorithmic Tools for Distributed Similarity Search and Edge Estimation
dc.type: Thesis

Files

Original bundle

Name: Rashtchian_washington_0250E_18811.pdf
Size: 2.12 MB
Format: Adobe Portable Document Format