Automated Generation of Domain Specific Kernels

dc.contributor.advisor: Ceze, Luis
dc.contributor.author: Cowan, Meghan
dc.date.accessioned: 2021-08-26T18:08:37Z
dc.date.issued: 2021-08-26
dc.date.submitted: 2021
dc.description: Thesis (Ph.D.)--University of Washington, 2021
dc.description.abstract: Seamless gains in performance from technology scaling are coming to an end, but many applications rely on hardware and its compilation stacks to continue improving performance and efficiency. In order to keep up with application compute demands, emerging hardware is becoming more diverse, specialized, and complex. New hardware and accelerators expose programming models that offer great potential performance, but are often more restrictive and difficult to program. Oftentimes, even traditional compilers struggle to generate efficient programs for these new programming models, leading to a proliferation of domain specific libraries composed of hand-optimized kernels that are meticulously tuned to take advantage of the target hardware and avoid any bottlenecks. This thesis argues that we can automatically generate and optimize programs by building domain specific tools that search for efficient code. I show how we can apply two search-based methods, program synthesis and autotuning, to take advantage of application specific optimizations such as efficiently utilizing the vector parallelism present in many programming models. In this dissertation, I demonstrate how we can optimize programs in machine learning and Homomorphic Encryption (HE) using search-based methods. First, I show how we can extend existing machine learning compilers to efficiently deploy unconventional neural networks, such as ultra quantized networks, by augmenting core operators with synthesized vector code and intelligently tuned memory access behavior. Then, I present Porcupine, a synthesizing compiler for vectorized HE programs. Porcupine synthesizes vectorized code from a plaintext scalar reference program, and performs instruction selection and scheduling for HE's unique performance model. Together, these two systems show how we can use search to both tune and discover optimizations in programs.
dc.embargo.lift: 2022-08-26T18:08:37Z
dc.embargo.terms: Restrict to UW for 1 year -- then make Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Cowan_washington_0250E_22809.pdf
dc.identifier.uri: http://hdl.handle.net/1773/47421
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: Computer science
dc.subject.other: Computer science and engineering
dc.title: Automated Generation of Domain Specific Kernels
dc.type: Thesis

Files

Original bundle

Name: Cowan_washington_0250E_22809.pdf
Size: 8.44 MB
Format: Adobe Portable Document Format