Automated Generation of Domain Specific Kernels
| dc.contributor.advisor | Ceze, Luis | |
| dc.contributor.author | Cowan, Meghan | |
| dc.date.accessioned | 2021-08-26T18:08:37Z | |
| dc.date.issued | 2021-08-26 | |
| dc.date.submitted | 2021 | |
| dc.description | Thesis (Ph.D.)--University of Washington, 2021 | |
| dc.description.abstract | Seamless gains in performance from technology scaling are coming to an end, but many applications rely on hardware and their compilation stacks to continue improving performance and efficiency. In order to keep up with application compute demands, emerging hardware is becoming more diverse, specialized, and complex. New hardware and accelerators expose programming models that have great potential performance, but are often more restrictive and difficult to program. Oftentimes, even traditional compilers struggle to generate efficient programs for these new programming models, leading to a proliferation of domain specific libraries comprised of hand-optimized kernels that are meticulously tuned to take advantage of the target hardware and avoid any bottlenecks. This thesis argues that we can automatically generate and optimize programs by building domain specific tools that search for efficient code. I show how we can apply two search-based methods, program synthesis and autotuning, to take advantage of application specific optimizations such as efficiently utilizing vector parallelism present in many programming models. In this dissertation, I demonstrate how we can optimize programs in machine learning and Homomorphic Encryption (HE) using search-based methods. First, I show how we can extend existing machine learning compilers to efficiently deploy unconventional neural networks, such as ultra quantized networks, by augmenting core operators with synthesized vector code and intelligently tuned memory access behavior. Then, I present Porcupine, a synthesizing compiler for vectorized HE programs. Porcupine synthesizes vectorized code from a plaintext scalar reference program, and performs instruction selection and scheduling for HE's unique performance model. Together, these two systems show how we can use search to both tune and discover optimizations in programs. | |
| dc.embargo.lift | 2022-08-26T18:08:37Z | |
| dc.embargo.terms | Restrict to UW for 1 year -- then make Open Access | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.other | Cowan_washington_0250E_22809.pdf | |
| dc.identifier.uri | http://hdl.handle.net/1773/47421 | |
| dc.language.iso | en_US | |
| dc.rights | CC BY | |
| dc.subject | Computer science | |
| dc.subject.other | Computer science and engineering | |
| dc.title | Automated Generation of Domain Specific Kernels | |
| dc.type | Thesis |
Files
Original bundle
- Name: Cowan_washington_0250E_22809.pdf
- Size: 8.44 MB
- Format: Adobe Portable Document Format
