Programming Abstractions and Efficient Compilation Techniques for Modern FPGAs
Loading...
Date
Authors
Vega, Luis
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
Modern field-programmable gate arrays (FPGAs) have recently powered high-profile efficiency gains in systems from datacenters to embedded devices by offering ensembles of heterogeneous, reconfigurable hardware units. Programming stacks for FPGAs, however, are stuck in the past — they are based on traditional hardware languages, which were appropriate when FPGAs were simple, homogeneous fabrics of basic programmable lookup tables. Nowadays, FPGAs are highly heterogeneous architectures that support a wide variety of compute operations such as scalar, vector, fused, and floating-point arithmetic together with different kinds of programmable memories. Unfortunately, the behavioral semantics available in hardware languages today cannot effectively capture these architectural advances, resulting in inefficient programs that are missing all the benefits of specialization. An example of this abstraction gap is that vector operations cannot be described behaviorally for targeting vector hardware (SIMD) available in modern FPGAs. This thesis describes Reticle, a new low-level abstraction for FPGA programming that, unlike existing languages, explicitly represents the special-purpose units available on a particular FPGA device. Reticle has two levels: a portable intermediate language and a target-specific assembly language. The design goal of the intermediate language is to describe behavior, while the assembly language aims for layout. Furthermore, I demonstrate how to lower intermediate programs to assembly programs, using instruction selection which can be both faster and deterministic compared to existing technology mapping approaches. I use Reticle to implement compute centric benchmarks, such as linear algebra operators and coroutines, and find that Reticle compilation runs up to 100 times faster than current approaches while producing comparable or better run-time and utilization. Additionally, I show how using Reticle’s memory instructions can lead to 5x performance improvement on an existing encryption application (AES).
Description
Thesis (Ph.D.)--University of Washington, 2022
