Quantifying the Performance and Resource Usage of HLS4ML's Implementation of the Batch Normalization Layer on FPGAs

Abstract

Field-Programmable Gate Arrays (FPGAs) are a powerful platform for hardware implementations of machine learning algorithms. Building such models is time-consuming and requires expertise in hardware design and in writing Hardware Description Language (HDL) code. High-level synthesis (HLS) offers a way to develop hardware without this specialized FPGA and HDL knowledge, but at the cost of fine-grained control over how the design exploits the available resources. To evaluate models developed with HLS, we used the open-source Python library HLS4ML, which produces low-latency HLS machine learning models. In this thesis, we explore the application of high-level synthesis to machine learning, specifically the batch normalization layer, evaluating the quality, resource usage, and performance of the models produced with this technique. Our results indicate that the HLS designs are resource-efficient but not entirely accurate, whereas the optimized handwritten designs are highly accurate but require more resources.
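At inference time, batch normalization reduces to a per-channel scale-and-shift, which is part of what makes the layer amenable to a compact FPGA implementation. A minimal NumPy sketch of this folding is shown below; the parameter values are purely illustrative, and an HLS implementation would realize the same arithmetic in fixed point rather than floating point.

```python
import numpy as np

# Illustrative learned parameters and running statistics (not from any
# specific trained model).
gamma, beta = 1.5, 0.1            # learned scale and shift
mean, var, eps = 0.4, 0.25, 1e-5  # running statistics from training

def batchnorm(x):
    """Textbook inference-time batch normalization."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# Folded form: precompute one multiply-add per channel, the shape a
# hardware implementation typically realizes.
scale = gamma / np.sqrt(var + eps)
shift = beta - mean * scale

def batchnorm_folded(x):
    return scale * x + shift

x = np.array([0.0, 0.5, 1.0])
assert np.allclose(batchnorm(x), batchnorm_folded(x))
```

The folded form replaces a subtract, divide, square root, multiply, and add with a single multiply-accumulate, at the price of precomputing `scale` and `shift` offline.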

Description

Thesis (Master's)--University of Washington, 2024
