Reconfigurable Convolution Implementation for CNNs in FPGAs
Bannon, Jesse Michael
Deep learning continues to be the revolutionary method used in pattern recognition applications, including image, video, and speech processing. Convolutional Neural Networks (CNNs) in particular have outperformed every competitor in image classification benchmarks, but they suffer from high computational and storage complexity. The need to extend this breakthrough technology to embedded applications that demand low power and mission-critical response times is becoming more apparent. Consequently, embedded CNNs deployed on the edge require compact platforms capable of accelerated computing. Previous works have explored methods to optimize convolution computation within Field Programmable Gate Arrays (FPGAs), but many of them support only a single CNN architecture. While this approach allows precise optimizations structured around a specific CNN, it prevents the FPGA from updating its model without compile times that can run to hours. In this work, we explore state-of-the-art CNN-FPGA architectures and implement our own reconfigurable convolution computation unit (CCU) with the Intel High Level Synthesis (HLS) Compiler, using a sliding window-based implementation. Results show that our CCU does not suffice for real-time computation on the edge.
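To make the idea of a sliding window-based, reconfigurable convolution more concrete, the following is a minimal software sketch in plain C++. It is not the CCU described in this work and uses no Intel HLS-specific constructs. The function name `conv2d_sliding_window`, the row-major data layout, and the choice of passing the kernel size and stride as run-time arguments are all assumptions made for illustration. The point it tries to show is that the window dimensions can be parameters supplied at run time rather than constants fixed at compile time, which is one plausible reading of "reconfigurable" here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch, not the thesis implementation.
// Computes a "valid" (no padding) 2D convolution in the CNN sense
// (cross-correlation: the kernel is not flipped) on a single channel.
// input  : height x width values, row-major
// kernel : k x k weights, row-major
// k and stride are run-time arguments, so a different layer shape
// can be processed without recompiling (the assumed "reconfigurable" part).
std::vector<float> conv2d_sliding_window(const std::vector<float>& input,
                                         std::size_t height, std::size_t width,
                                         const std::vector<float>& kernel,
                                         std::size_t k, std::size_t stride) {
    // Reject inconsistent shapes by returning an empty result.
    if (k == 0 || stride == 0 || k > height || k > width) return {};
    if (input.size() != height * width || kernel.size() != k * k) return {};

    const std::size_t out_h = (height - k) / stride + 1;
    const std::size_t out_w = (width - k) / stride + 1;
    std::vector<float> output(out_h * out_w, 0.0f);

    // Slide the k x k window across the input, one output pixel per position.
    for (std::size_t oy = 0; oy < out_h; ++oy) {
        for (std::size_t ox = 0; ox < out_w; ++ox) {
            float acc = 0.0f;
            // Multiply-accumulate over the current window.
            for (std::size_t ky = 0; ky < k; ++ky) {
                for (std::size_t kx = 0; kx < k; ++kx) {
                    const std::size_t iy = oy * stride + ky;
                    const std::size_t ix = ox * stride + kx;
                    acc += input[iy * width + ix] * kernel[ky * k + kx];
                }
            }
            output[oy * out_w + ox] = acc;
        }
    }
    return output;
}
```

For a 3x3 input holding the values 1 through 9 and a 2x2 kernel of ones with stride 1, this sketch yields a 2x2 output of 12, 16, 24, 28. A hardware version would typically replace the inner loops with line buffers and parallel multiply-accumulate units, which this software sketch does not attempt to model.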