Evaluating the Efficiency of Neural Network Implementations on AMD Versal AI Engines
Abstract
The AI Engine (AIE) is an optional component of the AMD Versal Adaptive SoC. It is an innovative device that offers extensive parallelism to increase compute density and reduce power consumption. However, the performance of the AIE, particularly for small models requiring low latency, remains uncertain. In this research, we mapped three neural network benchmarks to the AIE array of the Versal VCK190, exploring best coding practices and the characteristics of the AIE. We also mapped the same models to the FPGA fabric of the VCK190 and compared cost and performance against our AIE implementations. Across six metrics, we found that the AIE is slightly more efficient than the FPGA fabric in power consumption and silicon area utilization, but worse in performance, resource utilization, and price. This gap stems from interconnect limitations and from inefficient use of the hardware units when the vector data path cannot adapt to certain input data shapes.
Description
Thesis (Master's)--University of Washington, 2024
