Vectorizing Memory Access on HammerBlade Architecture

Author

Athrij, Saahil

Abstract

The Reduced Instruction Set Computer (RISC)-V architecture, celebrated for its open-source flexibility and modular design, is a cornerstone of modern computing innovation. However, RISC-V [99] faces a significant challenge to widespread adoption because high-performance RISC-V processors are scarce. HammerBlade [32] is one of the few high-performance RISC-V processors that fill this gap. Two of the most significant advances in parallel computer architecture have been the introduction of multi-core systems and Single Instruction Multiple Data (SIMD) processors; HammerBlade uses a multi-core system (2048 cores) to achieve a high level of parallel computing. However, HammerBlade does not currently exploit any data-level parallelism. The overarching objective of this thesis is to enhance the processing capabilities of the HammerBlade chip by extending the vanilla RISC-V core [32]. The extension supports 128-bit (16-byte-wide) loads, replacing the 32-bit (4-byte-wide) loads. This advancement enables multiprocessing efficiency in a large-scale, 2048-core system. The objective encompasses a twofold strategy. The first aspect is a hardware extension of the RISC-V architecture, focusing on implementing 16-byte-wide load instructions to enhance data throughput and processing efficiency in a densely populated multi-core environment. The second aspect is the refinement of the GCC RISC-V compiler [26]. This thesis thoroughly analyzes compiler optimization techniques, with particular emphasis on vectorization [16] strategies that can exploit the newly introduced 16-byte-wide load. The goal is to develop a compiler that not only uses the architectural nuances of the enhanced HammerBlade chip but also optimizes code for this unique hardware. By combining these hardware and software advancements, the research aims to break new ground in how multiprocessing tasks are managed and executed, setting a precedent for future innovations in computer architecture and compiler design.
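
The difference between 4-byte and 16-byte loads described in the abstract can be illustrated with a small sketch in portable C. It is not the thesis's instruction encoding or its GCC output: the type vec128_t and the helpers load128, sum_scalar, sum_wide, loads_scalar, and loads_wide are hypothetical names chosen for illustration, and 32-bit integer elements are an assumption. The sketch only models how one wide load can stand in for four narrow ones when a loop walks contiguous data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 128-bit register: four 32-bit lanes. */
typedef struct { int32_t lane[4]; } vec128_t;

/* Models a single 16-byte load. memcpy keeps the sketch free of
 * alignment and aliasing assumptions; real hardware may require
 * 16-byte alignment (an assumption, not stated in the abstract). */
static vec128_t load128(const int32_t *p) {
    vec128_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Baseline: one 4-byte load per element. */
int32_t sum_scalar(const int32_t *a, size_t n) {
    assert(a != NULL || n == 0);
    int32_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Same reduction, fetching four elements per modeled 16-byte load. */
int32_t sum_wide(const int32_t *a, size_t n) {
    assert(a != NULL || n == 0);
    int32_t s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        vec128_t v = load128(a + i);
        s += v.lane[0] + v.lane[1] + v.lane[2] + v.lane[3];
    }
    for (; i < n; i++)   /* leftover elements use 4-byte loads */
        s += a[i];
    return s;
}

/* Number of load operations each version issues for n elements. */
size_t loads_scalar(size_t n) { return n; }
size_t loads_wide(size_t n)   { return n / 4 + n % 4; }
```

In this model an array of 8 elements needs 8 narrow loads but only 2 wide loads. A vectorizing compiler would aim to produce the second loop shape from source written like the first.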

Description

Thesis (Master's)--University of Washington, 2024
