Vectorizing Memory Access on HammerBlade Architecture

Athrij, Saahil

dc.contributor.advisor	McCourt, Michael
dc.contributor.advisor	Taylor, Michael
dc.contributor.author	Athrij, Saahil
dc.date.accessioned	2024-04-26T23:20:17Z
dc.date.available	2024-04-26T23:20:17Z
dc.date.issued	2024-04-26
dc.date.submitted	2024
dc.description	Thesis (Master's)--University of Washington, 2024
dc.description.abstract	The Reduced Instruction Set Computer (RISC)-V architecture, celebrated for its open-source flexibility and modular design, is a cornerstone of modern computing innovation. However, RISC-V [99] faces a significant challenge to widespread adoption because of the sparse availability of high-performance RISC-V processors. HammerBlade [32] is one of the few high-performance RISC-V processors that fills this gap. Among the most significant advances in parallel computer architecture have been the introduction of multi-core systems and Single Instruction Multiple Data (SIMD) processors; HammerBlade uses a multi-core system (2048 cores) to achieve a high level of parallel computing. However, HammerBlade does not currently exploit any data-level parallelism. The overarching objective of this thesis is to enhance the processing capabilities of the HammerBlade chip by extending the vanilla RISC-V core [32]. The extension supports 128-bit (16-byte-wide) loads, replacing the 32-bit (4-byte-wide) loads. This advancement enables multiprocessing efficiency in a large-scale, 2048-core system. The objective encompasses a twofold strategy. The first aspect is a hardware extension of the RISC-V architecture, focused on implementing 16-byte-wide load instructions to improve data throughput and processing efficiency in a densely populated multi-core environment. The second aspect is the refinement of the GCC RISC-V compiler [26]. This thesis analyzes compiler optimization techniques, with particular emphasis on vectorization [16] strategies that can exploit the newly introduced 16-byte-wide load. The goal is to develop a compiler that not only uses the architectural features of the enhanced HammerBlade chip but also optimizes code for this unique hardware.
By synergizing these hardware and software advancements, the research aims to break new ground in how multiprocessing tasks are managed and executed, setting a precedent for future innovations in the field of computer architecture and compiler design.
dc.embargo.terms	Open Access
dc.format.mimetype	application/pdf
dc.identifier.other	Athrij_washington_0250O_26611.pdf
dc.identifier.uri	http://hdl.handle.net/1773/51355
dc.language.iso	en_US
dc.rights	CC BY
dc.subject	autovectorization
dc.subject	Compiler
dc.subject	GCC
dc.subject	Hammerblade
dc.subject	NOC
dc.subject	Risc
dc.subject	Computer engineering
dc.subject	Computer science
dc.subject.other	Electrical and computer engineering
dc.title	Vectorizing Memory Access on HammerBlade Architecture
dc.type	Thesis

Files

Original bundle

Name: Athrij_washington_0250O_26611.pdf
Size: 1.79 MB
Format: Adobe Portable Document Format

Collections

Electrical and computer engineering