Vectorizing Memory Access on HammerBlade Architecture

dc.contributor.advisor: McCourt, Michael
dc.contributor.advisor: Taylor, Michael
dc.contributor.author: Athrij, Saahil
dc.date.accessioned: 2024-04-26T23:20:17Z
dc.date.available: 2024-04-26T23:20:17Z
dc.date.issued: 2024-04-26
dc.date.submitted: 2024
dc.description: Thesis (Master's)--University of Washington, 2024
dc.description.abstract: The Reduced Instruction Set Computer (RISC)-V architecture, celebrated for its open-source flexibility and modular design, is a cornerstone for modern computing innovations. However, RISC-V [99] faces a significant challenge toward widespread adoption because of the sparse availability of high-performance RISC-V processors. HammerBlade [32] is one of the few high-performance RISC-V processors that fills this gap. Two of the most significant advances in parallel computer architecture have been the introduction of multi-core systems and Single Instruction Multiple Data (SIMD) processors; HammerBlade uses a multi-core system (2048 cores) to achieve a high level of parallel computing. However, HammerBlade does not currently exploit any data-level parallelism. The overarching objective of this thesis is to enhance the processing capabilities of the HammerBlade chip by extending the vanilla RISC-V core [32]. The extension supports 128-bit (16-byte-wide) loads, replacing the 32-bit (4-byte-wide) loads. This advancement improves multiprocessing efficiency in a large-scale, 2048-core system. The objective encompasses a twofold strategy. The first aspect is a hardware extension of the RISC-V architecture that implements 16-byte-wide load instructions to enhance data throughput and processing efficiency in a densely populated multi-core environment. The second aspect is the refinement of the GCC RISC-V compiler [26]. This thesis analyzes compiler optimization techniques, with particular emphasis on vectorization [16] strategies that can exploit the newly introduced 16-byte-wide load. The goal is to develop a compiler that not only utilizes the architectural nuances of the enhanced HammerBlade chip but also optimizes code for this unique hardware.
By combining these hardware and software advancements, the research aims to break new ground in how multiprocessing tasks are managed and executed, setting a precedent for future innovations in the field of computer architecture and compiler design.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Athrij_washington_0250O_26611.pdf
dc.identifier.uri: http://hdl.handle.net/1773/51355
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: autovectorization
dc.subject: Compiler
dc.subject: GCC
dc.subject: Hammerblade
dc.subject: NOC
dc.subject: Risc
dc.subject: Computer engineering
dc.subject: Computer science
dc.subject.other: Electrical and computer engineering
dc.title: Vectorizing Memory Access on HammerBlade Architecture
dc.type: Thesis

Files

Original bundle

Name: Athrij_washington_0250O_26611.pdf
Size: 1.79 MB
Format: Adobe Portable Document Format