Supporting bioinformatics analysis using a hybrid cloud and HPC architecture

dc.contributor.advisor: Yeung, Ka Yee
dc.contributor.author: McKeever, Patrick
dc.date.accessioned: 2025-05-12T22:43:21Z
dc.date.available: 2025-05-12T22:43:21Z
dc.date.issued: 2025-05-12
dc.date.submitted: 2025
dc.description: Thesis (Master's)--University of Washington, 2025
dc.description.abstract: The exponential growth of next-generation sequencing data requires novel strategies for storing, transferring, and processing that data. We present a scheduler based on the Temporal.io workflow framework which enables two key optimizations of bioinformatics workflows. Firstly, we enable users to transparently map workflow steps to diverse execution environments, including high-performance computing (HPC) resources managed by the SLURM resource manager. When tested on a bulk RNA sequencing workflow, this feature allows a 26% reduction in credit consumption on the NSF Bridges 2 supercomputer by performing adapter trimming locally and all other steps on the supercomputer. Secondly, we enable asynchronous execution of workflows, a feature which guarantees that workflows will achieve reasonable resource utilization even when the scheduler cannot make use of a system's full RAM and CPU resources. When benchmarked on the same bulk RNA sequencing workflow, this optimization reduces workflow makespan by between 13% and 23%, depending on the exact workflow configuration. Taken together, these features will enable researchers to reduce the cost and time requirements of bioinformatics pipelines.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: McKeever_washington_0250O_27948.pdf
dc.identifier.uri: https://hdl.handle.net/1773/52916
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: Cloud computing
dc.subject: HPC
dc.subject: RNA sequencing
dc.subject: Scheduling
dc.subject: Workflow
dc.subject: Computer science
dc.subject: Bioinformatics
dc.subject.other: Computer Science and Systems
dc.title: Supporting bioinformatics analysis using a hybrid cloud and HPC architecture
dc.type: Thesis

Files

Original bundle

Name: McKeever_washington_0250O_27948.pdf
Size: 676.64 KB
Format: Adobe Portable Document Format