High accuracy nanopore sequencing of xenonucleic acids using deep learning

Abstract

Unnatural base-pairing xenonucleic acids (ubp XNA or XNAs) are synthetic nucleotides that form base pairs orthogonal to the standard bases. By expanding chemical and structural diversity, ubp XNAs can improve a wide range of biotechnologies including therapeutics, diagnostics, and digital data storage. Aptamers are a particularly promising application, as ubp XNA incorporation can improve target affinity or enable novel molecular interactions. However, XNA aptamer discovery remains limited by the lack of generalizable, high-throughput sequencing technologies. Existing approaches are often indirect, labor-intensive, or not generalizable to arbitrary sequence contexts. One third-generation sequencing platform, protein-based nanopore sequencing, has shown promise as a general solution for sequencing XNAs. With nanopore sequencing, XNAs can be directly sequenced at the single-molecule level without amplification or specialized reagents. This advancement is enabled by specialized basecalling models that handle the conversion of an observed ionic current to a nucleic acid sequence. In this thesis, I present a deep learning-based XNA basecaller capable of end-to-end processing from raw signal to sequence. I first trained “single-context” XNA basecalling models to assess the viability of a deep learning-based approach. I then built generalized “all-context” XNA basecalling models capable of de novo XNA sequencing. To support downstream applications in XNA aptamer discovery, I developed a complementary clustering and enrichment quantification workflow that leverages nanopore sequencing for per-round, real-time analysis of selection dynamics. This work establishes a robust framework for XNA sequencing that lowers the technological barriers to developing XNA-based biotechnologies.

Description

Thesis (Master's)--University of Washington, 2025
