Information Measures and Deep Learning Techniques for Network Inference

Authors

Rahimzamani, Arman

Abstract

Uncovering causal relationships among random variables from observational data is a longstanding focus in graphical models. Although causality has been formalized only recently, the field still faces many intricate conceptual challenges. We study the problem of causal inference in different settings, centered primarily on applications in genomics and, specifically, on the challenges and opportunities presented by single-cell transcriptome sequencing experiments (scRNA-seq). In our study:

(1) We introduce the Graph Divergence Measure (GDM) to measure the compatibility of independently observed samples with a hypothesized graphical structure. We design novel estimators for GDM, show that they can handle various data types, and prove that they are consistent. We show that the GDM estimators outperform existing estimators.

(2) We address the problem of inferring causal relationships in a dynamical system from time series data by introducing Restricted Directed Information (RDI), and show that it recovers the graph correctly in all deterministic or stochastic instances. We then define the Potential Conditional Mutual Information (qCMI) as the conditional mutual information calculated with a modified joint prior distribution. Based on qCMI with a uniform prior, we introduce uniform RDI (uRDI) to alleviate the effect of sample bias in causal inference, and demonstrate its superior performance in numerical experiments.

(3) Based on RDI and uRDI, we develop Scribe, a toolkit for detecting and visualizing causal regulatory interactions between genes, and explore the potential for single-cell experiments to power network reconstruction. We show via numerical examples that inference is possible whenever there is coupling between samples in the time and gene expression space domains. We show that our method outperforms conventional causal inference methods such as Granger causality and convergent cross mapping (CCM), and discuss its shortcomings and potential caveats.

(4) We introduce Dynode, a deep-learning-based software package that broadens the scope to understanding the governing dynamics of cell development processes. The goal is not only to identify causal relationships but also to predict the future state of individual cells from their current state. Dynode consists of two toolboxes, Dynode Vectorfield and Dynode Spatiotemporal. Dynode Vectorfield is designed to learn the temporal dynamics of individual cells as isolated ecosystems, considering only their internal state. It is demonstrated on synthetic and real datasets and offers practical insights through differential analysis tools. Dynode Spatiotemporal extends the analysis to interactions between cells by considering chemical signaling, and therefore dynamics in both time and space. It is likewise demonstrated on synthetic and real-world datasets, and its limitations and potential challenges are discussed.
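To make the description of qCMI with a uniform prior concrete, the following is a minimal sketch for discrete variables whose joint distribution is already known. It is not the estimator developed in the thesis, which works from samples and handles various data types. The function names `cmi_from_joint` and `reweight_prior` are hypothetical, and the sketch assumes the modified prior replaces p(x, z) while p(y | x, z) is kept fixed, with the uniform prior taken over the (x, z) pairs that have nonzero probability.

```python
import numpy as np


def cmi_from_joint(p_xyz):
    """Conditional mutual information I(X; Y | Z) in nats.

    p_xyz is a joint pmf stored as an array indexed [x, y, z].
    """
    p_xyz = np.asarray(p_xyz, dtype=float)
    p_z = p_xyz.sum(axis=(0, 1), keepdims=True)   # shape (1, 1, Z)
    p_xz = p_xyz.sum(axis=1, keepdims=True)       # shape (X, 1, Z)
    p_yz = p_xyz.sum(axis=0, keepdims=True)       # shape (1, Y, Z)
    mask = p_xyz > 0                              # cells that contribute
    num = p_xyz * p_z
    den = p_xz * p_yz
    ratio = np.divide(num, den, out=np.ones_like(num), where=mask)
    return float(np.sum(p_xyz[mask] * np.log(ratio[mask])))


def reweight_prior(p_xyz, q_xz=None):
    """Return q(x, z) * p(y | x, z), the joint under a modified prior.

    Assumption: if q_xz is None, the prior is uniform over the (x, z)
    pairs with nonzero probability under p_xyz.
    """
    p_xyz = np.asarray(p_xyz, dtype=float)
    p_xz = p_xyz.sum(axis=1, keepdims=True)       # shape (X, 1, Z)
    support = p_xz > 0
    if q_xz is None:
        q = support / support.sum()
    else:
        q = np.asarray(q_xz, dtype=float)[:, None, :]
    p_y_given_xz = np.divide(
        p_xyz, p_xz, out=np.zeros_like(p_xyz), where=support
    )
    return q * p_y_given_xz


# Toy case: Y is an exact copy of X, X is heavily biased, Z is trivial.
p = np.zeros((2, 2, 1))
p[0, 0, 0] = 0.9
p[1, 1, 0] = 0.1

cmi_biased = cmi_from_joint(p)                    # equals the entropy of X
cmi_uniform = cmi_from_joint(reweight_prior(p))   # equals log(2)
print(cmi_biased, cmi_uniform)
```

In this toy case the dependence of Y on X is as strong as it can be, yet the ordinary conditional mutual information is small because X rarely takes its second value. Reweighting to a uniform prior removes that dependence on how often each state was sampled, which is the intuition behind using a uniform prior to reduce sample bias.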

Description

Thesis (Ph.D.)--University of Washington, 2023
