Metapath of Thoughts: Verbalized Metapaths in Heterogeneous Graph as Contextual Augmentation to LLM

Abstract

Heterogeneous graph neural networks (HGNNs) excel at capturing graph topology and structural information. However, they are ineffective at processing the textual components present in nodes and edges, and thus produce suboptimal performance in downstream tasks such as node classification. Additionally, HGNNs lack explanatory power and are considered black boxes. Although large language models (LLMs) are good at processing textual information, utilizing them for tasks like node prediction is non-trivial, since it is difficult to identify the ideal graphical context and present it in a form suitable for LLMs to consume effectively. We introduce a framework that combines the strengths of both models by leveraging the context obtained through metapaths, which are generated during the training of HGNNs. This approach enables the understanding of complex and indirect relationships between different types of nodes. Our framework enhances both the prediction accuracy of HGNNs and the transparency of their decision-making process through natural-language explanations provided by LLMs. We demonstrate that our proposed framework outperforms FASTGTN (the state of the art on heterogeneous node classification), an HGNN tailored for heterogeneous graph data, on two network datasets (the DBLP citation graph and the Goodreads graph dataset), improving F1 score from the baseline's 0.81 and 0.66 to 0.90 and 0.91, respectively. Furthermore, the efficacy of the framework in generating explanations has been evaluated through human evaluation, considering metrics such as helpfulness and factual correctness.
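The core idea of verbalizing a metapath, turning a typed path through a heterogeneous graph into a natural-language sentence that can be prepended to an LLM prompt, can be sketched as follows. The node types, names, and relation labels below are illustrative assumptions, not the thesis's actual DBLP or Goodreads schema.

```python
# Hedged sketch: verbalize one metapath instance (alternating typed
# nodes and edge labels) into an English sentence for LLM context.
# The schema here (Author -> Paper -> Venue) is a hypothetical example.

def verbalize_metapath(nodes, edge_labels):
    """Render a metapath instance as one sentence, e.g.
    'J. Smith' (Author) wrote 'Graph Transformers' (Paper) ..."""
    parts = [f"'{nodes[0]['name']}' ({nodes[0]['type']})"]
    for edge, node in zip(edge_labels, nodes[1:]):
        # Underscores in relation labels become spaces for readability.
        parts.append(f"{edge.replace('_', ' ')} '{node['name']}' ({node['type']})")
    return " ".join(parts) + "."

path = [
    {"type": "Author", "name": "J. Smith"},
    {"type": "Paper", "name": "Graph Transformers"},
    {"type": "Venue", "name": "KDD"},
]
edges = ["wrote", "published_in"]
print(verbalize_metapath(path, edges))
# -> 'J. Smith' (Author) wrote 'Graph Transformers' (Paper) published in 'KDD' (Venue).
```

In a full pipeline, sentences like this for the top-scoring metapaths around a target node would be concatenated into the LLM prompt as graphical context for classification and explanation.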

Description

Thesis (Master's)--University of Washington, 2024
