Enriching Scientific Paper Embeddings with Citation Context

Authors

Henner, Kevin

Abstract

Amid a profusion of scientific literature, methods to organize and search available papers are valuable. Embedded representations of papers have the potential to serve as input to a variety of tasks related to research paper search and recommendation. Such methods typically focus on document content, though some incorporate citation information. This citation information, however, is generally treated as fungible, with every citation given the same weight and meaning as any other. Recent advances in automated citation classification allow citations to be classified according to how they are used in the citing document. I present a novel method for incorporating intent information into scientific paper embeddings through edge-weighting and concatenation of per-intent node2vec embeddings. Furthermore, I suggest that a hybrid approach, using both text and network data to generate embeddings, can take advantage of both complementary and reinforcing information to provide a fuller embedded representation. I evaluate these embeddings on a set of classification and sequence modeling tasks. The results show a significant improvement in some, but not all, cases, suggesting that while the incorporation of citation intent classification into scientific paper embeddings is promising, further work is needed to assess whether it can outperform state-of-the-art alternatives and to further elucidate its contributions.
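
To make the two ideas named in the abstract concrete (intent-dependent edge weights, and concatenation of per-intent embeddings), the following is a minimal sketch and not the thesis's implementation. The edge list, the intent labels, and the weight values are hypothetical, and `toy_embedding` is a deliberately trivial stand-in (out-degree and in-degree) for node2vec, used only so the example runs without external libraries.

```python
# Hypothetical citation edges: (citing paper, cited paper, intent label).
EDGES = [
    ("p1", "p2", "background"),
    ("p1", "p3", "method"),
    ("p2", "p3", "method"),
    ("p3", "p4", "result"),
]
INTENTS = ["background", "method", "result"]  # assumed label set
# Assumed, illustrative weights for the edge-weighting variant.
INTENT_WEIGHT = {"background": 1.0, "method": 2.0, "result": 1.5}


def weighted_edges(edges, intent_weight):
    """Variant 1: one graph whose edge weights depend on citation intent."""
    return [(src, dst, intent_weight[intent]) for src, dst, intent in edges]


def toy_embedding(nodes, edges):
    """Stand-in for node2vec: a 2-d vector [out-degree, in-degree] per node."""
    vecs = {n: [0.0, 0.0] for n in nodes}
    for src, dst in edges:
        vecs[src][0] += 1.0
        vecs[dst][1] += 1.0
    return vecs


def per_intent_concat(edges, intents):
    """Variant 2: embed each intent's subgraph, then concatenate per node."""
    nodes = sorted({n for src, dst, _ in edges for n in (src, dst)})
    combined = {n: [] for n in nodes}
    for intent in intents:
        sub = [(s, d) for s, d, i in edges if i == intent]
        vecs = toy_embedding(nodes, sub)
        for n in nodes:
            combined[n].extend(vecs[n])
    return combined


embeddings = per_intent_concat(EDGES, INTENTS)
```

In this sketch each paper ends up with one block of dimensions per intent, so a downstream classifier can distinguish, for example, a paper cited mainly for its methods from one cited mainly as background. In practice the stand-in would be replaced by an actual node2vec run on each graph.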

Description

Thesis (Master's)--University of Washington, 2019
