Enriching Scientific Paper Embeddings with Citation Context
Authors
Henner, Kevin
Abstract
Amid a profusion of scientific literature, methods to organize and search available papers are quite valuable. Embedded representations of papers have the potential to serve as input to a variety of tasks related to research paper search and recommendation. Such methods typically focus on document content, though some incorporate citation information. This citation information, however, is generally treated as fungible, with every citation given the same weight and meaning as any other. Recent advances in automated citation classification allow citations to be classified according to how they are used in the citing document. I present a novel method for incorporating intent information into scientific paper embeddings through edge-weighting and concatenation of per-intent node2vec embeddings. Furthermore, I suggest that a hybrid approach, using both text and network data to generate embeddings, can take advantage of both complementary and reinforcing information to provide a fuller embedded representation. I evaluate these embeddings on a set of classification and sequence modeling tasks. The results show a significant improvement in some, but not all, cases, suggesting that while the incorporation of citation intent classification into scientific paper embeddings is promising, further work is needed to assess whether it can outperform state-of-the-art alternatives and to further elucidate its contributions.
Description
Thesis (Master's)--University of Washington, 2019
