Using Lexical and Compositional Semantics to Improve HPSG Parse Selection
Accurate parse ranking is essential for deep linguistic processing applications and is one of the classic problems in academic NLP research. Despite significant advances, considerable room for improvement remains, especially in domains where gold-standard training data is scarce or unavailable. The overwhelming majority of parse ranking methods today rely on modeling syntactic derivation trees. However, parsers that output semantic representations in addition to syntactic derivations (such as the monostratal DELPH-IN HPSG parsers) offer an alternative structure on which to train a ranking model, whose score can then be combined with the baseline syntactic model score for re-ranking. This thesis proposes a method for ranking semantic sentence representations that takes advantage of compositional and lexical semantics. The methodology does not require sense-disambiguated data and can therefore be adopted without a solution for word sense disambiguation. The approach was evaluated in the context of HPSG parse disambiguation for two different domains, as well as in a cross-domain setting. It yielded a relative error rate reduction of 11.36% for top-10 parse selection compared to the baseline syntactic derivation-based parse ranking model, and, in the best setup, a standalone ranking accuracy approaching that of the baseline syntactic model.
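As a rough illustration of the two quantitative ideas in the abstract, the sketch below shows one way a semantic score could be combined with a syntactic score for re-ranking, and how a relative error rate reduction is computed from two accuracies. The abstract does not say how the scores are combined, so the linear interpolation, the `weight` parameter, the function names, and all numbers used here are assumptions made for illustration and are not taken from the thesis.

```python
from typing import List, Tuple

# (parse_id, syntactic_score, semantic_score); the layout is hypothetical.
Candidate = Tuple[str, float, float]


def rerank(candidates: List[Candidate], weight: float = 0.5) -> List[str]:
    """Return parse ids sorted best-first by an interpolated score.

    `weight` is an assumed mixing parameter: 0.0 uses only the syntactic
    score, 1.0 uses only the semantic score.
    """
    def combined(candidate: Candidate) -> float:
        _, syntactic, semantic = candidate
        return (1.0 - weight) * syntactic + weight * semantic

    ranked = sorted(candidates, key=combined, reverse=True)
    return [parse_id for parse_id, _, _ in ranked]


def relative_error_reduction(baseline_acc: float, new_acc: float) -> float:
    """Fraction of the baseline error that the new model removes."""
    baseline_err = 1.0 - baseline_acc
    new_err = 1.0 - new_acc
    return (baseline_err - new_err) / baseline_err


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    parses = [("p1", 0.6, 0.1), ("p2", 0.5, 0.9), ("p3", 0.2, 0.3)]
    print(rerank(parses, weight=0.5))
    print(relative_error_reduction(baseline_acc=0.5, new_acc=0.6))
```

Under this definition, a relative error rate reduction is measured against the baseline model's error rather than its accuracy, so the same absolute accuracy gain counts for more when the baseline is already strong.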
- Linguistics