Defining, Extracting, and Applying Events in NLP Tasks for Clinical Corpora

Klassen, Prescott

Files

Klassen_washington_0250E_16571.pdf (4.36 MB)

Date

2017-02-14

Abstract

This dissertation explores defining, extracting, and applying clinical events in three studies of applied clinical natural language processing (NLP): pneumonia report classification, acute lung injury (ALI) report classification, and critical follow-up recommendation sentence identification. The goals of the research are to: (1) define a set of events for the clinical domain, (2) develop a clinical corpus of event annotations, (3) extract event representations from clinical records, and (4) apply event representations to multiple NLP tasks in the clinical domain. To make these processes repeatable and to adapt general research methodologies to the specific requirements of each study, a framework is created for event analysis, corpus development, event detection, and event-based feature extraction. The pneumonia report classification study introduces a sub-corpus of rationale text snippets extracted from a corpus of X-ray reports, labeled for suspicion of pneumonia (PNA) and Clinical Pulmonary Infection Score (CPIS), and annotated for change-of-state and diagnosis events. Events are realized as dependency trees in the annotation, and a three-stage event detection process is developed to extract events: (1) rationale snippet prediction by maximum entropy-based text classification, (2) named entity recognition (NER) with conditional random fields (CRF), and (3) relation extraction (RE) by dependency parsing. Event-based features are generated from the change-of-state and diagnosis event dependency trees, and their performance, alone and in combination with baseline n-gram features, is evaluated in pneumonia report classification experiments. In final experimental results, incorporating chi-squared-ranked feature selection and an optimal feature selection threshold, event-based features in combination with baseline n-gram features perform best for both PNA (F-score +0.5) and CPIS (F-score +2.0) labels.
To explore the adaptability of the change-of-state and diagnosis events to other disease detection tasks, the second study applies the three-stage event detection process and modules from the pneumonia report classification study to the task of ALI report classification. In final experimental results, incorporating chi-squared-ranked feature selection and an optimal feature selection threshold, change-of-state and diagnosis event-based features, alone and in combination with baseline n-gram features, do not improve overall performance over the baseline (F-score -0.6).
In the third and final study, an alternate, non-dependency tree-based model for event representation is explored for critical follow-up recommendation sentence identification. A corpus of 8,000 radiology reports from multiple institutions and across twelve modalities is annotated for: (1) critical recommendation sentences, (2) entities that provide a reason for the recommendation, a suggested follow-up test, and a recommended timeframe for the follow-up test, and (3) a four-label category of criticality and importance. To improve the performance of recommendation sentence categorization, a template of report properties, metadata, named entities, and default computed values is aggregated into an event structure for feature extraction, and the resulting features are evaluated against, and in combination with, baseline n-gram features in classification experiments. In final experimental results, select event-based features in combination with baseline n-gram features, incorporating chi-squared-ranked feature selection and an optimal feature selection threshold, perform best (F-score +2.0) in critical recommendation classification experiments. The research in this dissertation demonstrates that event-based features, when combined with other types of features, such as n-grams, can improve the performance of applied clinical NLP classification tasks.
Simple models for events, such as the dependency tree structures for change-of-state and diagnosis events described in this study, make the annotation of events and event detection with off-the-shelf open-source tools easy to explain and straightforward to implement. The release to the Web of a general research framework, an annotated corpus for change-of-state and diagnosis events, an annotation schema and guidelines, and event detection modules based on open-source software gives other researchers the opportunity to extend and adapt the research presented in this study.
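
The chi-squared-ranked feature selection mentioned in the abstract can be illustrated with a small sketch. It is not the dissertation's implementation: the function names, the `ng:` and `evt:` feature prefixes, the toy reports, and the use of a fixed top-k cutoff as the selection threshold are all assumptions made for illustration. Each document is a set of binary features (n-grams plus event-based features), each feature is scored against a binary label with the chi-squared statistic of its 2x2 contingency table, and only the k best-ranked features are kept.

```python
from collections import Counter


def chi2_score(n11, n10, n01, n00):
    """Chi-squared statistic for a 2x2 table.

    n11: feature present, label positive   n10: feature present, label negative
    n01: feature absent,  label positive   n00: feature absent,  label negative
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        # A feature present in every document (or none) carries no signal.
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def rank_features(docs, labels):
    """Return feature names sorted by chi-squared score, best first."""
    pos_total = sum(labels)
    neg_total = len(labels) - pos_total
    pos_df, neg_df = Counter(), Counter()
    for feats, y in zip(docs, labels):
        (pos_df if y else neg_df).update(set(feats))
    scores = {}
    for f in set(pos_df) | set(neg_df):
        n11, n10 = pos_df[f], neg_df[f]
        scores[f] = chi2_score(n11, n10, pos_total - n11, neg_total - n10)
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores, key=lambda f: (-scores[f], f))


def select_top_k(docs, labels, k):
    """Keep only the k best-ranked features in every document."""
    keep = set(rank_features(docs, labels)[:k])
    return [sorted(set(d) & keep) for d in docs]


# Hypothetical toy data: n-gram ("ng:") and event-based ("evt:") features.
docs = [
    {"ng:new infiltrate", "evt:diagnosis=pneumonia", "ng:the"},
    {"ng:worsening opacity", "evt:diagnosis=pneumonia", "ng:the"},
    {"ng:clear lungs", "ng:the"},
    {"ng:no change", "ng:the"},
]
labels = [1, 1, 0, 0]  # 1 = positive report, 0 = negative report

ranking = rank_features(docs, labels)
reduced = select_top_k(docs, labels, k=1)
```

In this toy example the event feature separates the two classes perfectly and ranks first, while the n-gram that occurs in every report scores zero and ranks last. In practice the cutoff k would be tuned on held-out data; the abstract does not say how its threshold was chosen.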