Detecting Adverse Events in Clinical Trial Free Text
Lingren, Todd Gregory
<bold>Introduction:</bold> In pharmacotherapy cancer clinical trials, patients receive frequent outpatient evaluations and monthly inpatient evaluations, as required by the protocol or institutional guidelines. Detecting adverse events (AEs) and adverse drug events (ADEs, which are caused by the therapy drug) is a manual, costly process that involves chart review. The goal of this thesis is to reduce the resources needed to support a clinical trial by improving the automatic classification of ADEs in the clinical notes that document patient evaluations. To improve the classification, I propose using the informativeness of a sentence. In this context, a sentence is informative if it contains a reference to one or more medical conditions. The null hypothesis states that “Classifying sentences into informative and non-informative in the first step of ADE detection will not improve the performance of the ADE classifier”. <bold>Data:</bold> The 1,391 notes from ten patients enrolled in Cincinnati Children's Hospital Medical Center pediatric clinical trials are double-annotated for ADEs, with adjudication, by experienced annotators following the guidance of clinical research coordinators. <bold>Methods:</bold> Using the sentence as the base unit for processing, the first step of ADE identification is the classification of sentences into informative and non-informative categories. Over 1,200 of the 29,232 sentences contain at least one ADE (positive sentences), and 80% of positive sentences are informative. The results of three support vector machine (SVM) classifiers are compared with one rule-based classification baseline and one SVM baseline. Three feature selection methods are compared, and the chi-square-based approach performs best on the training data. <bold>Results:</bold> The experimental classifiers that use sentence informativeness perform significantly better than either baseline method. 
Experiment 2, which used a four-class SVM, had a better positive predictive value (PPV) than experiment 1, which combined results from two classifiers, one for informative and the other for non-informative sentences (80.4% vs. 70.3%, respectively). All classifiers (experimental and baseline) showed improved results with chi-square feature selection over a naïve feature selection method. <bold>Conclusion:</bold> Automated ADE detection in pharmacotherapy clinical trial notes is improved by classifying the sentences by informativeness as a first step.
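The abstract names chi-square feature selection and PPV without defining them. The following is a minimal sketch of both in plain Python, not the thesis implementation: the function names, the bag-of-words tokenization, and the toy sentences are all hypothetical and chosen only for illustration. It scores each term with the standard chi-square statistic on a 2x2 term-by-class contingency table and computes PPV as TP / (TP + FP).

```python
from collections import Counter


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: positive sentences containing the term
    n10: negative sentences containing the term
    n01: positive sentences without the term
    n00: negative sentences without the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0  # term or class never varies, so it carries no signal
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def rank_terms(sentences, labels, top_k):
    """Return the top_k terms by chi-square score (ties broken alphabetically).

    sentences: list of token lists; labels: list of bools (True = positive).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_df, neg_df = Counter(), Counter()
    for tokens, is_pos in zip(sentences, labels):
        for term in set(tokens):  # count each term once per sentence
            (pos_df if is_pos else neg_df)[term] += 1
    scores = {}
    for term in set(pos_df) | set(neg_df):
        n11, n10 = pos_df[term], neg_df[term]
        scores[term] = chi_square(n11, n10, n_pos - n11, n_neg - n10)
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


def ppv(true_positives, false_positives):
    """Positive predictive value: share of predicted positives that are correct."""
    predicted = true_positives + false_positives
    return true_positives / predicted if predicted else 0.0


# Toy, made-up sentences (not from the thesis data).
toy_sentences = [
    ["patient", "rash"],
    ["rash", "fever"],
    ["patient", "stable"],
    ["no", "complaints"],
]
toy_labels = [True, True, False, False]
top_terms = rank_terms(toy_sentences, toy_labels, top_k=1)
```

In a pipeline like the one described, the selected terms would serve as the features passed to the SVM classifiers; that wiring is omitted here because the abstract does not specify it.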
- Linguistics