Effective Use of Cross-Domain Parsing in Automatic Speech Recognition and Error Detection

Marin, Marius

dc.contributor.advisor	Ostendorf, Mari	en_US
dc.contributor.author	Marin, Marius	en_US
dc.date.accessioned	2015-05-11T20:27:52Z
dc.date.available	2015-05-11T20:27:52Z
dc.date.issued	2015-05-11
dc.date.submitted	2015	en_US
dc.description	Thesis (Ph.D.)--University of Washington, 2015	en_US
dc.description.abstract	Automatic speech recognition (ASR), the transcription of human speech into text form, is used in many settings in our society, ranging from customer service applications to personal assistants on mobile devices. In all such settings it is important for the system to know when it is making errors, so that it may ask the user to rephrase or restate their previous utterance. Such errors are often syntactically anomalous. The primary goal of this thesis is to find novel uses of parsing for automatic detection and correction of ASR errors. We start by developing a framework for ASR rescoring and automatic error detection leveraging syntactic parsing in conjunction with a maximum entropy classifier, and find that parsing helps with error detection, even when the parser is trained on out-of-domain data. In particular, features capturing parser reliability are used to improve the detection of out-of-vocabulary (OOV) and name errors. However, parsers trained on out-of-domain treebanks do not provide any benefit to ASR rescoring. This observation motivates our work on domain adaptation of parsing, with the objective of directly improving both transcription accuracy and error detection. We develop two weakly supervised domain adaptation methods which use error labels, but no hand-annotated parses: a self-training approach to directly improve the probabilistic context-free grammar (PCFG) model used in parsing, as well as a novel model combination method using a discriminative log-linear model to augment the generative PCFG. We apply both methods to ASR rescoring and error detection tasks. We find that self-training improves the ability of our parser to select the correct ASR hypothesis. The log-linear adaptation improves both OOV and name error detection tasks, and self-training performed after log-linear adaptation further improves the reliability of the parser, while producing smaller, faster models. 
Finally, motivated by empirical observations that the presence of names in an utterance is often indicated by words located far apart from the names themselves, we develop a general long-distance phrase pattern learning algorithm using word-level semantic similarity measures, and apply it to the problem of name error detection. This novel feature learning method leads to more robust classification, both when used independently of parsing, and in conjunction with parse features.	en_US
dc.embargo.terms	Open Access	en_US
dc.format.mimetype	application/pdf	en_US
dc.identifier.other	Marin_washington_0250E_14154.pdf	en_US
dc.identifier.uri	http://hdl.handle.net/1773/33149
dc.language.iso	en_US	en_US
dc.rights	Copyright is held by the individual authors.	en_US
dc.subject	feature learning; machine learning; parsing; speech recognition	en_US
dc.subject.other	Electrical engineering	en_US
dc.title	Effective Use of Cross-Domain Parsing in Automatic Speech Recognition and Error Detection	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Marin_washington_0250E_14154.pdf
Size: 1.64 MB
Format: Adobe Portable Document Format

Collections

Electrical engineering