Transfer Learning Using L2 Speech to Improve Automatic Speech Recognition of Dysarthric Speech

dc.contributor.advisor: Levow, Gina-Anne
dc.contributor.author: Steinmetz, Hillel Aryeh
dc.date.accessioned: 2023-08-14T17:06:00Z
dc.date.issued: 2023-08-14
dc.date.submitted: 2023
dc.description: Thesis (Master's)--University of Washington, 2023
dc.description.abstract: Dysarthria is a class of speech disorders associated with impairments to a person’s motor system. Dysarthric speech is diverse but is broadly characterized by reduced prosodic, phonation, and articulatory precision (Rowe et al., 2022). Non-native English speech, or L2 English speech, shares acoustic and phonetic features with the speech of several dysarthria subtypes, such as slower and more variable speech rate compared to native, non-dysarthric English speech (Baese-Berk and Bradlow, 2021; Hertrich et al., 2021). L2 English speech also has different phonetic correlates than native-English speech, with phonetic variation more closely resembling a speaker’s first language (Flege, 1981). Since L2 speech both shares acoustic features with dysarthric speech and has more diverse phonetic correlates of phonological segments, it should facilitate knowledge transfer when training an ASR model on dysarthric recognition tasks. This study finetunes Wav2vec2 models on two English dysarthric speech datasets, UA-Speech and TORGO, and one English L2 speech dataset, L2-Arctic, using standard finetuning and multitask learning paradigms. It examines whether including L2 speech in the training data improves dysarthric speech recognition in speaker-dependent, speaker-independent, and zero-shot settings. Our results suggest that including L2 speech in the training data improves dysarthric speech recognition in speaker-dependent and speaker-independent settings, with models trained using multitask learning performing better than those trained using standard finetuning.
dc.embargo.lift: 2024-08-13T17:06:00Z
dc.embargo.terms: Restrict to UW for 1 year -- then make Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Steinmetz_washington_0250O_25460.pdf
dc.identifier.uri: http://hdl.handle.net/1773/50477
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: ASR
dc.subject: dysarthria
dc.subject: dysarthric speech recognition
dc.subject: L2 speech
dc.subject: multitask learning
dc.subject: transfer learning
dc.subject: Linguistics
dc.subject.other: Linguistics
dc.title: Transfer Learning Using L2 Speech to Improve Automatic Speech Recognition of Dysarthric Speech
dc.type: Thesis

Files

Original bundle

Name: Steinmetz_washington_0250O_25460.pdf
Size: 595.7 KB
Format: Adobe Portable Document Format
