Modeling the Perceptual Learning of Novel Dialect Features
All language use reflects the user's social identity in systematic ways. While humans can easily adapt to this sociolinguistic variation, automatic speech recognition (ASR) systems continue to struggle with it. This dissertation makes three main contributions. The first is evidence that modern state-of-the-art commercial ASR systems continue to perform reliably worse on talkers from some social backgrounds. The second is an expanded understanding of how and when human listeners who have recently been exposed to a new dialect rely more on social information about a talker than on the acoustics. While human listeners' perceptions can be categorically shifted by giving them incorrect social information when they are listening to a new dialect, the same effect is much weaker when they are listening to their own dialect. The third contribution is a computational model of listeners' bias towards their own dialect. Models trained on a dataset biased towards one dialect accurately reflected the behavior of listeners from that dialect. Further, explicitly including the dialect from which each training token was drawn during training, and providing that dialect label at the time of classification, improved classification accuracy for the second dialect while maintaining accuracy for the first. This approach can provide a behaviorally accountable model for dialect adaptation in automatic speech recognition.
- Linguistics