Modelling talker intelligibility variation in a dialect-controlled corpus
McCloy, Daniel Robert
Wright, Richard A.
McGrath, August T. D.
In a newly created corpus of 3600 read sentences (20 talkers × 180 sentences), considerable variability in talker intelligibility has been found. This variability occurs despite rigorous attempts to ensure uniformity, including strict dialectal criteria in subject selection, speech style guidance with feedback during recording, and head-mounted microphones to ensure a consistent signal-to-noise ratio. Nonetheless, we observe dramatic differences in talker intelligibility when the sentences are presented to dialect-matched listeners in noise. We fit a series of linear mixed-effects models using several acoustic characteristics as fixed-effect predictors, with random-effects terms controlling for both talker and listener variability. Results indicate that between-talker variability is captured by speech rate, vowel space expansion, and phonemic crowding. These three dimensions account for virtually all of the talker-related variance, obviating the need for a random effect for talker in the model. Vowel space expansion is found to be best captured by polygonal area (contra Bradlow et al. 1996), and phonemic overlap is best captured by repulsive force (cf. Liljencrants & Lindblom 1972, Wright 2004). Results are discussed in relation to prior studies of intelligibility.
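The abstract names two vowel-space measures without giving formulas. As a rough illustration only, the sketch below shows one way such measures could be computed from formant values. Everything in it is an assumption rather than the authors' procedure: the function names `polygon_area` and `repulsive_force` are hypothetical, the polygon is taken to be defined by per-vowel mean (F2, F1) points listed in perimeter order, and the repulsive force is assumed to be a sum of inverse squared distances between tokens of different vowel categories, in the spirit of Liljencrants & Lindblom (1972).

```python
from itertools import combinations


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula.

    `vertices` is a list of (F2, F1) points, assumed to be the mean
    formant values of the corner vowels, listed in order around the
    perimeter of the vowel space.
    """
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def repulsive_force(tokens):
    """Sum of inverse squared distances between unlike vowel tokens.

    `tokens` is a list of (label, F2, F1) tuples. Pairs that share a
    vowel label are skipped, so the total grows as tokens of different
    categories crowd together. The inverse-square form is an assumption.
    """
    total = 0.0
    for (label_a, xa, ya), (label_b, xb, yb) in combinations(tokens, 2):
        if label_a == label_b:
            continue
        squared_distance = (xa - xb) ** 2 + (ya - yb) ** 2
        if squared_distance > 0:  # ignore exactly coincident tokens
            total += 1.0 / squared_distance
    return total
```

Under these assumptions, a talker with a more expanded vowel space would show a larger `polygon_area`, and a talker whose vowel categories encroach on one another would show a larger `repulsive_force`. The choice of formant scale (Hz, Bark, or a normalized scale) is left open here and would change the absolute values of both measures.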