ResearchWorks Archive

    Separating segmental and prosodic contributions to intelligibility

    View/Open
    full-sized poster (1.436Mb)
    Date
    2013-09
    Author
    McCloy, Daniel Robert
    Abstract
    The intelligibility of speech is known to vary both across individuals within a style or task, and within individuals across styles or tasks. Various properties of the speech signal have been shown to correlate with such differences in intelligibility, including speech rate [5,7,8], segmental reduction or deletion [1], vowel space size [1,2,4,6], pitch range [2], and pitch accent deletion [3]. However, these dimensions are rarely (if ever) manipulated independently in natural speech. This poses a challenge to understanding the sources of differences in intelligibility (both across individuals and across styles), and makes it difficult to know whether any particular measured dimension causes speech to be more or less intelligible, or merely indexes some other aspect of speech that is responsible for the intelligibility differences.

    As an alternative to measuring fine-grained dimensions of the speech signal, this research makes a broad distinction between prosodic dimensions (pitch, intensity, and duration) on one hand, and segmental content on the other. Through careful resynthesis, a corpus of parallel sentences is created that effectively holds constant either prosody or segmental content across resynthesized “talkers”. High-quality stimuli are achieved by hand-correction of glottal pulse epochs and semi-automated hand segmentation of syllable durations, followed by automated dynamic time warping of durations and swapping of pitch and intensity contours.

    Results from a speech-in-noise task with both unmodified and resynthesized stimuli show that talkers with low intrinsic intelligibility may have relatively “good” prosody, evidenced by improvements in intelligibility when their prosody is mapped onto other talkers’ waveforms. In contrast, talkers with high intrinsic intelligibility may have relatively “bad” prosody, evidenced by reduced intelligibility when their prosody is mapped onto other talkers. A linear mixed-effects regression model (controlling for signal processing distortion and variation in sentence difficulty) supports this view: the coefficients for “prosodic donor” and “segmental donor” are ranked differently from the overall intelligibility scores for unmodified talkers. Comparison between these patterns and post-hoc acoustic analyses of the stimuli allows classification of acoustic predictors based on how well they correlate with “prosodic donor” or “segmental donor” coefficient patterns.

    References
    [1] Bond, Z. S., & Moore, T. J. (1994). A note on the acoustic-phonetic characteristics of inadvertently clear speech. Speech Communication, 14(4), 325–337. doi: 10.1016/0167-6393(94)90026-4.
    [2] Bradlow, A. R., Torretta, G. M., & Pisoni, D. B. (1996). Intelligibility of normal speech I: Global and fine-grained acoustic-phonetic talker characteristics. Speech Communication, 20(3-4), 255–272. doi: 10.1016/S0167-6393(96)00063-5.
    [3] Clopper, C. G., & Smiljanić, R. (2011). Effects of gender and regional dialect on prosodic patterns in American English. Journal of Phonetics, 39(2), 237–245. doi: 10.1016/j.wocn.2011.02.006.
    [4] Hazan, V., & Markham, D. (2004). Acoustic-phonetic correlates of talker intelligibility for adults and children. The Journal of the Acoustical Society of America, 116(5), 3108–3118. doi: 10.1121/1.1806826.
    [5] Mayo, C., Aubanel, V., & Cooke, M. (2012). Effect of prosodic changes on speech intelligibility. Paper presented at INTERSPEECH-2012, the 13th Annual Conference of the International Speech Communication Association. url: http://interspeech2012.org/accepted-abstract.html?id=661
    [6] Neel, A. T. (2008). Vowel space characteristics and vowel identification accuracy. Journal of Speech, Language, and Hearing Research, 51(3), 574–585. doi: 10.1044/1092-4388(2008/041).
    [7] Sommers, M. S., Nygaard, L. C., & Pisoni, D. B. (1994). Stimulus variability and spoken word recognition I: Effects of variability in speaking rate and overall amplitude. The Journal of the Acoustical Society of America, 96(3), 1314–1324. doi: 10.1121/1.411453.
    [8] Tolhurst, G. C. (1957). Effects of duration and articulation changes on intelligibility, word reception and listener preference. Journal of Speech and Hearing Disorders, 22(3), 328–334.
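
    The duration-mapping step named in the abstract (aligning one talker's syllable durations to another's) can be pictured with a small sketch. This is only an illustration under stated assumptions, not the procedure used for the poster: it assumes both recordings of a sentence already have matching syllable boundaries (in seconds), and it uses a simple piecewise-linear time map between them. The function names `duration_factors` and `warp_time` and the boundary values are hypothetical.

```python
from bisect import bisect_right


def duration_factors(segmental_bounds, prosodic_bounds):
    """Per-syllable stretch factors (hypothetical helper).

    A factor above 1 means the syllable in the segmental donor's recording
    must be lengthened to match the prosodic donor's syllable duration.
    """
    if len(segmental_bounds) != len(prosodic_bounds):
        raise ValueError("both recordings need the same number of boundaries")
    factors = []
    for i in range(len(segmental_bounds) - 1):
        seg_dur = segmental_bounds[i + 1] - segmental_bounds[i]
        pro_dur = prosodic_bounds[i + 1] - prosodic_bounds[i]
        if seg_dur <= 0 or pro_dur <= 0:
            raise ValueError("boundaries must be strictly increasing")
        factors.append(pro_dur / seg_dur)
    return factors


def warp_time(t, segmental_bounds, prosodic_bounds):
    """Map a time in the segmental donor's recording onto the prosodic
    donor's time axis by linear interpolation within each syllable."""
    if not segmental_bounds[0] <= t <= segmental_bounds[-1]:
        raise ValueError("time point lies outside the segmented region")
    # Index of the syllable containing t; the final boundary belongs
    # to the last syllable.
    i = min(bisect_right(segmental_bounds, t) - 1, len(segmental_bounds) - 2)
    frac = (t - segmental_bounds[i]) / (segmental_bounds[i + 1] - segmental_bounds[i])
    return prosodic_bounds[i] + frac * (prosodic_bounds[i + 1] - prosodic_bounds[i])


# Made-up boundaries for a three-syllable utterance (seconds).
segmental = [0.0, 0.25, 0.5, 1.0]
prosodic = [0.0, 0.5, 0.75, 1.0]

factors = duration_factors(segmental, prosodic)
midpoint = warp_time(0.125, segmental, prosodic)
```

    In a resynthesis workflow, factors like these would drive a duration-modification tool, and the same time map could be used to move a pitch or intensity contour from one recording's time axis to the other's.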
    URI
    http://hdl.handle.net/1773/25274
    Collections
    • Department of Linguistics Faculty and Researcher Data and Papers [6]

    DSpace software copyright © 2002-2015  DuraSpace
    Contact Us | Send Feedback
    Theme by 
    @mire NV
     

     
