Um and Uh, and the Expression of Stance in Conversational Speech
Le Grezause, Esther Sylviane
Um and uh are some of the most frequent items in spoken American and British English (Biber et al., 1999). They have traditionally been treated as disfluencies, but recent research has focused on their discursive functions and acoustic properties, and suggests that um and uh are not just filled pauses or random speech errors. At this stage, there is little agreement on whether they should be considered by-products of the planning process (speech errors) or pragmatic markers. In addition, most work on um and uh considers them to be the same variable, collapsing both into the same category. The present work investigates the discursive and acoustic properties of um and uh in spontaneous speech with the aim of finding out whether they occur in systematic ways and whether they correlate with specific variables. The analysis of um and uh is conducted on two corpora, ATAROS and Switchboard, to determine how the markers are used in different spontaneous speech activities. The Switchboard corpus consists of phone conversations between strangers, which allows us to study how speakers use um and uh in this context. It has different transcript versions (original and corrected), which allows us to test how transcribers perceive the two markers by aligning the original transcript with the corrected one. The ATAROS corpus consists of collaborative tasks between strangers and is annotated for stance strength and polarity, which allows us to investigate how um and uh relate to stance. The term stance refers to subjective spoken attitudes toward something (Haddington, 2004). Stance strength is the degree to which stance is expressed. Stance strength has four possible values: no stance, weak, moderate, and strong stance. Stance polarity is the direction of the expression of stance, and it can be positive, neutral, or negative.
The results of this study show that um and uh have different discursive cues, which correlate with variables such as speaker, speaker gender, speaker involvement, and naturalness of the conversation. Um and uh have different acoustic cues, which show some correlation with different degrees of stance and with stance polarity for different acoustic properties depending on the marker. The presence and the position of um and uh in utterances affect the likelihood that an utterance is marked with a certain degree or polarity of stance. These findings are incorporated in a classification experiment, to test whether information pertaining to um and uh can be used to train a classifier to automatically label stance. The results of this experiment reveal that um and uh are valuable word unigram features and indicate that the position features and certain acoustic features increase the performance of the system in predicting stance. The results also indicate that um is a slightly more important word unigram feature than uh, that features pertaining to um are more informative in the prediction of binary stance, and that features relating to uh are more informative in predicting three-way stance strength. The findings confirm that um and uh are distinct entities. The discourse and acoustic features of um and uh are different. The marker um tends to vary to a greater extent than the marker uh. Transcribers perceive um more reliably than uh. Um and uh are relevant word unigram features. Features associated with um increase accuracy over those associated with uh in predicting binary stance, and features associated with uh increase accuracy over those associated with um in predicting three-way stance strength. The work presented in this dissertation provides evidence that um and uh are not just fillers or disfluencies, but rather that they have a wide range of uses, from fillers to pragmatic and stance markers.
- Linguistics