Designing for User-facing Uncertainty in Everyday Sensing and Prediction
Kay, Matthew Jeremy Shaver
As we reach the boundaries of sensing systems, we are increasingly building and deploying ubiquitous computing solutions that rely heavily on inference. This is a natural trend: sensors have physical limits on what they can actually sense, and there is often a strong desire for simple sensors that reduce cost and deployment burden. Examples include using low-cost accelerometers to track step count or sleep quality (Fitbit), using microphones for cough tracking or fall detection, and using electrical noise and water pressure monitoring to track appliances' water and electricity use. A common thread runs through these systems: because they rely on inference, their output has uncertainty (an associated error that researchers attempt to minimize), and this uncertainty is user-facing (it directly affects the quality of the user experience). As we push more of these sensing and prediction systems into our everyday lives, we should ask: does the uncertainty in their output affect how people use and trust these systems? If so, how can we design them better?

While existing literature suggests that communicating uncertainty may affect trust, little evidence of this effect in real, deployed systems has been available. I wanted to start with the simplest, most ubiquitous sensing system I could think of to see whether a failure to communicate uncertainty might affect trust or usage. I turned to the home weight scale, a sensing system that has been with us for at least 100 years but which retains about the simplest feedback interface possible at the moment of measurement: it just tells you your weight! I conducted a series of studies, starting with qualitative investigations into people's understanding of uncertainty in weight scales, through a study of actual variability in scale measurements, to an online survey of attitudes towards scales from a large number of scale users.
I found that many people judge their scales by their perceived uncertainty, but often confuse aspects of error such as bias and variance when doing so. I found that people who better understand weight variability trust their scales more. I found that the scale does little, at the moment of weigh-in, to help people build an understanding of its error. The design of the scale today causes problems with trust and abandonment. How can we design it better?

To investigate how to communicate uncertainty effectively, I turned to another domain: real-time transit prediction. I wanted to understand how to communicate uncertainty in a way that people grasp more viscerally. I sought to develop a visualization of a continuous measure (time to arrival) that capitalizes on a frequency-based (or discrete) representation to improve people's probabilistic estimates. I introduced a novel visualization technique, quantile dotplots, which capitalizes on both a discrete presentation of uncertainty and people's fast-counting ability (subitizing) to improve the precision of their probabilistic estimates by about 15%. If my work on weight scales asks what happens when we get it wrong, this work aims to build an understanding of how to get it right.

Finally, effective design for user-facing uncertainty is not limited to the visual communication of uncertainty: it is not enough to slap an effective visualization on top of whatever model exists in a predictive system. Even in conceptually simple prediction tasks (is it going to rain today?), people exhibit preferences for different types of errors. In precipitation prediction, this manifests as wet bias: the tendency of weather forecasters to over-predict the probability of rain, reducing the chance that people fail to prepare for rain and then blame the forecaster when they are rained on.
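To make the quantile dotplot idea concrete, here is a minimal sketch. The predictive distribution (a lognormal shape for bus arrival time, with parameters chosen purely for illustration) is an assumption, not data from the studies described here; the core of the technique is that the distribution is summarized as a small number of equally likely dots, so a viewer can estimate a probability by counting dots:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical predictive distribution of minutes until a bus arrives;
# the lognormal shape and parameters are illustrative assumptions.
draws = rng.lognormal(mean=np.log(10), sigma=0.3, size=100_000)

# A quantile dotplot shows n equally likely dots: the (i - 0.5)/n quantiles
# of the predictive distribution, stacked into a dot histogram.
n_dots = 20
probs = (np.arange(n_dots) + 0.5) / n_dots
dots = np.quantile(draws, probs)

# A viewer estimates P(arrival <= t minutes) by counting dots at or below t:
# each dot represents 1/n_dots of the probability mass.
t = 12
estimate = np.mean(dots <= t)    # fraction of dots at or below t
true_prob = np.mean(draws <= t)  # probability under the model
```

With 20 dots, counting gives estimates in steps of 5%, which is close enough for decisions like "can I wait for this bus?" while keeping the count small enough to read at a glance.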
Even an effective representation of uncertainty in this case might not be optimal if the model is not tuned to reflect people's error preferences. I propose a method for systematically estimating people's error preferences using a simple survey and a construct I call acceptability of error. This survey instrument connects classifier evaluation metrics to acceptability and intent to use through the Technology Acceptance Model, and conceptually helps assign costs in classification in accordance with people's preferences for different types of errors. Through face validation of the instrument, I found that it is sensitive to differences in people's preferences induced by different types of user interfaces applied to the same problem.

Effective design for user-facing uncertainty will only become more important. Someone today might abandon their weight scale, bus tracking app, or weather forecaster due to too much perceived error. In the future, people will be abandoning smartphone and smartwatch apps for tracking blood sugar, heart rate variability, or blood pressure. As we push more of these low-cost sensing and prediction applications into the world, we are introducing more estimates with more error into people's lives (your smartwatch's blood pressure estimate will not be as good as a blood pressure cuff's). My work aims to help us design systems that bring users' understanding along into this new world.
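As an illustration of how error preferences like wet bias translate into classifier costs, here is a minimal sketch of standard cost-sensitive thresholding. The 3:1 cost ratio is a hypothetical stand-in for what a survey of acceptability might elicit, not a result from the instrument described above:

```python
# Illustrative assumption: respondents report that being caught unprepared
# in the rain (a false negative) is 3x as bad as carrying an unneeded
# umbrella (a false positive).
cost_fn = 3.0  # cost of predicting "no rain" when it rains
cost_fp = 1.0  # cost of predicting "rain" when it stays dry

# Forecast "rain" when its expected cost is lower than that of "no rain":
#   p * cost_fn > (1 - p) * cost_fp   =>   p > cost_fp / (cost_fp + cost_fn)
threshold = cost_fp / (cost_fp + cost_fn)  # 0.25

def forecast(p_rain: float) -> str:
    """Cost-sensitive decision rule for a rain probability p_rain."""
    return "rain" if p_rain > threshold else "no rain"
```

Under these assumed costs, the system forecasts rain at just a 25% probability, reproducing a deliberate wet bias: the model's decision rule, not only its visualization, is tuned to the errors people find more acceptable.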