Human Centered Tools for Analyzing Online Social Data
In the social sciences, researchers are increasingly turning to datasets collected from social media, online chat, forums, and email to address questions about human communication and behavior. However, these datasets are notoriously difficult to work with. Social media and online communication datasets push the limits of traditional research methods, force researchers to learn an array of new data science skills, and limit open and equitable participation in this important new research area. While this problem has many sides, one of the most significant challenges is a dearth of technological support for online social datasets and mixed methods data analysis processes. Many researchers in this area have to create custom scripts and software for gathering, analyzing, and visualizing their data. Solving this problem depends on understanding the data analysis processes and practices of social scientists working with online social data. In this dissertation, I present an ethnographic interview-based study on the work practices of researchers applying mixed methods to social media data, in order to better understand their data collection and analysis processes and generate implications for design. Even with a good understanding of how social scientists work with data, significant questions remain about how to design helpful software. Based on a year-long engagement with a research group studying emotion in a large chat dataset, I discuss the implications of applying machine learning technology to “amplify” and scale up qualitative analysis from a small manually coded set to the full corpus. Finally, I discuss two human-centered design projects focused on supporting aspects of the data analysis process: visual exploration of Twitter data, and collaborative qualitative coding of chat messages.
This dissertation offers a descriptive understanding of how social scientists actually work with complex social media and online communication datasets, implications for designing better machine learning, visual analytics, and qualitative analysis software, and several open-source tools for analyzing online social data.