Using Twitter data to identify geographic clustering of anti-vaccination sentiments
Introduction: Public opinion concerning vaccination has been of interest since the publication of a study in 1999 (since retracted) linking the measles, mumps, and rubella (MMR) vaccine to autism; in its wake, parental fear of vaccination has risen, vaccination rates have decreased, and outbreaks of vaccine-preventable diseases have become more frequent. I examined vaccination-related opinions using data collected on the social networking site Twitter to determine whether particular geographic areas in the United States expressed more negative sentiment towards vaccination than others.
Methods: I tested this hypothesis by combining vaccination-related Twitter data with data published through the National Notifiable Disease Surveillance System, which provides weekly counts of newly diagnosed cases of vaccine-preventable diseases for each state. In working towards this goal, I tested several sentiment classification methods, collected a new body of vaccination-related Twitter data from 2014, and examined whether the average sentiment expressed on Twitter in 2009, during the H1N1 pandemic, was similar to the average sentiment in the same geographic areas in 2014.
Results: I was unable to find any meaningful correlation between the average opinion expressed in small geographic units of the United States in 2009 and in 2014, or between the average opinion expressed by state and the mumps incidence rate observed over the period 2009-2013. I did note, however, that the proportion of tweets containing negative sentiment (between 5% and 10%) remained relatively stable in the data collected in 2014, which offers some hope that meaningful vaccination-related opinion expressed on Twitter persists over time.
Conclusion: I believe that the lack of correlation observed in this study is a product of the aggregated nature of my outbreak data and of differences in the content of negative opinion expressed in 2009, during the H1N1 pandemic, and in 2014.
Further research on this topic should focus on two things: improving sentiment classification of tweets published when there is no active pandemic, and identifying data sources that contain vaccination rate or outbreak data at a more localized geographic level, against which the use of social media to monitor opinions around vaccination can be validated.
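For illustration only, the state-level comparison described in the Methods might be sketched as below. This is not the code used in the study: the function names, the `"negative"`/`"other"` labels, the choice of Pearson correlation, and all of the toy values are assumptions made for the example. It computes the share of negative tweets per state and correlates that share with a per-state incidence rate.

```python
from collections import defaultdict
from math import sqrt


def negative_share_by_state(tweets):
    """Return {state: fraction of tweets labelled "negative"}.

    `tweets` is an iterable of (state, label) pairs; the label values
    are hypothetical and stand in for a sentiment classifier's output.
    """
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for state, label in tweets:
        totals[state] += 1
        if label == "negative":
            negatives[state] += 1
    return {state: negatives[state] / totals[state] for state in totals}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for constant input
    return cov / sqrt(var_x * var_y)


def correlate_sentiment_with_incidence(shares, incidence):
    """Correlate negative-tweet share with incidence over shared states."""
    states = sorted(set(shares) & set(incidence))
    return pearson([shares[s] for s in states],
                   [incidence[s] for s in states])


if __name__ == "__main__":
    # Toy, made-up inputs; not data from the study.
    toy_tweets = [
        ("WA", "negative"), ("WA", "other"), ("WA", "other"), ("WA", "other"),
        ("OR", "other"), ("OR", "other"),
        ("TX", "negative"), ("TX", "other"),
    ]
    toy_incidence = {"WA": 1.0, "OR": 0.5, "TX": 2.0}  # hypothetical rates
    shares = negative_share_by_state(toy_tweets)
    print(shares)
    print(correlate_sentiment_with_incidence(shares, toy_incidence))
```

With only one aggregate value per state, a coefficient like this rests on very few data points, which is consistent with the conclusion that more localized outbreak data would be needed.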
- Global health