December 29, 2021
Springer Link
Given that being misinformed can have negative ramifications, finding optimal corrective techniques has become a key focus of research. In recent years, several divergent correction formats have been proposed as superior based on distinct theoretical frameworks. However, these correction formats have not been compared in controlled settings, so the suggested superiority of each format remains speculative. Across four experiments, the current paper investigated how altering the format of corrections influences people’s subsequent reliance on misinformation. We examined whether myth-first, fact-first, fact-only, or myth-only correction formats were most effective, using a range of different materials and participant pools. Experiments 1 and 2 focused on climate change misconceptions; participants were Qualtrics online panel members and students taking part in a massive open online course, respectively. Experiments 3 and 4 used misconceptions from a diverse set of topics, with Amazon Mechanical Turk crowdworkers and university student participants. We found that the impact of a correction on beliefs and inferential reasoning was largely independent of the specific format used. The clearest evidence for any potential relative superiority emerged in Experiment 4, which found that the myth-first format was more effective at myth correction than the fact-first format after a delayed retention interval. However, in general it appeared that as long as the key ingredients of a correction were presented, format did not make a considerable difference. This suggests that providing corrective information at all matters far more than the precise format in which it is presented.
August 5, 2021
Public Opinion Quarterly
Social media data can provide new insights into political phenomena, but users do not always represent people, posts and accounts are not typically linked to demographic variables for use as statistical controls or in subgroup comparisons, and activities on social media can be difficult to interpret. For data scientists, adding demographic variables and comparisons to closed-ended survey responses has the potential to improve interpretations of inferences drawn from social media—for example, through comparisons of online expressions and survey responses, and by assessing associations with offline outcomes like voting. For survey methodologists, adding social media data to surveys allows for rich behavioral measurements, including comparisons of public expressions with attitudes elicited in a structured survey. Here, we evaluate two popular forms of linkages—administrative and survey—focusing on two questions: How does the method of creating a sample of Twitter users affect its behavioral and demographic profile? What are the relative advantages of each of these methods? Our analyses illustrate where and to what extent the sample based on administrative data diverges in demographic and partisan composition from surveyed Twitter users who report being registered to vote. Despite demographic differences, each linkage method results in behaviorally similar samples, especially in activity levels; however, conventionally sized surveys are likely to lack the statistical power to study subgroups and heterogeneity (e.g., comparing conversations of Democrats and Republicans) within even highly salient political topics. We conclude by developing general recommendations for researchers looking to study social media by linking accounts with external benchmark data sources.
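As a rough illustration of how two such linked samples might be compared (a minimal sketch, not the paper's analysis pipeline; the file name and the columns `sample`, `party`, `age_group`, and `n_tweets` are hypothetical), one could tabulate the demographic, partisan, and behavioral composition of each linkage method with pandas:

```python
import pandas as pd

# Hypothetical linked data: one row per Twitter account, with a 'sample'
# column ('administrative' = matched to a voter file, 'survey' = handle
# reported by a survey respondent), plus demographics and activity.
users = pd.read_csv("linked_twitter_users.csv")

# Partisan and demographic composition of each linkage method.
print(pd.crosstab(users["sample"], users["party"], normalize="index"))
print(pd.crosstab(users["sample"], users["age_group"], normalize="index"))

# Behavioral comparison: activity levels (tweets posted) by sample.
print(users.groupby("sample")["n_tweets"].describe())
```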
July 12, 2021
Journal of Experimental Psychology: General
The backfire effect occurs when a correction increases belief in the very misconception it is attempting to correct, and it is often cited as a reason not to correct misinformation. The current study aimed to test whether correcting misinformation increases belief more than a no-correction control. Furthermore, we aimed to examine whether item-level differences in backfire rates were associated with test-retest reliability or theoretically meaningful factors. These factors included worldview-related attributes, namely perceived importance and strength of pre-correction belief, and familiarity-related attributes, namely perceived novelty and the illusory truth effect. In two nearly identical experiments, we conducted a longitudinal pre/post design with N = 388 and 532 participants. Participants rated 21 misinformation items and were assigned to a correction condition or test-retest control. We found that no items backfired more in the correction condition compared to test-retest control or initial belief ratings. Item backfire rates were strongly negatively correlated with item reliability (ρ = -.61 / -.73) and did not correlate with worldview-related attributes. Familiarity-related attributes were significantly correlated with backfire rate, though they did not consistently account for unique variance beyond reliability. While there have been previous papers highlighting the non-replicable nature of backfire effects, the current findings provide a potential mechanism for this poor replicability. It is crucial for future research into backfire effects to use reliable measures, report the reliability of their measures, and take reliability into account in analyses. Furthermore, fact-checkers and communicators should not avoid giving corrective information due to backfire concerns.
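For illustration, the item-level quantities described above could be computed along the following lines (a minimal pandas/SciPy sketch under assumed column names such as `item`, `condition`, `belief_pre`, and `belief_post`; it is not the paper's analysis code, and the operationalizations of backfire rate and reliability are simplified):

```python
import pandas as pd
from scipy.stats import spearmanr

# Hypothetical long-format data: one row per participant x item, with
# columns: item, condition ('correction' or 'control'), belief_pre, belief_post.
df = pd.read_csv("belief_ratings.csv")

corr = df[df.condition == "correction"]
ctrl = df[df.condition == "control"]

# Item-level backfire rate: share of corrected participants whose belief
# in the misconception increased from pre to post.
backfire_rate = (
    corr.assign(backfired=corr.belief_post > corr.belief_pre)
        .groupby("item")["backfired"].mean()
)

# Item-level test-retest reliability: pre/post correlation in the control
# condition, where no correction intervened.
reliability = ctrl.groupby("item").apply(
    lambda g: g["belief_pre"].corr(g["belief_post"])
)

# Association between backfire rate and reliability across items.
rho, p = spearmanr(backfire_rate, reliability.loc[backfire_rate.index])
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```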
July 1, 2021
Advances in Neural Information Processing Systems 34 (NeurIPS 2021)
Topic model evaluation, like evaluation of other unsupervised methods, can be contentious. However, the field has coalesced around automated estimates of topic coherence, which rely on the frequency of word co-occurrences in a reference corpus. Contemporary neural topic models surpass classical ones according to these metrics. At the same time, topic model evaluation suffers from a validation gap: automated coherence, developed for classical models, has not been validated using human experimentation for neural models. In addition, a meta-analysis of topic modeling literature reveals a substantial standardization gap in automated topic modeling benchmarks. To address the validation gap, we compare automated coherence with the two most widely accepted human judgment tasks: topic rating and word intrusion. To address the standardization gap, we systematically evaluate a dominant classical model and two state-of-the-art neural models on two commonly used datasets. Automated evaluations declare a winning model when corresponding human evaluations do not, calling into question the validity of fully automatic evaluations independent of human judgments.
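One common implementation of the automated coherence metrics discussed above is gensim's CoherenceModel; the toy sketch below scores a set of topic word lists against a reference corpus using NPMI coherence (the corpus and topics are made up, and this shows only the automated side of the comparison, not the human rating or word-intrusion tasks):

```python
from gensim.corpora import Dictionary
from gensim.models import CoherenceModel

# Toy tokenized reference corpus (in practice, e.g., Wikipedia or the
# model's own training corpus).
texts = [
    ["economy", "tax", "budget", "spending"],
    ["vaccine", "health", "virus", "hospital"],
    ["election", "vote", "senate", "campaign"],
]
dictionary = Dictionary(texts)

# Top words per topic, as produced by any classical or neural topic model.
topics = [
    ["economy", "budget", "tax"],
    ["virus", "vaccine", "health"],
]

cm = CoherenceModel(
    topics=topics, texts=texts, dictionary=dictionary, coherence="c_npmi"
)
print(cm.get_coherence())            # average coherence across topics
print(cm.get_coherence_per_topic())  # per-topic scores
```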
June 30, 2021
Nature
Science rarely proceeds beyond what scientists can observe and measure, and sometimes what can be observed proceeds far ahead of scientific understanding. The twenty-first century offers such a moment in the study of human societies. A vastly larger share of behaviours is observed today than would have been imaginable at the close of the twentieth century. Our interpersonal communication, our movements and many of our everyday actions are all potentially accessible for scientific research; sometimes through purposive instrumentation for scientific objectives (for example, satellite imagery), but far more often these objectives are, literally, an afterthought (for example, Twitter data streams). Here we evaluate the potential of this massive instrumentation—the creation of techniques for the structured representation and quantification of human behaviour—through the lens of scientific measurement and its principles. In particular, we focus on the question of how we extract scientific meaning from data that often were not created for such purposes. These data present conceptual, computational and ethical challenges that require a rejuvenation of our scientific theories to keep up with the rapidly changing social realities and our capacities to capture them. We require, in other words, new approaches to manage, use and analyse data.
April 26, 2021
Journal of Quantitative Description: Digital Media
As an integral component of public discourse, Twitter is among the main data sources for scholarship on public discourse. However, there is much that scholars do not know about the basic mechanisms of public discourse on Twitter, including the prevalence of various modes of communication, the types of posts users make, the engagement those posts receive, or how these vary with user demographics and across different topical events. This paper broadens our understanding of these aspects of public discourse. We focus on the first nine months of 2020, studying that period as a whole and giving particular attention to two monumentally important topics of that time: the Black Lives Matter movement and the COVID-19 pandemic. Leveraging a panel of 1.6 million Twitter accounts matched to U.S. voting records, we examine the demographics, activity, and engagement of 800,000 American adults who collectively posted nearly 300 million tweets during this time span. We find notable variation in user activity and engagement across modality (e.g., retweets vs. replies), demographic subgroup, and topical context. We further find that while Twitter can best be understood as a collection of interconnected publics, neither topical nor demographic variation perfectly encapsulates the "Twitter public." Rather, Twitter publics are fluid, contextual communities which form around salient topics and are informed by demographic identities. Together, these analyses present a disaggregated, multifaceted description of the demographics, activity, and engagement of American Twitter users in 2020.
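To make the kind of disaggregation described above concrete, a sketch along these lines could compute activity and engagement by modality and demographic subgroup (file and column names are hypothetical; this is not the paper's code):

```python
import pandas as pd

# Hypothetical panel data: tweets already linked to panelists' demographics.
# Columns assumed: user_id, age_group, party, modality
# (one of 'original', 'retweet', 'reply', 'quote'), like_count, retweet_count.
tweets = pd.read_csv("panel_tweets_2020.csv")

# Activity: tweets per user, broken out by modality and demographic subgroup.
activity = (
    tweets.groupby(["party", "age_group", "modality"])
          .agg(n_tweets=("user_id", "size"),
               n_users=("user_id", "nunique"))
          .assign(tweets_per_user=lambda d: d.n_tweets / d.n_users)
)

# Engagement: median likes/retweets received, using the same breakdown.
engagement = (
    tweets.groupby(["party", "age_group", "modality"])
          [["like_count", "retweet_count"]]
          .median()
)

print(activity.head())
print(engagement.head())
```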
January 13, 2021
OSF Preprints
An individual’s issue preferences are non-separable when they depend on other issue outcomes (Lacy 2001a), presenting measurement challenges for traditional survey research. We extend this logic to the broader case of conditional preferences, in which policy preferences depend on the status of conditions that carry inherent uncertainty and are not necessarily policies themselves. We demonstrate new approaches for measuring conditional preferences in two large-scale survey experiments regarding the conditions under which citizens would support reopening schools in their communities during the COVID-19 pandemic. By drawing on recently developed methods at the intersection of machine learning and causal inference, we identify which citizens are most likely to have school reopening preferences that depend on additional considerations. The results highlight the advantages of using such approaches to measure conditional preferences, which represent an underappreciated and general phenomenon in public opinion.
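The heterogeneity analysis could be approximated with a simple T-learner, shown below on simulated data with scikit-learn (the paper draws on more sophisticated machine-learning/causal-inference estimators; the variable names and data-generating process here are illustrative only):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical survey-experiment data: X holds respondent covariates,
# T indicates the randomly assigned condition (e.g., a stated COVID-19
# caseload scenario), and y is support for reopening schools (0-100).
n = 2000
X = rng.normal(size=(n, 5))
T = rng.integers(0, 2, size=n)
y = 50 + 10 * T * (X[:, 0] > 0) + rng.normal(scale=5, size=n)

# T-learner: fit separate outcome models under each condition, then take
# the difference of predictions as a respondent-level conditional effect.
m1 = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[T == 1], y[T == 1])
m0 = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[T == 0], y[T == 0])
cate = m1.predict(X) - m0.predict(X)

# Respondents with large estimated effects are those whose reopening
# preferences appear most conditional on the manipulated consideration.
print("Mean effect:", cate.mean().round(2))
print("Share with effect > 5 points:", (cate > 5).mean().round(2))
```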
January 8, 2021
Nature Communications
While digital trace data from sources like search engines hold enormous potential for tracking and understanding human behavior, these streams of data lack information about the actual experiences of the individuals generating the data. Moreover, most current methods ignore or underutilize human processing capabilities that allow humans to solve problems not yet solvable by computers (human computation). We demonstrate how behavioral research, linking digital and real-world behavior, along with human computation, can be utilized to improve the performance of studies using digital data streams. This study looks at the use of search data to track the prevalence of influenza-like illness (ILI). We build a behavioral model of flu search based on survey data linked to users’ online browsing data. We then utilize human computation for classifying search strings. Leveraging these resources, we construct a tracking model of ILI prevalence that outperforms strong historical benchmarks using only a limited stream of search data and lends itself to tracking ILI in smaller geographic units. While this paper only addresses searches related to ILI, the method we describe has potential for tracking a broad set of phenomena in near real-time.
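As a stylized version of the tracking step, the sketch below regresses weekly ILI rates on search-term frequencies with a cross-validated lasso and evaluates out of sample on later weeks (file and column names are hypothetical, and this omits the behavioral model and human-computation classification steps described above):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LassoCV
from sklearn.model_selection import TimeSeriesSplit

# Hypothetical weekly data: one column per flu-related search term
# (frequencies among panelists), plus the reported ILI rate.
data = pd.read_csv("weekly_search_and_ili.csv", index_col="week")
X = data.drop(columns="ili_rate").to_numpy()
y = data["ili_rate"].to_numpy()

# Fit on earlier weeks, evaluate on later weeks, mimicking real-time
# tracking (no future information leaks into the fit).
split = int(len(y) * 0.7)
model = LassoCV(cv=TimeSeriesSplit(n_splits=5)).fit(X[:split], y[:split])
pred = model.predict(X[split:])

mae = np.mean(np.abs(pred - y[split:]))
print(f"Out-of-sample MAE: {mae:.3f}")
```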