June 1, 2024
The National Internet Observatory (NIO) aims to help researchers study online behavior. Participants install a browser extension and/or mobile apps to donate their online activity data along with comprehensive survey responses. The infrastructure will offer approved researchers access to a suite of structured, parsed content data for selected domains to enable analyses and understanding of Internet use in the US. This is all conducted within a robust research ethics framework, emphasizing ongoing informed consent and multiple layers, technical and legal, of interventions to protect the values at stake in data collection, data access, and research. This paper provides a brief overview of the NIO infrastructure, the data collected, the participants, and the researcher intake process.
May 29, 2024
Large language models (LLMs) exhibit impressive capabilities in generating realistic text across diverse subjects. Concerns have been raised that they could be utilized to produce fake content with a deceptive intention, although evidence thus far remains anecdotal. This paper presents a case study about a Twitter botnet that appears to employ ChatGPT to generate human-like content. Through heuristics, we identify 1,140 accounts and validate them via manual annotation. These accounts form a dense cluster of fake personas that exhibit similar behaviors, including posting machine-generated content and stolen images, and engage with each other through replies and retweets. ChatGPT-generated content promotes suspicious websites and spreads harmful comments. While the accounts in the AI botnet can be detected through their coordination patterns, current state-of-the-art LLM content classifiers fail to discriminate between them and human accounts in the wild. These findings highlight the threats posed by AI-enabled social bots.
February 15, 2024
Scholars have long recognized that interpersonal networks play a role in mobilizing social movements. Yet, many questions remain. This Element addresses these questions by theorizing about three dimensions of ties: emotionally strong or weak, movement insider or outsider, and ingroup or cross-cleavage. The survey data on the 2020 Black Lives Matter protests show that weak and cross-cleavage ties among outsiders enabled the movement to evolve from a small provocation into a massive national mobilization. In particular, the authors find that Black people mobilized one another through social media and spurred their non-Black friends to protest by sharing their personal encounters with racism. These results depart from the established literature regarding the civil rights movement that emphasizes strong, movement-internal, and racially homogenous ties. The networks that mobilize appear to have changed in the social media era. This title is also available as Open Access on Cambridge Core.
February 2, 2024
Studies of gendered phenomena online have highlighted important disparities, such as who is likely to be elevated as an expert or face gender-based harassment. This research, however, typically relies upon inferring user gender, an act that perpetuates notions of gender as an easily observable, binary construct. Motivated by work in gender and queer studies, we therefore compare common approaches to gender inference in the context of online settings. We demonstrate that gender inference can have downstream consequences when studying gender inequities and find that nonbinary users are consistently likely to be misgendered or overlooked in analysis. In bringing a theoretical focus to this common methodological task, our contribution is in problematizing common measures of gender, encouraging researchers to think critically about what these constructs can and cannot capture, and calling for more research explicitly focused on gendered experiences beyond a binary.
September 22, 2023
Campaign contributions are a staple of congressional life. Yet, the search for tangible effects of congressional donations often focuses on the association between contributions and votes on congressional bills. We present an alternative approach by considering the relationship between money and legislators’ speech. Floor speeches are an important component of congressional behavior, and reflect a legislator’s policy priorities and positions in a way that voting cannot. Our research provides the first comprehensive analysis of the association between a legislator’s campaign donors and the policy issues they prioritize with congressional speech. Ultimately, we find a robust relationship between donors and speech, indicating a more pervasive role of money in politics than previously assumed. We use a machine learning framework on a new dataset that brings together legislator metadata for all representatives in the US House between 1995 and 2018, including committee assignments, legislative speech, donation records, and information about Political Action Committees. We compare information about donations against other potential explanatory variables, such as party affiliation, home state, and committee assignments, and find that donors consistently have the strongest association with legislators’ issue-attention. We further contribute a procedure for identifying speech and donation events that occur in close proximity to one another and share meaningful connections, identifying the proverbial needles in the haystack of speech and donation activity in Congress which may be cases of interest for investigative journalism. Taken together, our framework, data, and findings can help increase the transparency of the role of money in politics.
September 5, 2023
Importance: Marked elevation in levels of depressive symptoms compared with historical norms has been described during the COVID-19 pandemic, and understanding the extent to which this elevation is associated with diminished in-person social interaction could inform public health planning for future pandemics or other disasters.
Objective: To describe the association between living in a US county with diminished mobility during the COVID-19 pandemic and self-reported depressive symptoms, while accounting for potential local and state-level confounding factors.
August 12, 2023
Most prior and current research examining misinformation spread on social media focuses on reports published by 'fake' news sources. These approaches fail to capture another potential form of misinformation with a much larger audience: factual news from mainstream sources ('real' news) repurposed to promote false or misleading narratives. We operationalize narratives using an existing unsupervised NLP technique and examine the narratives present in misinformation content. We find that certain articles from reliable outlets are shared by a disproportionate number of users who also shared fake news on Twitter. We consider these 'real' news articles to be co-shared with fake news. We show that co-shared articles contain existing misinformation narratives at a significantly higher rate than articles from the same reliable outlets that are not co-shared with fake news. This holds true even when articles are chosen following strict criteria of reliability for the outlets and after accounting for the alternative explanation of partisan curation of articles. For example, we observe that a recent article published by The Washington Post titled "Vaccinated people now make up a majority of COVID deaths" was disproportionately shared by Twitter users with a history of sharing anti-vaccine false news reports. Our findings suggest a strategic repurposing of mainstream news by conveyors of misinformation as a way to enhance the reach and persuasiveness of misleading narratives. We also conduct a comprehensive case study to help highlight how such repurposing can happen on Twitter as a consequence of the inclusion of particular narratives in the framing of mainstream news.
July 27, 2023
Many critics raise concerns about the prevalence of ‘echo chambers’ on social media and their potential role in increasing political polarization. However, the lack of available data and the challenges of conducting large-scale field experiments have made it difficult to assess the scope of the problem [1,2]. Here we present data from 2020 for the entire population of active adult Facebook users in the USA showing that content from ‘like-minded’ sources constitutes the majority of what people see on the platform, although political information and news represent only a small fraction of these exposures. To evaluate a potential response to concerns about the effects of echo chambers, we conducted a multi-wave field experiment on Facebook among 23,377 users for whom we reduced exposure to content from like-minded sources during the 2020 US presidential election by about one-third. We found that the intervention increased their exposure to content from cross-cutting sources and decreased exposure to uncivil language, but had no measurable effects on eight preregistered attitudinal measures such as affective polarization, ideological extremity, candidate evaluations and belief in false claims. These precisely estimated results suggest that although exposure to content from like-minded sources on social media is common, reducing its prevalence during the 2020 US presidential election did not correspondingly reduce polarization in beliefs or attitudes.
June 28, 2023
Over the July Fourth long weekend, people will pour into the small town of Gettysburg, Pennsylvania, to commemorate the 160th anniversary of one of the deadliest battles in U.S. history.
The three-day battle left over 50,000 Union and Confederate soldiers dead, wounded or missing and cemented Gettysburg’s place in American history as the turning point of the Civil War.
A few months after the battle, President Abraham Lincoln visited the town for the dedication of Soldiers’ National Cemetery. There, he delivered his famed Gettysburg Address. Lincoln called on Americans to dedicate themselves to the “unfinished work” for which so many at Gettysburg had died: the preservation of the United States and a “new birth of freedom” for the nation.
I have researched Americans’ support for political violence in my work as a political scientist at Northeastern and Harvard Universities. As an incoming professor at Gettysburg College, which was attacked by Confederate soldiers and served as a makeshift hospital during the battle, I wanted to see whether the legacies of the Civil War still affected Americans’ support for political violence today.
May 24, 2023
If popular online platforms systematically expose their users to partisan and unreliable news, they could potentially contribute to societal issues such as rising political polarization [1,2]. This concern is central to the ‘echo chamber’ [3,4,5] and ‘filter bubble’ [6,7] debates, which critique the roles that user choice and algorithmic curation play in guiding users to different online information sources [8,9,10]. These roles can be measured as exposure, defined as the URLs shown to users by online platforms, and engagement, defined as the URLs selected by users. However, owing to the challenges of obtaining ecologically valid exposure data (what real users were shown during their typical platform use), research in this vein typically relies on engagement data [4,8,11,12,13,14,15,16] or estimates of hypothetical exposure [17,18,19,20,21,22,23]. Studies involving ecological exposure have therefore been rare, and largely limited to social media platforms [7,24], leaving open questions about web search engines. To address these gaps, we conducted a two-wave study pairing surveys with ecologically valid measures of both exposure and engagement on Google Search during the 2018 and 2020 US elections. In both waves, we found more identity-congruent and unreliable news sources in participants’ engagement choices, both within Google Search and overall, than they were exposed to in their Google Search results. These results indicate that exposure to and engagement with partisan or unreliable news on Google Search are driven not primarily by algorithmic curation but by users’ own choices.