Publications

Recent publications

August 25, 2024

Mauricio Santillana, Ata A. Uslu, Tamanna Urmi, Alexi Quintana, James N. Druckman, Katherine Ognyanova, Matthew Baum, Roy H. Perlis, David Lazer

Abstract

Importance: Identifying and tracking new infections during an emerging pandemic is crucial for designing and deploying interventions that protect populations and mitigate the pandemic's effects, yet it remains a challenging task.

Objective: To characterize the ability of non-probability online surveys to longitudinally estimate the number of COVID-19 infections in the population both in the presence and absence of institutionalized testing.

Design: Internet-based non-probability surveys were conducted, using the PureSpectrum survey vendor, approximately every 6 weeks between April 2020 and January 2023. They collected information on COVID-19 infections with representative state-level quotas applied to balance age, gender, race and ethnicity, and geographic distribution. Data from this survey were compared to institutional case counts collected by Johns Hopkins University and wastewater surveillance data for SARS-CoV-2 from Biobot Analytics.

Setting: Population-based online non-probability survey conducted for a multi-university consortium, the Covid States Project.

Participants: Residents aged 18 and older across the 50 US states and the District of Columbia.

August 24, 2024

Roy H. Perlis, Ata Uslu, Jonathan Schulman, Aliayah Himelfarb, Faith M. Gunning, Nili Solomonov, Mauricio Santillana, Matthew A. Baum, James N. Druckman, Katherine Ognyanova, David Lazer

Neuropsychopharmacology

Abstract

This study aimed to characterize the prevalence of irritability among U.S. adults, and the extent to which it co-occurs with major depressive and anxious symptoms. A non-probability internet survey of individuals 18 and older in 50 U.S. states and the District of Columbia was conducted between November 2, 2023, and January 8, 2024. Regression models with survey weighting were used to examine associations between the Brief Irritability Test (BITe5) and sociodemographic and clinical features. The survey cohort included 42,739 individuals, mean age 46.0 (SD 17.0) years; 25,001 (58.5%) identified as women, 17,281 (40.4%) as men, and 457 (1.1%) as nonbinary. A total of 1218 (2.8%) identified as Asian American, 5971 (14.0%) as Black, 5348 (12.5%) as Hispanic, 1775 (4.2%) as another race, and 28,427 (66.5%) as white. Mean irritability score was 13.6 (SD 5.6) on a scale from 5 to 30. In linear regression models, irritability was greater among respondents who were female, younger, had lower levels of education, and lower household income. Greater irritability was associated with likelihood of thoughts of suicide in logistic regression models adjusted for sociodemographic features (OR 1.23, 95% CI 1.22–1.24). Among 1979 individuals without thoughts of suicide on the initial survey assessed for such thoughts on a subsequent survey, greater irritability was also associated with greater likelihood of thoughts of suicide being present (adjusted OR 1.17, 95% CI 1.12–1.23). The prevalence of irritability and its association with thoughts of suicide suggest the need to better understand its implications among adults outside of acute mood episodes.

June 5, 2024

Stefan D. McCabe, Diogo Ferrari, Jon Green, David M. J. Lazer, Kevin M. Esterling

Abstract

The social media platforms of the twenty-first century have an enormous role in regulating speech in the USA and worldwide [1]. However, there has been little research on platform-wide interventions on speech [2,3]. Here we evaluate the effect of the decision by Twitter to suddenly deplatform 70,000 misinformation traffickers in response to the violence at the US Capitol on 6 January 2021 (a series of events commonly known as and referred to here as ‘January 6th’). Using a panel of more than 500,000 active Twitter users [4,5] and natural experimental designs [6,7], we evaluate the effects of this intervention on the circulation of misinformation on Twitter. We show that the intervention reduced circulation of misinformation by the deplatformed users as well as by those who followed the deplatformed users, though we cannot identify the magnitude of the causal estimates owing to the co-occurrence of the deplatforming intervention with the events surrounding January 6th. We also find that many of the misinformation traffickers who were not deplatformed left Twitter following the intervention. The results inform the historical record surrounding the insurrection, a momentous event in US history, and indicate the capacity of social media platforms to control the circulation of misinformation, and more generally to regulate public discourse.

June 1, 2024

Alvaro Feal, Jeffrey Gleason, Pranav Goel, Jason Radford, Kai-Cheng Yang, John Basl, Michelle Meyer, David Choffnes, Christo Wilson, David Lazer

ICWSM Workshops

Abstract

The National Internet Observatory (NIO) aims to help researchers study online behavior. Participants install a browser extension and/or mobile apps to donate their online activity data along with comprehensive survey responses. The infrastructure will offer approved researchers access to a suite of structured, parsed content data for selected domains to enable analyses and understanding of Internet use in the US. This is all conducted within a robust research ethics framework, emphasizing ongoing informed consent and multiple layers, technical and legal, of interventions to protect the values at stake in data collection, data access, and research. This paper provides a brief overview of the NIO infrastructure, the data collected, the participants, and the researcher intake process.

May 29, 2024

Kai-Cheng Yang, Filippo Menczer

Journal of Quantitative Description: Digital Media

Abstract

Large language models (LLMs) exhibit impressive capabilities in generating realistic text across diverse subjects. Concerns have been raised that they could be utilized to produce fake content with a deceptive intention, although evidence thus far remains anecdotal. This paper presents a case study about a Twitter botnet that appears to employ ChatGPT to generate human-like content. Through heuristics, we identify 1,140 accounts and validate them via manual annotation. These accounts form a dense cluster of fake personas that exhibit similar behaviors, including posting machine-generated content and stolen images, and engage with each other through replies and retweets. ChatGPT-generated content promotes suspicious websites and spreads harmful comments. While the accounts in the AI botnet can be detected through their coordination patterns, current state-of-the-art LLM content classifiers fail to discriminate between them and human accounts in the wild. These findings highlight the threats posed by AI-enabled social bots.

February 15, 2024

Matthew David Simonson, Ray Block Jr, James N. Druckman, Katherine Ognyanova, David M. J. Lazer

Cambridge University Press

Abstract

Scholars have long recognized that interpersonal networks play a role in mobilizing social movements. Yet, many questions remain. This Element addresses these questions by theorizing about three dimensions of ties: emotionally strong or weak, movement insider or outsider, and ingroup or cross-cleavage. The survey data on the 2020 Black Lives Matter protests show that weak and cross-cleavage ties among outsiders enabled the movement to evolve from a small provocation into a massive national mobilization. In particular, the authors find that Black people mobilized one another through social media and spurred their non-Black friends to protest by sharing their personal encounters with racism. These results depart from the established literature regarding the civil rights movement that emphasizes strong, movement-internal, and racially homogenous ties. The networks that mobilize appear to have changed in the social media era. This title is also available as Open Access on Cambridge Core.

February 2, 2024

Sarah Shugars, Alexi Quintana-Mathé, Robin Lange, David Lazer

Journal of Computer-Mediated Communication

Abstract

Studies of gendered phenomena online have highlighted important disparities, such as who is likely to be elevated as an expert or face gender-based harassment. This research, however, typically relies upon inferring user gender—an act that perpetuates notions of gender as an easily observable, binary construct. Motivated by work in gender and queer studies, we therefore compare common approaches to gender inference in the context of online settings. We demonstrate that gender inference can have downstream consequences when studying gender inequities and find that nonbinary users are consistently likely to be misgendered or overlooked in analysis. In bringing a theoretical focus to this common methodological task, our contribution is in problematizing common measures of gender, encouraging researchers to think critically about what these constructs can and cannot capture, and calling for more research explicitly focused on gendered experiences beyond a binary.

September 22, 2023

Pranav Goel, Nikolay Malkin, SoRelle W. Gaynor, Nebojsa Jojic, Kristina Miler, Philip Resnik

Abstract

Campaign contributions are a staple of congressional life. Yet, the search for tangible effects of congressional donations often focuses on the association between contributions and votes on congressional bills. We present an alternative approach by considering the relationship between money and legislators’ speech. Floor speeches are an important component of congressional behavior, and reflect a legislator’s policy priorities and positions in a way that voting cannot. Our research provides the first comprehensive analysis of the association between a legislator’s campaign donors and the policy issues they prioritize with congressional speech. Ultimately, we find a robust relationship between donors and speech, indicating a more pervasive role of money in politics than previously assumed. We use a machine learning framework on a new dataset that brings together legislator metadata for all representatives in the US House between 1995 and 2018, including committee assignments, legislative speech, donation records, and information about Political Action Committees. We compare information about donations against other potential explanatory variables, such as party affiliation, home state, and committee assignments, and find that donors consistently have the strongest association with legislators’ issue-attention. We further contribute a procedure for identifying speech and donation events that occur in close proximity to one another and share meaningful connections, identifying the proverbial needles in the haystack of speech and donation activity in Congress which may be cases of interest for investigative journalism. Taken together, our framework, data, and findings can help increase the transparency of the role of money in politics.

September 5, 2023

Roy H Perlis, Kristin Lunz Trujillo, Alauna Safarpour, Alexi Quintana, Matthew D Simonson, Jasper Perlis, Mauricio Santillana, Katherine Ognyanova, Matthew A Baum, James N Druckman, David Lazer

Abstract

Importance: Marked elevations in levels of depressive symptoms compared with historical norms have been described during the COVID-19 pandemic, and understanding the extent to which these are associated with diminished in-person social interaction could inform public health planning for future pandemics or other disasters.

Objective: To describe the association between living in a US county with diminished mobility during the COVID-19 pandemic and self-reported depressive symptoms, while accounting for potential local and state-level confounding factors.

August 12, 2023

Pranav Goel, Jon Green, David Lazer, Philip Resnik

International AAAI Conference on Web and Social Media (ICWSM) 2023

Abstract

Most prior and current research examining misinformation spread on social media focuses on reports published by 'fake' news sources. These approaches fail to capture another potential form of misinformation with a much larger audience: factual news from mainstream sources ('real' news) repurposed to promote false or misleading narratives. We operationalize narratives using an existing unsupervised NLP technique and examine the narratives present in misinformation content. We find that certain articles from reliable outlets are shared by a disproportionate number of users who also shared fake news on Twitter. We consider these 'real' news articles to be co-shared with fake news. We show that co-shared articles contain existing misinformation narratives at a significantly higher rate than articles from the same reliable outlets that are not co-shared with fake news. This holds true even when articles are chosen following strict criteria of reliability for the outlets and after accounting for the alternative explanation of partisan curation of articles. For example, we observe that a recent article published by The Washington Post titled "Vaccinated people now make up a majority of COVID deaths" was disproportionately shared by Twitter users with a history of sharing anti-vaccine false news reports. Our findings suggest a strategic repurposing of mainstream news by conveyors of misinformation as a way to enhance the reach and persuasiveness of misleading narratives. We also conduct a comprehensive case study to help highlight how such repurposing can happen on Twitter as a consequence of the inclusion of particular narratives in the framing of mainstream news.