Which features of language distinguish propaganda from independent news? More specifically, how can temporal, linguistic, and emotional features reveal these differences? Answering these questions will improve our ability to detect propaganda and to understand its linguistic strategies of persuasion, a pressing issue as the online information war becomes a greater threat around the world.
I use three types of computational analysis to investigate my research questions: a thematic analysis, which I conduct with topic modeling; a stylistic analysis, using Linguistic Inquiry and Word Count (LIWC); and a mixed-level analysis that combines sentiment analysis with the results of my topic modeling. Together, these three levels of analysis reveal different features and patterns in the texts.
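As a rough illustration of the mixed-level analysis, the sketch below averages sentiment scores within each topic and outlet group. It assumes each article already carries a topic label from the topic model and a sentiment score from some sentiment analyzer. The records, the score range, and the function name `mean_sentiment_by_topic` are made up for illustration and are not taken from the project's data or code.

```python
from collections import defaultdict
from statistics import mean

# Illustrative records only: (outlet_group, topic_label, sentiment_score).
# Scores are assumed to lie in [-1, 1]; these values are invented.
records = [
    ("Russian", "Military Tension", -0.6),
    ("Russian", "Military Tension", -0.2),
    ("Russian", "Dialogue of Diplomacy", 0.4),
    ("Western", "Military Tension", -0.5),
    ("Western", "Dialogue of Diplomacy", 0.1),
    ("Western", "Dialogue of Diplomacy", 0.3),
]


def mean_sentiment_by_topic(rows):
    """Average the sentiment score for each (outlet_group, topic) pair."""
    grouped = defaultdict(list)
    for group, topic, score in rows:
        grouped[(group, topic)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


summary = mean_sentiment_by_topic(records)
for (group, topic), avg in sorted(summary.items()):
    print(f"{group:8s} {topic:24s} {avg:+.2f}")
```

Grouping by both outlet group and topic keeps the thematic and emotional levels linked, so the same topic can be compared across Russian and Western outlets.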
Inspecting topic frequency over time, I find that the drop in frequency for "Military Tension" in Russian outlets is balanced by increases in "Dialogue of Diplomacy" and "Denial of Accusations," whereas these topics stayed the same or decreased in Western outlets during the same period. On qualitative inspection, this suggests that as Russia prepared for invasion before February 2022, the Russian government downplayed the military crisis by increasing its rhetoric of deflection and de-escalation. In turn, both Western and Russian discussion of military tension decreased.
This project involved BERT topic modeling and visualizing topics over time.
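The comparison of topic frequency over time can be sketched in a few lines, assuming each article has already been assigned a month, an outlet group, and a topic label. The sketch computes each topic's share of articles per month and outlet group, which is one simple way to see a decline in one topic being offset by growth in others. The rows and the function name `topic_shares` are hypothetical, and the numbers are not the project's results.

```python
from collections import Counter, defaultdict

# Illustrative rows only: (month, outlet_group, topic_label), one per article.
docs = [
    ("2021-12", "Russian", "Military Tension"),
    ("2021-12", "Russian", "Military Tension"),
    ("2021-12", "Russian", "Dialogue of Diplomacy"),
    ("2021-12", "Russian", "Denial of Accusations"),
    ("2022-01", "Russian", "Military Tension"),
    ("2022-01", "Russian", "Dialogue of Diplomacy"),
    ("2022-01", "Russian", "Dialogue of Diplomacy"),
    ("2022-01", "Russian", "Denial of Accusations"),
]


def topic_shares(rows):
    """Return each topic's share of articles per (month, outlet_group)."""
    counts = defaultdict(Counter)
    for month, group, topic in rows:
        counts[(month, group)][topic] += 1
    shares = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        shares[key] = {topic: n / total for topic, n in counter.items()}
    return shares


shares = topic_shares(docs)
for (month, group), by_topic in sorted(shares.items()):
    for topic, share in sorted(by_topic.items()):
        print(f"{month} {group:8s} {topic:24s} {share:.2f}")
```

Shares are used instead of raw counts so that months with different numbers of articles remain comparable.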