Unveiling Insights: A Comprehensive Analysis of Political Discourse on Reddit (2023)


The 2016 Presidential Election was a watershed moment whose outcome surprised many observers. In its aftermath, questions lingered about how American perspectives had shifted and why traditional predictors had failed. To look for answers, we turned to Reddit, a social platform with millions of daily American users. This article presents our findings on the dynamics of political discourse, opinions, and affiliations on Reddit.

Unraveling the Enigma: The Data Approach

Our analysis draws on Reddit comments obtained from pushshift.io, around 850 GB of data in total. We focused on comments posted since late 2011 and selected high-quality content for analysis. Our approach centers on measuring relationships between subreddits through the users they share, which offers a view of how different communities influence and overlap with one another.
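As a rough illustration of the filtering step, the sketch below reads comment records in the JSON-lines layout used by Pushshift dumps and keeps only the fields the later analysis needs. The article does not state its quality criteria, so the score threshold, the excluded authors, and the exact start date are all hypothetical placeholders.

```python
import json
from datetime import datetime, timezone

# Hypothetical filter settings: the article does not specify its criteria.
MIN_SCORE = 2
SKIPPED_AUTHORS = {None, "[deleted]", "AutoModerator"}
START = datetime(2011, 10, 1, tzinfo=timezone.utc)  # assumed reading of "late 2011"

def keep_comment(raw_line):
    """Return (author, subreddit, month) if the comment passes the filter, else None."""
    comment = json.loads(raw_line)
    author = comment.get("author")
    if author in SKIPPED_AUTHORS:
        return None
    if int(comment.get("score", 0)) < MIN_SCORE:
        return None
    created = datetime.fromtimestamp(int(comment["created_utc"]), tz=timezone.utc)
    if created < START:
        return None
    return author, comment["subreddit"], created.strftime("%Y-%m")

# Made-up sample records, for illustration only.
sample = [
    '{"author": "alice", "subreddit": "politics", "score": 5, "created_utc": 1509494400, "body": "..."}',
    '{"author": "[deleted]", "subreddit": "politics", "score": 9, "created_utc": 1509494400, "body": "..."}',
    '{"author": "bob", "subreddit": "gaming", "score": 1, "created_utc": 1509494400, "body": "..."}',
]
kept = [row for row in map(keep_comment, sample) if row is not None]
print(kept)
```

Only the first record survives here: the second has a deleted author and the third falls below the assumed score threshold.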

Analyzing Changing Subreddits Over Time

Deciphering Subreddit Similarity

Our method hinges on a metric based on the posters that subreddits share. By identifying users who actively contribute, we gauge how influential and interconnected subreddits are. The resulting network graphs, built using Spark, visualize the relationships between subreddits in terms of shared users.
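A minimal sketch of a shared-poster metric, in plain Python rather than Spark, might look like the following. The article does not give its exact formula, so two common choices are shown as assumptions: the raw count of shared posters, and Jaccard similarity (shared posters divided by all posters in either subreddit). The function names and the sample records are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations

def posters_by_subreddit(records):
    """Group (author, subreddit) records into a set of distinct posters per subreddit."""
    posters = defaultdict(set)
    for author, subreddit in records:
        posters[subreddit].add(author)
    return posters

def shared_poster_counts(posters):
    """Count shared posters for every subreddit pair that has at least one in common."""
    shared = {}
    for a, b in combinations(sorted(posters), 2):
        n = len(posters[a] & posters[b])
        if n > 0:
            shared[(a, b)] = n
    return shared

def jaccard(posters, a, b):
    """Assumed normalization: shared posters divided by posters in either subreddit."""
    union = posters[a] | posters[b]
    return len(posters[a] & posters[b]) / len(union) if union else 0.0

# Made-up records, for illustration only.
records = [
    ("alice", "politics"), ("alice", "gaming"),
    ("bob", "politics"), ("bob", "gaming"),
    ("carol", "politics"),
    ("dave", "nutrition"),
]
posters = posters_by_subreddit(records)
shared = shared_poster_counts(posters)
print(shared)
print(jaccard(posters, "gaming", "politics"))
```

Comparing every pair directly is quadratic in the number of subreddits, so this form only suits small inputs. At the scale described here the same grouping would more plausibly be expressed as a distributed join.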

Unveiling Patterns: Most Similar Subreddit Pairs

The most similar subreddit pairs for November 2017 show two patterns. Large subreddits such as r/politics and r/gaming dominate, which reflects their popularity. At the same time, the shared-user metric shows how similar communities are to one another, with r/nutrition standing out as distinct. Together, these results map the web of user affiliations across Reddit's diverse communities.

Shifting Tides: Evolution of Subreddit Popularity

Tracking which subreddits share users with r/politics over Donald Trump's first year in office shows how preferences changed. Notable shifts, such as the decline of r/The_Donald's prominence and the rise of r/PoliticalHumor, mirror the evolving political landscape on Reddit.
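One way to produce this kind of month-by-month view is to rank, for each month, the other subreddits by the number of posters they share with an anchor subreddit. The sketch below assumes records of the form (author, subreddit, month). The function name, the subreddit names, and the data are hypothetical and are not results from the study.

```python
from collections import defaultdict

def top_shared_by_month(records, anchor, k=3):
    """For each month, list up to k subreddits sharing the most posters with anchor."""
    by_month = defaultdict(lambda: defaultdict(set))
    for author, subreddit, month in records:
        by_month[month][subreddit].add(author)

    ranking = {}
    for month in sorted(by_month):
        subs = by_month[month]
        anchor_users = subs.get(anchor, set())
        counts = [
            (len(users & anchor_users), name)
            for name, users in subs.items()
            if name != anchor
        ]
        counts = [(n, name) for n, name in counts if n > 0]
        # Sort by shared count (descending), then by name for stable ties.
        counts.sort(key=lambda item: (-item[0], item[1]))
        ranking[month] = [name for _, name in counts[:k]]
    return ranking

# Made-up records with placeholder subreddit names.
records = [
    ("alice", "anchor_sub", "2017-01"), ("alice", "sub_a", "2017-01"),
    ("bob", "anchor_sub", "2017-01"), ("bob", "sub_a", "2017-01"),
    ("carol", "anchor_sub", "2017-01"), ("carol", "sub_b", "2017-01"),
    ("alice", "anchor_sub", "2017-12"), ("alice", "sub_b", "2017-12"),
    ("bob", "anchor_sub", "2017-12"), ("bob", "sub_b", "2017-12"),
    ("carol", "anchor_sub", "2017-12"), ("carol", "sub_a", "2017-12"),
]
ranking = top_shared_by_month(records, "anchor_sub")
print(ranking)
```

In this toy data the two placeholder subreddits swap rank between the first and last month, which is the kind of shift such a ranking is meant to surface.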

Term-Importance Over Time: TF-IDF Analysis

Unveiling Political Hot Topics

To look more closely at the content of political discussions, we used TF-IDF (term frequency-inverse document frequency) analysis. By extracting politically relevant terms, we built a list of top words that characterize political discourse on Reddit. TF-IDF lets us identify the keywords that are most distinctive of the discussion at different points in time.
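For readers unfamiliar with TF-IDF, here is a small self-contained sketch. It assumes each month's comments are pooled into one "document", uses relative term frequency for TF, and uses log(N / document frequency) for IDF. The article does not say which TF-IDF variant or tokenizer was used, so these choices, the function name, and the sample text are assumptions.

```python
import math
import re
from collections import Counter

def tfidf_by_month(comments):
    """comments: iterable of (month, text). Returns {month: {term: tf-idf score}}."""
    counts = {}
    for month, text in comments:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.setdefault(month, Counter()).update(tokens)

    n_docs = len(counts)
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())  # one count per month containing the term

    scores = {}
    for month, term_counts in counts.items():
        total = sum(term_counts.values())
        scores[month] = {
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in term_counts.items()
        }
    return scores

# Made-up text, for illustration only.
comments = [
    ("month-1", "the debate tonight the debate"),
    ("month-2", "the email story"),
    ("month-3", "the election result"),
]
scores = tfidf_by_month(comments)
top_term = max(scores["month-1"], key=scores["month-1"].get)
print(top_term, scores["month-1"]["the"])
```

A word that appears in every month, such as "the", gets a score of zero, while a word concentrated in one month rises to the top of that month's list.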

Probing Political Pulse: Examples of Keyword Analysis

Tracking a specific keyword such as 'debate' in r/politics reveals spikes that coincide with major election debates. This keyword-level analysis produces a chronological map of key moments in political discussion.
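A keyword timeline of this kind can be sketched by computing, for each period, the share of comments that mention the keyword, and then flagging periods well above the typical level. The spike rule used here (more than twice the median share) is an arbitrary assumption for illustration, as are the function names and the sample data.

```python
import re
from collections import Counter
from statistics import median

def keyword_share(comments, keyword):
    """comments: iterable of (period, text). Returns {period: share mentioning keyword}."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    totals, hits = Counter(), Counter()
    for period, text in comments:
        totals[period] += 1
        if pattern.search(text):
            hits[period] += 1
    return {period: hits[period] / totals[period] for period in sorted(totals)}

def spikes(share, factor=2.0):
    """Flag periods whose share exceeds factor times the median share (assumed rule)."""
    base = median(share.values())
    return [p for p, s in share.items() if s > 0 and s > factor * base]

# Made-up comments, for illustration only.
comments = (
    [("week-1", "debate prep")] + [("week-1", "other news")] * 3
    + [("week-2", "Debate night")] * 3 + [("week-2", "other news")]
    + [("week-3", "debate recap")] + [("week-3", "other news")] * 3
)
share = keyword_share(comments, "debate")
print(share)
print(spikes(share))
```

Using a share rather than a raw count keeps busy periods from looking like spikes simply because more comments were posted overall.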

Unraveling Complexities: Caveats and Rebuttals

Our approach has limitations, and we address the main concerns here. Reddit's user demographics are skewed, and the shared-poster metric has its own limits. While Reddit may not represent the entire U.S. electorate, its diverse user base still provides valuable insight into how political discourse is evolving.

Conclusion: Navigating the Reddit Political Landscape

In conclusion, our exploration of Reddit's political terrain reveals a range of affiliations, discussions, and evolving preferences. Combining shared-user metrics with TF-IDF analysis gives a multifaceted view of political discussion on Reddit. While our study doesn't provide definitive answers, it serves as a useful tool for understanding how opinions and affiliations interact in the digital political sphere.

