Zach Quinn

QTube: Twitter Sentiment Analysis for Q Anon-related Hashtags Using NLP

Introduction

Although disinformation is a rather abstract concept, one that often finds its way into information transparency campaigns mounted by platforms like Facebook, its costs are concrete: disinformation-laced content costs businesses 78 billion dollars per year. Filtering and ultimately removing disinformation remains a problem for platforms, businesses and content creators, who must delicately balance protected speech against its damaging impact on industry and society. Among those most commonly associated with spreading misinformation are QAnon adherents, followers of the mythical and anonymous military intelligence expert Q, who ‘dig’ to discover and ultimately disseminate purported facts that contribute to the epidemic of misinformation. While other textual analyses have focused on the words of ‘Q’ in the 8chan-hosted Q drops, this project employs natural language processing techniques to determine the thoughts and motivations of believers in the Q phenomenon, and their connections to trending topics, from January 6th, 2021, the day of the siege of the Capitol, to the present.

Methodology

The data mined for this project constitutes a dynamic, novel data set, since it is derived from Twitter via the platform’s developer API. To ensure that the information is relevant, the request was constrained to tweets posted since January 6th, 2021, providing nearly four months of tweets for the selected hashtags. The key words and hashtags were chosen based on existing journalistic domain knowledge as well as existing analytic reports. The phrases selected included ‘QAnon’, ‘Save The Children’ and ‘Deep State.’ After querying the API, the raw text was extracted from each returned tweet and converted to a corpus, which stored each tweet as an individual text file. Next, the data was vectorized: each tweet was split into individual words, and the word counts were recorded in a term-document matrix. Since this project was primarily interested in association, the key words themselves were filtered out of the three queries, along with English stop words (commonly occurring words like ‘the’, ‘can’ and ‘like’) and terms that were irrelevant to the search, e.g. ‘Unicef’ for a query concerning ‘Save The Children.’ Each term’s frequency was plotted on histograms and word clouds, graphic tools well suited to displaying qualitative data. The subsequent phase of the project involved obtaining sentiment scores derived from the NRC Emotion Lexicon. Finally, several of the most frequently occurring and highly correlated words were compared using a native association function.
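
The vectorization, filtering and association steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the project: the sample texts are hypothetical placeholders rather than collected tweets, the stop-word list is a tiny stand-in, and the helper names (`tokenize`, `term_document_matrix`, `pearson`, `associations`) are invented for this example. The source does not name its association function, so Pearson correlation between per-document term counts is assumed here as one common choice.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder texts; these are NOT tweets from the study.
tweets = [
    "the deep state hoax about russia",
    "russia hoax uncovered again",
    "expect the truth like before",
    "hoax russia story can wait",
]

# Tiny illustrative stop-word list, plus the query terms themselves,
# which the methodology filters out before measuring association.
STOP_WORDS = {"the", "can", "like", "about", "again", "before"}
QUERY_TERMS = {"deep", "state"}


def tokenize(text):
    """Lowercase a text and keep alphabetic tokens that are not filtered out."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS | QUERY_TERMS]


def term_document_matrix(docs):
    """Return (sorted vocabulary, {term: [count in doc 0, count in doc 1, ...]})."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    return vocab, {t: [c[t] for c in counts] for t in vocab}


def pearson(x, y):
    """Pearson correlation of two equal-length count vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def associations(tdm, term, min_corr=0.5):
    """Terms whose per-document counts correlate with `term` at or above min_corr."""
    target = tdm[term]
    scores = {t: pearson(target, row) for t, row in tdm.items() if t != term}
    return {t: round(s, 2) for t, s in scores.items() if s >= min_corr}


vocab, tdm = term_document_matrix(tweets)
frequencies = {t: sum(row) for t, row in tdm.items()}  # input for a frequency plot
hoax_assocs = associations(tdm, "hoax")
```

In this sketch, `frequencies` holds the per-term totals that a histogram or word cloud would display, and `associations` returns only the terms that tend to appear in the same texts as the chosen word. The 0.5 threshold is an arbitrary illustrative value.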

Results

The ‘QAnon’ query returned results, visualized with a word cloud and histogram, that were consistent with existing Q belief systems as well as ongoing political and cultural narratives. Note in particular the correlation among ‘republican’, ‘evangelicals’ and ‘president.’ The frequency of these terms is also visualized in the histogram below.

Several of these terms, including ‘patriottakes’, suggest that followers are both self-actualized and ready to eliminate so-called threats to democracy, yet also resigned to wait for ‘gestures’ or to take cues from the ‘president’, which aligns with Q’s original directions.

This inner conflict is reflected in the sentiment scores for these tweets. Although one might predict that QAnon tweets would be ‘negative’, the third most significant sentiment is ‘trust.’ This stands in stark contrast to tweets mentioning the phrase ‘Save the Children’, which were largely positive, according to the NRC Emotion Lexicon score.

While the spikes in both the positive and trust categories are noteworthy, the low ‘fear’ score is inconsistent with the prevailing Q narrative that societal elites are preying upon children, an age-old fear-mongering tactic used to mobilize a susceptible population.
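
Sentiment scores of the kind discussed above come from counting how many words in the tweets carry each lexicon label. The NRC Emotion Lexicon assigns words to eight emotions (anger, anticipation, disgust, fear, joy, sadness, surprise, trust) and two polarities (negative, positive). The following is a minimal sketch of that tallying, assuming a tiny stand-in lexicon: the word-to-category mappings in `TOY_LEXICON` are invented for illustration and are not entries from the real lexicon, the sample texts are hypothetical, and `emotion_counts` is a name made up for this example.

```python
import re
from collections import Counter

# The ten NRC Emotion Lexicon categories: eight emotions plus two polarities.
CATEGORIES = ["anger", "anticipation", "disgust", "fear", "joy",
              "sadness", "surprise", "trust", "negative", "positive"]

# Hypothetical stand-in lexicon; these mappings are invented for illustration
# and are NOT copied from the real NRC Emotion Lexicon.
TOY_LEXICON = {
    "hoax": {"negative", "anger"},
    "expect": {"anticipation"},
    "protect": {"positive", "trust"},
    "children": {"positive", "joy"},
    "threat": {"negative", "fear"},
}


def emotion_counts(texts, lexicon):
    """Count, per category, how many word occurrences in `texts` carry that label."""
    totals = Counter({c: 0 for c in CATEGORIES})
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            for category in lexicon.get(word, ()):
                totals[category] += 1
    return totals


# Hypothetical placeholder texts, not tweets from the study.
sample = ["Protect the children", "Expect a hoax", "No threat to children"]
scores = emotion_counts(sample, TOY_LEXICON)

# Categories with at least one match, most frequent first.
ranked = [c for c, n in scores.most_common() if n > 0]
```

Ranking the resulting counts is what supports statements such as "the third most significant sentiment is trust". Words absent from the lexicon contribute nothing, so the totals depend heavily on the lexicon's coverage of the vocabulary in the tweets.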

For the ‘Deep State’ hashtag, the strongest correlation is between the words ‘hoax’ and ‘Russia’; both key words have featured prominently across social platforms since the 2016 election, as emphasized by the histogram below.

The presence of conspiratorial terms like ‘hoax’, ‘expect’ and ‘uncovered’ reflects the significant fear and anticipation sentiments in this data relative to the earlier samples.

Conclusion

As misinformation continues to pervade both virtual and physical conversation, it is essential for content creators, platform hosts and savvy Internet users to understand the connotations and significance of terms associated with false claims. Developers, data scientists and executives need to understand how certain hashtags that may appear innocuous, such as ‘Save the Children’, have been co-opted to spread ideologies that might run counter to the brand of individuals or businesses who may naively use such a hashtag or key word in digital communication. To slow (and perhaps halt entirely) the spread of misinformation, users bear a similar responsibility when choosing the hashtags that help platforms identify, aggregate and promote their posts. This project has demonstrated the critical insights that can be gleaned from even a moderate sample of tweets and the power of natural language processing to synthesize, interpret and, perhaps, preempt the spread of baseless content within digital communities.
