A Unified Perspective for Disinformation Detection and Truth Discovery in Social Sensing: A Survey
With the proliferation of social sensing, large amounts of observations are contributed by people or devices.
However, these observations contain disinformation. Disinformation can propagate across online social
networks at a relatively low cost but result in a series of major problems in our society. In this survey, we provide
a comprehensive overview of disinformation and truth discovery in social sensing under a unified perspective,
including basic concepts and a taxonomy of existing methodologies. Furthermore, we summarize the
mechanisms of disinformation from four different perspectives (i.e., text only, text with image/multi-modal,
text with propagation, and fusion models). In addition, we review existing solutions based on these requirements,
compare their pros and cons, and give a guide to their usage based on detailed lessons learned. To
facilitate future studies in this field, we summarize related publicly accessible real-world datasets and
open-source code. Last but most important, we emphasize potential future research topics and challenges in
this domain through a deep analysis of the most recent methods.
CCS Concepts: • Artificial intelligence → Natural language processing; • Information systems → Content
analysis and feature selection;
Additional Key Words and Phrases: Disinformation detection, truth discovery, social sensing, privacy-aware
1 INTRODUCTION
People or devices can contribute large amounts of observations to perceive the environment in social sensing, a new paradigm
[1, 127]. In this new paradigm, crowdsourcing can be adopted to harness the wisdom of crowds when
collecting real-time information [107, 165, 166]. Meanwhile, social sensing can be embodied in web claims,
news, Twitter, Weibo, and so on. Recently, many social sensing related applications have been presented,
including smartphone-based crowdsourcing
This research was supported by the National Natural Science Foundation of China under Grants 62162031, 61772246,
61728205, and 61876074, Joint Funding Project of Jiangxi Science and Technology Plan under Grant 20192ACBL21030.
Authors’ addresses: F. Xu and M. Wang, Jiangxi Normal University, Nanchang 330022, China; emails: {xufan, mwwang}@
jxnu.edu.cn; V. S. Sheng (corresponding author), Texas Tech University, Lubbock, TX 79409, USA; email: Victor.sheng@
ttu.edu.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee
provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and
the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored.
Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires
prior specific permission and/or a fee. Request permissions from permissions@acm.org.
© 2019 Association for Computing Machinery.
0360-0300/2021/11-ART6 $15.00
https://doi.org/10.1145/3477138
ACM Computing Surveys, Vol. 55, No. 1, Article 6. Publication date: November 2021.
tasks [11], disaster management [154], geo-tagging for smart cities [74], green global position
system (GPS) [23] (a participatory sensing navigation service for finding the most fuel-efficient
routes), personal health monitoring [13], and so on. Similar to mobile crowdsourcing [12, 138],
social sensing has become an effective paradigm to collect and process data. Meanwhile, since
sensing devices are mobile, they can provide wider coverage than traditional wireless sensor
networks (WSN) do. Besides these plentiful advantages, social sensing also has some disadvantages,
especially regarding disinformation propagation. That is, disinformation can propagate across online social
networks at a relatively low cost but result in a series of major problems in our society. In fact,
in the social sensing or crowdsourcing platform, individual users may have privacy issues when
sharing their answers with others. For example, individual users can report correlations between
search queries and web pages, but the answers may reveal their personal preferences, occupation,
age, state, party, and prior history. Similarly, patient responses to a drug are valuable for doctors
to discover the side effects of the drug, but they contain sensitive information that a patient may
not want to share. Therefore, privacy-preserving truth discovery is an important and challenging
task in social sensing (refer to Section 4 for more details). As we all know, disinformation is
harmful to our lives [53, 173]. Disinformation commonly spreads under the circumstances of breaking
news, often starting as a rumor. For instance, a rumor about the White House having been bombed
spooked stock markets in 2013.1 Similarly, disinformation related to the Hurricane Sandy
rumor forced the US Federal Emergency Management Agency to step in and
control the rumor.2
According to statistics, two-thirds of Americans obtain news on social media.3 Humans, however,
are susceptible to false information [124]. Unfortunately, Rubin et al. [100] revealed that
human accuracy in detecting false information ranges from 55% to 58%. Although some websites,
e.g., Snopes,4 Politifact,5 Factcheck,6 can debunk some types of disinformation, they heavily
depend on domain experts to conduct manual fact-checking. Potential issues of manual checking
include low coverage and high latency. Therefore, automatic disinformation detection (ADT)
is necessary. However, conflicts are a general problem within multi-source information about the
same object in a social sensing environment. Fortunately, truth discovery can tackle this challenge
by integrating multi-source noisy information, and it has achieved great success in data and knowledge
fusion [61].
Several surveys have addressed fake news detection [35, 170], false information
[48, 159], misinformation [20], and rumors [2, 7, 172]. To be more specific, Shu et al. [35]
summarized fake news from four perspectives: credibility of users, false knowledge, writing
style of the fake news, and propagation schema. Zhou and Zafarani [170] introduced fake news
characteristics from both psychology and social theory perspectives and summarized current
representative methods from a data mining viewpoint. Kumar and Shah [48] reported an overview
of existing research from five aspects (i.e., actors who spread disinformation, the rationale for deceiving
users, the impact of disinformation, the investigation of its characteristics, and related representative
algorithms). In comparison, Zannettou et al. [159] focused on providing a detailed typology
(i.e., perception, motivation, propagation, detection) of the diverse types of false information, i.e.,
rumors, fake news, hoaxes, clickbait, and various other shenanigans. Fernandez and Alani [20]
focused on misinformation detection and characterized existing methods from four aspects, i.e.,
1 https://www.bbc.com/news/world-us-canada-21508660.
2 https://twitter.com/fema/status/264800761119113216.
3 http://www.journalism.org/2017/09/07/news-use-across-social-media-platforms-2017.
4 https://www.snopes.com.
5 http://www.politifact.com.
6 https://www.factcheck.org.
misinformation or disinformation according to the user's intention. Compared with rumors, hoaxes
and fake news are definitely disinformation, and fake news is a specific type of hoax. Furthermore,
fake news aims at financial profit or political gain.
The above definitions are general concepts from authoritative dictionaries and are still very
hard to operationalize in research. Therefore, some references give more detailed definitions for these
terminologies, e.g., disinformation, misinformation, rumor, hoax, and fake news. Based on the
definitions in these references, as shown in Table 2, we can distinguish them along two dimensions,
i.e., authenticity and intention. For the former, the authenticity of disinformation, misinformation,
hoax, and fake news is false, whereas the authenticity of a rumor is unknown. For the latter, the
intention is generally bad for disinformation, hoax, and fake news, whereas it is unknown
for misinformation and rumor.
To be more specific, according to Reference [172], the relationships among these terminologies
can be illustrated as in Figure 1. According to their intention, rumors can be further classified into
disinformation and misinformation. Fake news is a typical type of hoax, and a hoax is a kind of
disinformation.
Ci = {ri, xi1, xi2, ..., xim}, where each xi* is a response to the root ri. Then, we formulate the
disinformation detection task as a supervised classification problem. It learns a classifier f: Ci → Yi,
where Yi takes one of two classes (rumor or non-rumor) for binary rumor detection, or one of
four classes (true rumor, false rumor, non-rumor, or unverified rumor) for multi-class rumor
detection.
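To make this formulation concrete, the sketch below featurizes a claim Ci from its root post and responses and learns a binary classifier f: Ci → {rumor, non-rumor}. It is a toy illustration only: the bag-of-words features, perceptron learner, and two-thread dataset are all hypothetical choices, not any surveyed system.

```python
from collections import Counter

def featurize(claim):
    """Bag-of-words over the root post r_i and its responses x_i1..x_im."""
    words = Counter()
    for post in [claim["root"], *claim["responses"]]:
        words.update(post.lower().split())
    return words

def train_perceptron(claims, labels, epochs=20):
    """Learn f: C_i -> Y_i in {1 (rumor), 0 (non-rumor)} with a perceptron."""
    w, b = Counter(), 0.0
    for _ in range(epochs):
        for claim, y in zip(claims, labels):
            feats = featurize(claim)
            score = sum(w[t] * c for t, c in feats.items()) + b
            pred = 1 if score > 0 else 0
            if pred != y:                      # mistake-driven update
                sign = 1 if y == 1 else -1
                for t, c in feats.items():
                    w[t] += sign * c
                b += sign
    return w, b

def predict(model, claim):
    w, b = model
    feats = featurize(claim)
    return 1 if sum(w[t] * c for t, c in feats.items()) + b > 0 else 0

# Toy claim threads (hypothetical data)
claims = [
    {"root": "BREAKING shocking unverified claim", "responses": ["is this real?", "fake!"]},
    {"root": "official statement released today", "responses": ["confirmed by agency"]},
]
labels = [1, 0]  # 1 = rumor, 0 = non-rumor
model = train_perceptron(claims, labels)
```

Real systems replace the bag-of-words featurizer and linear learner with the richer features and models surveyed below; the input/output contract, however, is the same.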
Truth discovery. According to Reference [61], for a collection of objects O, a series of sources
S can contribute different information. The goal is to predict the truth v*_o for each object o ∈ O
by resolving conflicting information from different sources {v^s_o}, s ∈ S. Meanwhile, we can estimate
source weights {w_s} (s ∈ S), which are in turn used to infer truths.
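This truth/weight loop can be sketched in a few lines. The following is a simplified CRH-style iteration for categorical claims (weighted voting for truths, log-ratio weights from error counts); the observation data and smoothing constant are illustrative, and none of the cited algorithms is exactly this.

```python
import math

def truth_discovery(observations, iters=10):
    """observations: {object: {source: value}}. Returns (truths, weights)."""
    sources = {s for obs in observations.values() for s in obs}
    weights = {s: 1.0 for s in sources}
    truths = {}
    for _ in range(iters):
        # Truth update: weighted vote per object
        for o, obs in observations.items():
            scores = {}
            for s, v in obs.items():
                scores[v] = scores.get(v, 0.0) + weights[s]
            truths[o] = max(scores, key=scores.get)
        # Weight update: sources that disagree with current truths lose weight
        errors = {s: 1e-6 for s in sources}   # smoothing avoids log(0)
        for o, obs in observations.items():
            for s, v in obs.items():
                if v != truths[o]:
                    errors[s] += 1.0
        total = sum(errors.values())
        weights = {s: -math.log(errors[s] / total) for s in sources}
    return truths, weights

# Hypothetical observations: sources A, B agree; C contradicts them
obs = {
    "event1": {"A": "true", "B": "true", "C": "false"},
    "event2": {"A": "yes",  "B": "yes",  "C": "no"},
    "event3": {"A": "x",    "B": "x",    "C": "x"},
}
truths, weights = truth_discovery(obs)
```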
Relationship between disinformation and truth discovery. According to the above prob-
lem statements, if we map an object o ∈ O in truth discovery to a claim Ci in disinformation
detection, then we can set up a unified perspective for them. From the problem statement perspec-
tive, it seems that truth discovery is suitable for structured data. In fact,
(1) According to References [30, 75, 76, 128–133, 164], it is hard to accurately ascertain both the
correctness of each unstructured tweet and the reliability of each Twitter user. Meanwhile,
they demonstrated that a maximum likelihood estimation (MLE)-based credibility analysis
tool using Expectation Maximization (EM) can be used to analyze the credibility of reported tweets.
Also, the latest literature [75] shows that currently widely used deep neural networks can handle
truth discovery in social sensing environments.
(2) Besides, References [4, 24, 29, 47, 83, 96, 114, 118, 119, 126, 150] adopted probabilistic
graphical models to jointly identify articles with high credibility, sources
with high reliability, and expert users who perform the role of "citizen journalists" in the
community.
(3) Again, References [31, 42, 75, 111, 112] adopted an iteration-based approach to conduct truth
discovery and fake news detection in social sensing.
3 REPRESENTATIVE APPROACHES
In this section, we first focus on representative algorithms for both disinformation detection and
truth discovery in social sensing under a unified perspective in Section 3.1. Then, we provide
a brief introduction to the latest privacy-aware truth discovery, which is a quite promising new
task in this research area, in Section 3.2.
Table 3. Continued

Reference | Method | Task | Platform

Graph-based models
Yang et al. [150] | Bayesian network + Gibbs sampling | Fake news | News sites
Xia et al. [143] | Time-smoothing-based model | Rumor | Weibo
Tschiatschek et al. [119] | Bayesian inference | Fake news | Facebook
Kumar et al. [47] | Bayesian inference | - | -
Hooi et al. [29] | Bayesian inference | Spam | Review
Beutel et al. [4] | Bayesian inference | - | -
Tacchini et al. [114] | Logistic regression; Boolean label crowdsourcing | Hoax | Facebook
Wang et al. [126] | Maximum likelihood estimation | Rumor | Weibo
Nguyen et al. [84] | Markov Random Field | Fake news | Weibo
Rayana et al. [96] | Markov Random Field | Spam | Review
Mukherjee and Weikum [83] | - | Truth discovery | News sites
Wang et al. [24] | Iterative computation | Spam | Review

Iteration-based models
Shu et al. [112] | Tri-relationship optimization | Fake news | News sites
Shu et al. [111] | Tri-relationship optimization | Fake news | News sites
Kim et al. [42] | Bayesian inference | Fake news | Twitter
Jin et al. [31] | Iterative deduction | Fake news | Twitter, Weibo

Modeling-based methods
Ye et al. [153] | Temporal information | Spam | Review
Xie et al. [144] | Burst detection in time-series (curve fitting) | Spam | Review

Deep neural network approaches

DNN-based models
Marshall et al. [76] | DNN for truth discovery | Truth discovery | Twitter

(Tree) RNN-based models
Popat et al. [88] | LSTM (Bi-LSTM) | Fake news | Twitter
Hanselowski et al. [27] | LSTM (Bi-LSTM) | Stance detection | News sites
Wu et al. [141] | LSTM (Bi-LSTM) | - | Twitter
Sarkar et al. [103] | LSTM (Bi-LSTM) | Fake news | News sites
Ruchansky et al. [101] | LSTM (Bi-LSTM) | - | Twitter, Weibo
Rashkin et al. [94] | LSTM (Bi-LSTM) | - | News sites
Yao et al. [152] | LSTM (Bi-LSTM) | Spam | Review
Wen et al. [140] | GRU | Rumor | Twitter
Ma et al. [73] | GRU | Rumor | Twitter
Rath et al. [95] | GRU | Rumor | Twitter
Ma et al. [71] | GRU | Rumor | Twitter

Other deep learning-based methods

CNN-based models
Karimi et al. [36] | Single CNN | Fake news | News sites
Qian et al. [93] | Single CNN | Fake news | Twitter, Weibo
Liu et al. [66] | CNN | Rumor | Twitter, Weibo
Wang et al. [134] | Hybrid CNN (user description with CNN) | Fake news | News sites
Xu et al. [145] | Hybrid CNN (topic with CNN) | Rumor | Twitter, Weibo

Other deep learning models
Ma et al. [70] | GAN-BOW; GAN-CNN; GAN-GRU | Rumor | Twitter
Przybyla [90] | BERT | Fake news | News sites
Tredici et al. [117] | BERT | Fake news | News sites
Yu et al. [156] | BERT | Rumor | Twitter
Ma et al. [69]; Yu et al. [156]; Wei et al. [139]; Wu et al. [142]; Cheng et al. [10]; Chen et al. [9]; Li et al. [59] | Multi-task learning. Rumor detection: non-rumor, true rumor, false rumor, unverified rumor. Stance detection: support, deny, question, comment | Rumor, Stance | Twitter

Hybrid approaches (traditional & deep neural network)
Shu et al. [110] | Support Vector Machines; Logistic Regression; Naive Bayes; CNN; LSTM | Fake news | News sites
Volkova et al. [122] | MaxEntropy; Random Forest; LSTM; CNN; features: content, style, syntax, connotations, etc. | - | -
Volkova et al. [123] | LSTM/CNN + linguistic cues | Spam | Twitter
Term | Description
P(Cj = 1) | the probability that the measured variable Cj is true
si | the probability that source Si reports an observation of a measured variable; it can be computed as the fraction of measured variables reported by Si over the whole set of variables
d | the probability that a randomly selected variable is true
ti | the source reliability, which is often not known a priori
ai | the probability that a source Si reports a measured variable to be true under the condition that it is true
bi | the probability that a source Si reports a measured variable to be true under the condition that it is false
as Twitter users, who tweet during observation. The measured variables are embodied by tweets
clustering, which represent observations about the same topic of events. The social sensing data
people observed can be compactly represented by a sensing matrix SC, where SiCj = 1 indicates
that Si reports Cj to be true, and SiCj = 0 otherwise. Given the observed data (i.e., a sensing matrix SC),
what is the likelihood that a specific source makes a correct observation, and what is the correct
state of each measured variable? The specific term definitions shown in Table 4 will
be used in the truth discovery problem formulation.
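Under these definitions, the maximum likelihood estimate can be obtained with a standard Expectation-Maximization loop. The sketch below is a simplified version of such EM-based credibility analysis: it assumes every source explicitly reports each variable as 0/1, ignoring the report-rate term si handled by the original algorithms, and the sensing matrix is hypothetical.

```python
def em_truth(sc, iters=30):
    """EM over a binary sensing matrix: sc[i][j] = 1 iff source i reports
    measured variable j to be true. Returns (z, a, b), where z[j] is the
    posterior P(C_j = 1 | SC) and a_i, b_i follow the Table 4 definitions."""
    n_src, n_var = len(sc), len(sc[0])
    a = [0.7] * n_src   # initial guess: P(report true | variable true)
    b = [0.3] * n_src   # initial guess: P(report true | variable false)
    d = 0.5             # prior probability that a variable is true
    z = [0.5] * n_var
    for _ in range(iters):
        # E-step: posterior truth probability of each measured variable
        for j in range(n_var):
            p1, p0 = d, 1.0 - d
            for i in range(n_src):
                if sc[i][j] == 1:
                    p1 *= a[i]; p0 *= b[i]
                else:
                    p1 *= 1.0 - a[i]; p0 *= 1.0 - b[i]
            z[j] = p1 / (p1 + p0)
        # M-step: re-estimate source parameters from the soft labels
        t = sum(z); f = n_var - t
        for i in range(n_src):
            a[i] = sum(z[j] for j in range(n_var) if sc[i][j]) / t
            b[i] = sum(1.0 - z[j] for j in range(n_var) if sc[i][j]) / f
        d = t / n_var
    return z, a, b

# Hypothetical matrix: sources 0 and 1 are consistent, source 2 is noisy
sc = [[1, 1, 0, 0],
      [1, 1, 0, 0],
      [0, 1, 1, 0]]
z, a, b = em_truth(sc)
```

After convergence, variables endorsed by the mutually consistent sources receive high posterior truth probability, and those sources receive a higher estimated ai than the noisy one.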
(4) Graph-based models. The key idea of graph-based models is to create latent variables
and adopt hyperparameters to incorporate prior knowledge into the graphical model. Generally, the prior
knowledge can be embodied in the distributions of truths and source weights. We here only list
a few representative works, such as Mukherjee and Weikum [83], Yang et al. [150], Tacchini
et al. [114], Hooi et al. [29], Kumar et al. [47], Beutel et al. [4], Tschiatschek et al. [119], Wang et al.
[24], Rayana et al. [96], Tripathy et al. [118], and Wang et al. [126].
For example, Yang et al. [150] treat news truths and user credibility as two latent random
variables and identify users' opinions based on the news' authenticity. To solve the problem,
they proposed a Gibbs sampling-based method to infer the news truths and the credibility of
users. Similarly, Nguyen et al. [84] treated false information detection as a reasoning problem
in a Markov random field, which was solved by using an iterative averaging algorithm. Again, Xia
et al. [143] considered the event states, divided the states into many sub-events, and integrated
the current sub-event with the previous sub-event. Then, the combined sub-event was fed into a
time-smoothing-based model to measure the performance of early rumor detection.
More specifically, a Beta distribution with hyperparameter γ = (γ1, γ2) is adopted to generate
the prior truth probability of each news item i. For each user j, its credibility is modeled with φj1 and φj0, which indicate
its true positive rate and its false positive rate, respectively. Based upon these, four variables (i.e.,
φk0,0, φk0,1, φk1,0, and φk1,1) are adopted to model the credibility of each unverified user k ∈ K.
(5) Iteration-based models. Iteration-based models adopt the assumption that a fact will
have a relatively high confidence when it is contributed by reliable sources, and a source will be
reliable if it provides many highly trustworthy facts. Based upon these two heuristics, fact confidence
can be inferred from source trustworthiness and vice versa. Several representative methods
repeatedly update them until reaching a stable state [31, 42, 111, 112].
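These two heuristics can be sketched as a simple mutually reinforcing iteration. The toy Sums-style loop below (with hypothetical sources and facts; none of the cited systems is exactly this) alternates the two updates with max-normalization until scores stabilize:

```python
def sums_iteration(claims_by_source, iters=20):
    """claims_by_source: {source: set(facts)}. Mutually reinforce
    fact confidence and source trustworthiness until stable."""
    facts = set().union(*claims_by_source.values())
    trust = {s: 1.0 for s in claims_by_source}
    conf = {f: 1.0 for f in facts}
    for _ in range(iters):
        # A fact is confident if asserted by trusted sources
        conf = {f: sum(trust[s] for s, fs in claims_by_source.items() if f in fs)
                for f in facts}
        m = max(conf.values()); conf = {f: c / m for f, c in conf.items()}
        # A source is trusted if it asserts confident facts
        trust = {s: sum(conf[f] for f in fs) / len(fs)
                 for s, fs in claims_by_source.items()}
        m = max(trust.values()); trust = {s: t / m for s, t in trust.items()}
    return conf, trust

# Hypothetical web sources and their asserted facts
data = {
    "site_a": {"f1", "f2", "f3"},
    "site_b": {"f1", "f2"},
    "site_c": {"f4"},           # lone claim from an isolated source
}
conf, trust = sums_iteration(data)
```

Facts corroborated by multiple sources (f1, f2) end up with higher confidence than the isolated claim f4, and the sources asserting them end up with higher trust.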
The latest work [112] claimed that the social context generates a tri-relationship (i.e., news
pieces, publishers, and users) that is effective for debunking fake news. They proposed a tri-relationship
embedding framework, TriFN, as shown in Figure 3, to model the interactions between
publishers/users and news simultaneously.7
(6) Modeling-based methods. Modeling-based methods investigate the propagation
patterns of true and fake news. For example, Xie et al. [144] adopted a curve fitting
technique to detect spam reviews. They employed burst detection to create Correlated Abnormal
Patterns Detection in Multidimensional Time Series (CAPT-MDTS). Similarly,
Ye et al. [153] investigated the streaming data situation and bucketed the data into window sequences
with temporal information. They proposed an algorithm to identify anomalies in the time-series
signal. In general, we provide a brief summary of each representative traditional approach,
including its general methodology, the learning algorithm utilized, and its corresponding social
platform, in Table 3.
3.1.2 Neural Network Approaches for Disinformation Detection. In this section, we further intro-
duce the representative deep learning-based models for disinformation detection from infrastruc-
ture (i.e., DNN, CNN, RNN, LSTM, BERT, etc.) and mechanism perspectives (text with image/multi-
modal, text with propagation, and fusion models).
(1) From infrastructure perspective
For disinformation detection and truth discovery in social sensing, since most posts and web
claims are written in natural language, how to represent a word sequence as a distributed
representation is an interesting research direction. Deep neural networks can automatically
extract high-level abstractive features from web claims. Therefore, a Deep Neural Network (DNN)
[75, 140] is a natural way to handle disinformation detection and truth discovery.
Because disinformation mostly appears in the form of text, the temporal sequence of text can be
successfully captured by Recurrent Neural Networks (RNN) [27, 66, 71, 73, 88, 94, 95, 101, 103,
134, 141, 152].
By contrast, some researchers [25, 36, 66, 93, 103, 122, 134, 145] adopted Convolutional Neural
Networks (CNN) to capture local features effectively when detecting disinformation. Recently,
Xu et al. [145] focused on topic-driven rumor detection only on source microblogs. They annotated
16 fine-grained topics (i.e., recreational sports, social politics, diffusion forwarding, etc.)
on the current popular Twitter and Weibo datasets and conducted topic distribution classification
7 To improve the readability when printed in black-and-white, we have changed the previous colorful picture to black-and-white.
on source microblogs, and then they successfully incorporated the predicted topic vector of the
source microblogs into rumor detection.
Recently, Przybyla [90] and Yu et al. [156] adopted a pre-trained BERT (Bidirectional
Encoder Representations from Transformers) model to conduct fake news or rumor detection.
Generally, the pre-trained BERT model offers good computational efficiency and representation
ability.
Most of these models use cross entropy as the loss function. They can be implemented using
popular deep learning tools such as TensorFlow, Keras, Theano, and PyTorch.
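For reference, the cross-entropy loss minimized by most of these classifiers can be written out directly: for a one-hot label y and a predicted distribution ŷ, L = −Σk yk log ŷk. The sketch below computes it for a softmax output over three hypothetical rumor classes in pure Python, standing in for the equivalent framework loss calls:

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    mx = max(logits)
    exps = [math.exp(v - mx) for v in logits]  # shift for numerical stability
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(y_true, probs):
    """y_true: one-hot list; probs: predicted class distribution."""
    return -sum(y * math.log(p) for y, p in zip(y_true, probs) if y > 0)

# Hypothetical logits for classes (rumor, non-rumor, unverified)
probs = softmax([2.0, 0.5, -1.0])
loss = cross_entropy([1, 0, 0], probs)   # shrinks as probs[0] approaches 1
```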
Recently, Ma et al. [69] adopted a multi-task learning framework to conduct rumor detection
and stance detection simultaneously. They focused on fine-grained rumor detection and stance
detection. The categories of their rumor detection include false rumor, non-rumor, true rumor, and
unverified rumor; and the categories of their stance detection include support, deny, question, and
comment. Because the corresponding stance toward a web claim is a good indicator for rumor detection,
they obtained good detection performance on both tasks. Similarly, Yu et al. [156] adopted a
BERT model to train authenticity and stance jointly, where the hidden layer representation of each
sub-thread is connected in a transformer to capture the global interaction between posts. Furthermore,
Wei et al. [139] constructed a time series and an interaction diagram of the posts, represented
by a GRU. Then, a GCN (Graph Convolutional Network) was adopted to train each single task,
and the two vectors for each task were spliced in time order to train the authenticity of rumors. In
Reference [142], a transformer was adopted to extract private and shared parameters, and a multi-attention
mechanism was also integrated. They selected useful shared features through a gate
in the GRU. Besides, Ma et al. [70] proposed a GAN (Generative Adversarial Networks)-based
model, as shown in Equation (4). Compared with traditional data-driven approaches, their model
can capture stronger non-trivial patterns via the GAN. Due to space limitations, we only present the
most representative deep learning methods in Table 3, including their general methodology, the
learning algorithm utilized, and their corresponding social platforms.
max_{ΘD} min_{ΘG} α(−||ȳ − ŷ||₂² + λ||ΘD||₂²) + (1 − α)(1/T) Σ_{t=1}^{T} ||x_t − x̂_t||₂²,   (4)

where ȳ and ŷ are, respectively, the ground-truth and predicted class probability distributions, ΘD
is the discriminator parameter, ΘG is the generator parameter, λ is the tradeoff coefficient, α is the
coefficient variable, x_t and x̂_t are the t-th units in the original and reconstructed sequences, respectively,
T is the length of a sequence, and ||·||₂ represents the L2-norm of a given vector.
There are some hybrid models that integrate neural network models with traditional feature-based
models to conduct disinformation detection [110, 122, 123]. Again, we provide a brief
summary of three representatives in Table 3.
(2) From mechanism perspective
Text with image/multi-modal: The models shown in Table 3 above are text-driven methods.
According to Reference [32], more than 51.60% of microblogs contain pictures, and on average, the
forwarding amount of microblogs with pictures is 11 times that of microblogs without pictures.
The massive amount of disinformation on the Internet attracts users' attention by relying on a
large number of false pictures, which shows that pictures play an important role in disinformation
detection. Impressively, Reference [32] was the first attempt to systematically explore
image features on the news verification task. They adopted VGG (Visual Geometry Group)-19
to extract semantic features from images (i.e., visual clarity score, visual coherence score, visual
similarity distribution histogram, visual diversity score, and visual clustering score) and employed
an LSTM along with attention to encode text. Furthermore, Qi et al. [92] adopted a frequency-domain
subnetwork to capture the physical features of a fake news image and employed a pixel-domain
subnetwork to capture its semantic features. Then, they fused the above
two subnetworks dynamically along with semantic information to finally perform fake news
detection. In Reference [40], they employed an LSTM to concatenate the learned text features and
image features on the encoder side, and reconstructed the original image and text from the hidden
layer vector on the decoder side. Similar work can be found in References [113, 115, 121].
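The simplest recipe underlying several of these multi-modal systems is late fusion: encode each modality, concatenate the vectors, and classify. The sketch below stubs the encoders with fixed toy vectors (the cited work uses LSTM text encodings and VGG-19 image features instead; all numbers and weights here are hypothetical):

```python
def fuse(text_feats, image_feats):
    """Late fusion: concatenate modality vectors into one representation."""
    return list(text_feats) + list(image_feats)

def linear_score(weights, feats, bias=0.0):
    """Stand-in for the classification head on top of the fused vector."""
    return sum(w * f for w, f in zip(weights, feats)) + bias

# Stubbed encoder outputs (in practice: LSTM text encoding, VGG image features)
text_feats = [0.9, 0.1]       # e.g., sensational-wording signals
image_feats = [0.8]           # e.g., low visual-clarity score
fused = fuse(text_feats, image_feats)
weights = [1.0, -0.5, 1.2]    # hypothetical trained classifier weights
score = linear_score(weights, fused)   # a positive score suggests fake
```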
Text with propagation: Generally, propagation is another key factor to conduct disinformation
detection, because the interaction between a source microblog and its subsequent reactions is a
good indicator to judge disinformation. In Reference [46], they proposed a multi-task learning
method to conduct rumor and stance detection simultaneously. In their work, they built a binary-
style tree to construct communication relationships between a source post and its subsequent
reactions by using a tree LSTM model. Again, Bian et al. [6] built two propagation graphs (i.e.,
top-down and bottom-up) by using the GCN framework to conduct rumor detection. Furthermore,
Ren et al. [97] built a heterogeneous graph along with a hierarchical attention mechanism for
representation learning of fake news. They also adopted a GAN method to augment training data.
Similar work can be found in References [41, 57, 68, 151].
Fusion models: Obviously, how to fuse text, propagation, and prior user information is a
big step in disinformation detection. In Reference [157], the local semantic and global communication
information were jointly encoded, and the text representation along with the user information
encoding was learned by using multi-head attention. They finally fused these representations to
conduct rumor detection. In contrast, Lu et al. [67] simulated the potential user interaction by
using a graph network structure, and they also constructed a collaborative attention mechanism
to build an explainable model for fake news detection. Similar work can be found in References
[99, 158].
where x*_m is the estimated ground truth, x^k_m is the observation value of user k, and K is the
total number of users.
In social sensing environments, however, most information is generated by crowdsourcing,
which raises privacy issues. For example, are the clouds trustworthy?
Will my data be disclosed to other participants? Can others know my reliability degree? Is the
communication channel secure? On the one hand, we may generate sensitive personal information
on the web, such as the health data of patients, locations of participants, answers to special
questions, and so on. On the other hand, the reliability degree of a user is also sensitive. For example,
inferring personal information and maliciously manipulating data prices also exist in the real
world. The key idea of privacy-aware truth discovery is to protect the process of both weight
update and truth estimation [62, 148]. We briefly summarize representative works in this
area below.
(1) Homomorphic cryptosystem-based models. Miao et al. [77] proposed a MapReduce-
driven parallel threshold Paillier cryptosystem-based model for privacy-aware truth discovery,
which consists of three components (i.e., secure sum protocol, weight update, and truth
estimation).
Due to privacy concerns, the plaintext should be encrypted before being sent to a cloud server.
Therefore, they employed the threshold Paillier cryptosystem [14] to design a secure sum protocol.
More specifically, the sum protocol calculates the summation over each user's data in encrypted
format. Although the server knows the summation value, it still cannot infer the individual
data of each user.
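The additive property that such a secure sum protocol relies on is that Paillier ciphertexts multiply into an encryption of the plaintext sum: E(m1) · E(m2) mod n² = E(m1 + m2). The toy sketch below illustrates this with a single-key Paillier instance and deliberately tiny primes; the surveyed work uses a threshold variant where decryption is split across parties, which is omitted here, and nothing about this sketch is secure for real use.

```python
import math, random

def keygen(p=1009, q=1013):
    """Toy Paillier keys from small fixed primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because we pick g = n + 1
    return (n, n + 1), (lam, mu, n)

def encrypt(pk, m):
    """E(m) = g^m * r^n mod n^2 with a fresh random r coprime to n."""
    n, g = pk
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(sk, c):
    """m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) / n."""
    lam, mu, n = sk
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n

pk, sk = keygen()
readings = [12, 30, 7]                         # each user's private value
n_sq = pk[0] ** 2
agg = 1
for c in (encrypt(pk, m) for m in readings):   # server multiplies ciphertexts
    agg = agg * c % n_sq
total = decrypt(sk, agg)                       # equals sum(readings) = 49
```

The server only ever sees the ciphertexts and their product; a key holder decrypting the product recovers the sum without learning any individual reading.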
The secure weight update component requires several steps. That is, the server
sends the ground truth value to each user; each user encrypts their data and sends it to the cloud
server; the cloud server employs the proposed secure sum protocol to calculate the sum of the user
data and sends the average value to each user; the remaining process is similar to the aforementioned steps.
Meanwhile, the secure truth estimation also requires several steps. That is, the server sends the encrypted
weight to each user, and each user calculates the ciphertexts accordingly, followed by the server
updating the truth.
Similarly, Xu et al. [146, 147] adopted an additive homomorphic cryptosystem to design privacy-aware
truth discovery. They proposed a super-increasing sequence to model the input sequence.
Their secure weight update and truth estimation are similar to Reference [77]. To improve
efficiency, Miao et al. [78], Zhang et al. [162, 163], and Zheng et al. [168] designed two-cloud-server-based
models. Their main algorithms, however, are similar to the above steps. They adopted
two clouds to conduct the interaction process with participating workers. Besides, Zhang et al.
[162, 163] also designed a fault tolerance mechanism in their algorithm.
(2) Yao’s garbled circuit-based models. Tang et al. [116] and Zheng et al. [169] adopted Yao’s
garbled circuit to conduct encryption in truth discovery. They integrated many gates (i.e., squaring
gate, the sub gate, etc.) into the Yao’s garbled circuit. More specifically, the framework in Reference
[116] consist of four steps. That is, each provider will generate random mask to hide their initial
data; each provider will send the random masks to security service provider (SSP); the SSP will
send a collection of designed garbled circuits to evaluator; the evaluator will estimate truth based
on the concealed data.
The key idea of the garbled circuit-based model in Reference [116] is that the garbled circuit Cπ is self-contained, thereby exposing no intermediate values. To approximate the logarithm within a circuit, they developed a Boolean circuit, as shown in Figure 4, that selects the result of the correct linear function.8 If the input lies between τi and τi+1 (the target knots), then the corresponding AND gate outputs 1, otherwise 0, and this output determines which segment contributes to the final result.
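In plaintext, the selection logic that this Boolean circuit implements can be sketched as follows; the knot positions are our own illustrative choices, not those of Reference [116].

```python
import math

# Knots of the piecewise-linear approximation (hypothetical placement).
knots = [1.0, 2.0, 4.0, 8.0, 16.0]

# Precompute one linear segment (slope, intercept) per interval [tau_i, tau_{i+1}).
segments = []
for t0, t1 in zip(knots, knots[1:]):
    slope = (math.log(t1) - math.log(t0)) / (t1 - t0)
    intercept = math.log(t0) - slope * t0
    segments.append((slope, intercept))

def approx_log(x):
    """Mimic the circuit: a comparator pair per interval drives an AND gate;
    exactly one interval outputs 1 and selects its linear segment."""
    result = 0.0
    for (t0, t1), (a, b) in zip(zip(knots, knots[1:]), segments):
        selected = int(t0 <= x) & int(x < t1)   # the AND-gate output (0 or 1)
        result += selected * (a * x + b)        # only the matching segment contributes
    return result
```

At the knots the approximation is exact; between knots the error depends on how densely the knots are placed.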
(3) Diffie-Hellman key agreement based & streaming models. Liu et al. [65] designed a
real-time privacy-preserving truth discovery (RTPT) based on Diffie-Hellman key agreement
to encrypt the information. Their secure summation aggregation includes four components (i.e.,
8 To improve the resolution of the original figure, we have redrawn it with the permission of the author of that paper.
ACM Computing Surveys, Vol. 55, No. 1, Article 6. Publication date: November 2021.
A Unified Perspective for Disinformation Detection and Truth Discovery in Social Sensing 6:17
Fig. 4. The approximated logarithm circuit; Comp denotes the binary comparator; x stands for the input of the circuit.
setup process, key sharing, masked collection, and unmasking). In the setup process, the client and server exchange public parameters. The workflow of the algorithm consists of three phases, as shown in Figure 5: initialization, secure truth estimation (steps 1–3 in Figure 5), and secure weight update (steps 4–7 in Figure 5).
Specifically, in the initialization phase, each worker executes the common setup process and initializes its weight w_i^(0) = 1 and loss d_i^(0) = 0. The secure truth estimation takes three steps: each sensing worker masks its weighted data and weight and sends them to the cloud server; the server updates the truths; and the cloud server sends the estimated truths back to each sensing worker and end-user, respectively. The secure weight update takes four steps: each sensing worker updates its loss; each worker sends the masked loss to the cloud server; the cloud server computes the sum of the losses and returns it to the sensing workers; and each sensing worker updates its weight using the returned sum.
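The masked-summation idea behind such Diffie-Hellman-based schemes can be sketched in a few lines: each pair of workers shares a seed (derived via the key agreement, abstracted away here), one adds a pseudorandom mask and the other subtracts it, so the masks cancel in the server's sum. The data and mask range below are made up.

```python
import random

def pairwise_masks(n_workers, seed=0):
    """Each pair (i, j) shares a seed; worker i adds +m, worker j adds -m,
    so all masks cancel when the server sums the uploads."""
    rng = random.Random(seed)          # stand-in for per-pair shared PRG keys
    masks = [0.0] * n_workers
    for i in range(n_workers):
        for j in range(i + 1, n_workers):
            m = rng.uniform(-1e6, 1e6)  # stand-in for a PRG output on the shared key
            masks[i] += m
            masks[j] -= m
    return masks

values = [4.2, 3.9, 4.1]               # workers' private weighted observations
masks = pairwise_masks(len(values))
uploads = [v + m for v, m in zip(values, masks)]  # what the server actually sees
server_sum = sum(uploads)              # masks cancel, revealing only the sum
```

Each individual upload looks uniformly random to the server, yet `server_sum` equals the true aggregate up to floating-point error.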
6:18 F. Xu et al.
Table 5. Publicly Accessible Real-world Datasets

| Dataset | Source | Label | URL | Statistics | Used in |
| --- | --- | --- | --- | --- | --- |
| Rumor | Ma et al. [71] | B. | http://alt.qcri.org/~wgao/data/rumdect.zip | Twitter: #users=491,229; #posts=1,101,985; #events=992. Weibo: #users=2,746,818; #posts=3,805,656; #events=4664 | Ma et al. [71], Ma et al. [72], Ruchansky et al. [101], Liu et al. [66], Ma et al. [73], Qian et al. [93], Ma et al. [69], Ma et al. [70], Chen et al. [9], Xia et al. [143], Kochkina et al. [43], Qi et al. [92], Khattar et al. [40], Bian et al. [6], Khoo et al. [41], Lu et al. [67], Yuan et al. [157], Yuan et al. [158], Xu et al. [145] |
| Rumor | Zubiaga et al. [174] | B. | https://figshare.com/articles/PHEME_rumour_scheme_dataset_journalism_use_case/2068650 | #tweets=4,842; #conversations=330 | Kwon et al. [72], Ma et al. [70], Chen et al. [10], Yu et al. [156], Wei et al. [139], Cheng et al. [10], Wu et al. [142], Li et al. [59], Nguyen et al. [84], Kochkina et al. [43], Kumar et al. [46], Li et al. [57], Ma et al. [68], Khoo et al. [41], Xu et al. [145] |
| Rumor (CCRV) | Wen et al. [140] | B. | https://github.com/WeimingWen/CCRV | #Real=6,225; #Fake=9404 | |
| Rumor | Kwon et al. [50] | B. | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi%3A10.7910%2FDVN%2FBFGAVZ | #true_rumours=51; #false_rumours=60 | |
| Rumor | Ma et al. [72] | B. | https://www.dropbox.com/s/7ewzdrbelpmrnxu/rumdetect2017.zip?dl=0 | Twitter15: #tweets=1490; #users=276,663. Twitter16: #tweets=818; #users=173,487 | |
| Rumor | Wu et al. [38] | B. | http://adapt.seiee.sjtu.edu.cn/~kzhu/rumor/ | #false rumors=2601; #normal messages=2536; #users=4 million | |
| Rumor | Kwon et al. [51] | B. | http://mia.kaist.ac.kr/publications/rumor | #events=104 | |
| Rumor | Fake news challenge 2017 (50 of the 80 participants made submissions) | B. | http://www.fakenewschallenge.org/ | #rows=49972; #unrelated (%)=0.73131; #discuss (%)=0.17828; #agree (%)=0.0736012; #disagree (%)=0.0168094 | |
| Fake news | Wang et al. [134] | B. | https://www.cs.ucsb.edu/~william/data/liar_dataset.zip | #short statements=12.8K | Wang et al. [134], Sarkar et al. [103], Karimi et al. [36], Yang et al. [150], Xu et al. [145] |
| Fake news (CREDBANK) | Mitra and Gilbert [79] | B. | http://compsocial.github.io/CREDBANK-data/ | #tweets=169 million | Mitra et al. [80], Mitra and Gilbert [79] |
| Fake news | Mukherjee and Weikum [83] | B. | http://www.mpi-inf.mpg.de/impact/credibilityanalysis/ | #NewsTrust stories=82K; #Articles=47.6K; #Sources=5.7K | Mukherjee and Weikum [83], Popat et al. [88] |
| Fake news (FakeNewsNet) | Shu et al. [110] | B. | https://github.com/KaiDMML/FakeNewsNet | PolitiFact: #news articles_fake=432; #news articles_real=624; #users_fake=95,553; #users_real=249,887. GossipCop: #news articles_fake=5323; #news articles_real=16817; #users_fake=265,155; #users_real=80,137 | |
| Fake news | Rosas et al. [98] | B. | http://lit.eecs.umich.edu/downloads.html | FakeNewsAMT: #Fake=240; #Legitimate. Celebrity: #Fake=250; #Legitimate=250 | |
| Fake news | Potthast et al. [89] | B. | https://doi.org/10.5281/zenodo.1239675; https://github.com/BuzzFeedNews/2016-10-facebook-fact-check/tree/master/data | #articles=1627; #Mainstream=826; #Left-wing=356; #Right-wing=545 | |
| Fake news | Popat et al. [88] | B. | https://www.mpi-inf.mpg.de/dl-cred-analysis/ | #Total claims (Snopes)=4341; #Total claims (PolitiFact)=3568; #Total claims (NewsTrust)=5344; #Total claims (SemEval)=272 | |
| Fake news | Hanselowski et al. [27] | B. | https://github.com/UKPLab/coling2018_fake-news-challenge | #topics=300 | |
| Fake news | Volkova et al. [123] | B., M. | http://www.cs.jhu.edu/~svitlana/ | #suspicious_news=174; #verified_news=252; #trust_news=252 | |
| Fake news | BS-detector 2017 | B. | https://github.com/selfagency/bs-detector | #websites=244; #posts=12,999 | |
| Web claim | Ferreira and Vlachos [21] | B. | https://github.com/willferreira/mscproject | #claims=300; #news_article=2,595 | |
| Web claim | Zlatkova et al. [171] | B. | https://gitlab.com/didizlatkova/fake-image-detection | #snopes (reuters)=20000 | |
| Review | Ott et al. [85], Ott et al. [86] | B. | https://myleott.com/op-spam | #reviews=1600 | |
| Truth discovery | Wang et al. [128] | B. | http://apollo.cse.nd.edu/datasets.html | #tweets>9.2 million (note: no ground truths on Twitter are available) | Wang et al. [128], [129], [130], [131], [133], Huang et al. [30], Marshall et al. [76] |
| Truth discovery for data fusion | Li et al. [60] | B. | http://lunadong.com/fusionDataSets.htm | #stocks=1000, #sources=50; #flights=1200, #sources=38; #books=1263, #sources=894 | |

(B. indicates binary; M. denotes multiclass.)
Table 6. Continued

| Category | Name | Source | Label | URL |
| --- | --- | --- | --- | --- |
| Hoax detection | BLC_HOAX | Tacchini et al. [114] | B. | https://github.com/gabll/some-like-it-hoax |
| Hoax detection | HOAXY | Shao et al. [104, 105] | B. | https://hoaxy.iuni.iu.edu/; http://botometer.iuni.iu.edu |
| Opinion spam detection | OSD | Rayana et al. [96] | B. | https://www.dropbox.com/sh/iqcuj0363zcj3go/AAAvbZVR_PSNyJX8AXUXpBqea?dl=0 |
| Incongruity detection | Incongruity | Yoon et al. [155] | B. | https://github.com/david-yoon/detecting-incongruity |
| Truth discovery | CRH | Li et al. [58] | B., M. | https://cse.buffalo.edu/~jing/software.htm |
| Truth discovery | SQUARE | Sheshadri et al. [108] | B., M. | http://ir.ischool.utexas.edu/square/index.html |
| Truth discovery | CEKA | Zhang et al. [165] | B., M. | http://ceka.sourceforge.net/ |
| Truth discovery | DAFNA-EA | Waguih et al. [125] | B., M. | https://github.com/daqcri/DAFNA-EA |

(B. indicates binary; M. denotes multiclass.)
discovery. What is more, we summarize the mechanisms of disinformation from four perspectives: text only, text with image/multi-modal, text with propagation, and fusion models. Furthermore, we review existing solutions based on these requirements and compare their pros and cons. Meanwhile, to facilitate future studies in this field, we provide collections of openly accessible real-world datasets and open-source codes. To offer practical guidance on their usage, we summarize the detailed lessons learned below.
although some models investigate the fusion of text with images, or of text with propagation, how to perform a deep fusion of text, propagation, and pictures is still an open problem. The currently popular GCN framework could be a promising infrastructure to fuse them altogether. Furthermore, current models for multi-modal disinformation detection usually neglect to judge the semantic consistency between a source post and its corresponding picture. They only extract semantic features for a picture, typically using a VGG network, and then concatenate the extracted VGG features with the text-driven encoding for every picture-text pair. The lack of picture-text consistency judgment inevitably introduces noise into these models. In fact, judging the semantic consistency between a source post and its corresponding picture is a key step in disinformation detection.
6.1.2 Best Performance. In terms of performance, Ma et al. [68] obtained the best macro F1-scores of 78.70% and 71.00% on the Twitter and Pheme datasets, respectively, for 4-way classification (i.e., non-rumor, false rumor, true rumor, and unverified rumor), using a tree transformer-based model. Furthermore, Yang et al. [150] achieved an accuracy of 75.90% and F1-scores of 77.40% and 74.10% for the true and fake classes, respectively, on the Liar dataset, using a Bayesian network-based model. In addition, Ma et al. [70] obtained an accuracy of 86.30% and F1-scores of 86.60% and 85.80% for the rumor and non-rumor classes, respectively, on the Twitter dataset. They also achieved an accuracy of 78.10% and F1-scores of 78.40% and 77.80% for the rumor and non-rumor classes, respectively, on the Pheme dataset, using a GAN-based model. Besides, Nguyen et al. [84] achieved an accuracy of 96.20% and a macro F1-score of 97.00% for binary rumor vs. non-rumor classification on the Weibo dataset, using a deep Markov random fields-based model.
Currently, the Twitter, Weibo, Pheme, and Liar datasets are the most popular datasets in disinformation detection. Although the performance of these models on disinformation detection or truth discovery is promising, their experiments were conducted on their own datasets, some of which are not publicly available. Furthermore, there has been little effort to compare existing models on common benchmark datasets.
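For reference, the macro F1-score reported above is the unweighted mean of the per-class F1-scores; a minimal computation (the labels below are invented for illustration) is:

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1-scores, as used in 4-way rumor evaluation."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Example with hypothetical 4-way rumor labels:
gold = ["non-rumor", "false", "true", "unverified"]
pred = ["non-rumor", "false", "true", "false"]
score = macro_f1(gold, pred)
```

Unlike accuracy, this metric weights all four classes equally, which matters on the class-imbalanced rumor datasets above.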
discovery as independent tasks. However, as we mentioned before, the two tasks can be unified, specifically through EM-based, graph-based, iteration-based, and deep neural network-based models. Since the two tasks share some common characteristics, multi-task learning is a feasible way to integrate them simultaneously.
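For intuition, the iteration-based backbone that such a unified model could share is the classic alternating loop of truth discovery; the following is a toy CRH-style sketch for continuous claims with squared loss, where the observations and iteration count are made up for illustration:

```python
import math

# observations[s][o] = value claimed by source s for object o (toy data)
observations = [
    [10.0, 4.9],    # reliable source
    [10.1, 5.0],    # reliable source
    [17.0, 9.0],    # unreliable source
]
n_sources, n_objects = len(observations), len(observations[0])
weights = [1.0] * n_sources

for _ in range(10):
    # Truth estimation: weighted average of claims per object.
    truths = [
        sum(weights[s] * observations[s][o] for s in range(n_sources)) / sum(weights)
        for o in range(n_objects)
    ]
    # Weight update: sources with smaller loss get larger weights
    # (CRH uses the negative log of the normalized loss).
    losses = [
        sum((observations[s][o] - truths[o]) ** 2 for o in range(n_objects))
        for s in range(n_sources)
    ]
    total = sum(losses)
    weights = [-math.log(l / total + 1e-12) for l in losses]
```

After a few iterations the estimated truths are pulled toward the two consistent sources and the unreliable source's weight collapses, which is exactly the coupling a unified disinformation/truth-discovery model would exploit.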
(4) Privacy-aware disinformation detection. Currently, all disinformation detection algorithms are privacy-ignorant. A user’s posts, claims, and answers, however, may contain substantial private information. For example, the economic status of a user can be inferred from the user’s trust relations and answers. It is risky to expose such information to an honest-but-curious server, and dangerous to share it with servers controlled by malicious users or running malicious programs. Therefore, it is urgent to design privacy-aware disinformation detection algorithms, although doing so is full of challenges.
(5) Language-related disinformation detection. Most existing approaches focus on disinformation detection in English, whereas work on Chinese and other languages remains relatively scarce. In fact, Chinese differs considerably from English in its cohesion and coherence characteristics within and across documents. For example, anaphora and ellipsis phenomena are more common in Chinese than in English. Therefore, language-specific disinformation detection solutions are expected in the future.
(6) Semantic consistency judgment for picture-text pairs. Current representative multi-modal approaches to disinformation detection neglect the judgment of semantic consistency between a source post and its corresponding picture; they generally extract visual features for all pictures. This inevitably introduces noise and reduces detection performance. In fact, if we could filter out pictures inconsistent with the source post in advance, then both the running time and the performance should improve dramatically.
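As an illustration, such a pre-filtering step could be as simple as thresholding the cosine similarity between text and image embeddings projected into a shared space; the embedding vectors and threshold below are placeholders, not a validated design.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def filter_consistent_pairs(pairs, threshold=0.5):
    """Keep only (text_vec, image_vec) pairs whose embeddings agree.

    In a real system text_vec/image_vec would come from, e.g., a text
    encoder and a VGG-style image encoder mapped into a shared space;
    here they are plain lists of floats.
    """
    return [p for p in pairs if cosine(p[0], p[1]) >= threshold]
```

Pairs falling below the threshold would be excluded from the concatenation step, so inconsistent pictures never contaminate the fused representation.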
ACKNOWLEDGMENTS
The authors would like to thank anonymous reviewers for their insightful comments on this article.
REFERENCES
[1] Charu C. Aggarwal. 2013. Managing and Mining Sensor Data. Springer.
[2] Samah M. Alzanin and Aqil M. Azmi. 2018. Detecting rumors in social media: A survey. Procedia Comput. Sci. 142
(2018), 294–300. DOI:10.1016/j.procs.2018.10.495
[3] Ipek Baris, Lukas Schmelzeisen, and Steffen Staab. 2019. CLEARumor at semEval-2019 task 7: Convolving ELMo
against rumors. In Proceedings of the 13th International Workshop on Semantic Evaluation. 1105–1109. DOI:10.18653/
v1/S19-2193
[4] Alex Beutel, Kenton Murray, Christos Faloutsos, and Alexander J. Smola. 2014. CoBaFi: Collaborative Bayesian filter-
ing. In Proceedings of the International Conference Companion on World Wide Web (WWW’14). 97–107. DOI:10.1145/
2566486.2568040
[5] Meghana Moorthy Bhat and Srinivasan Parthasarathy. 2020. How effectively can machines defend against machine-
generated fake news? An empirical study. In Proceedings of the 1st Workshop on Insights from Negative Results in NLP.
48–53. DOI:10.18653/v1/2020.insights-1.7
[6] Tian Bian, Xi Xiao, Tingyang Xu, Peilin Zhao, Wenbing Huang, Yu Rong, and Junzhou Huang. 2020. Rumor detection
on social media with bi-directional graph convolutional networks. In Proceedings of the 34th AAAI Conference on
Artificial Intelligence (AAAI’20). 549–556. DOI:10.1609/aaai.v34i01.5393
[7] Juan Cao, Junbo Guo, Xirong Li, Zhiwei Jin, Han Guo, and Jintao Li. 2018. Automatic rumor detection on microblogs:
A survey. arXiv: 1807.03505v1 (2018).
[8] Carlos Castillo, Marcelo Mendoza, and Barbara Poblete. 2011. Information credibility on Twitter. In Proceedings of
the International Conference Companion on World Wide Web (WWW’11). 675–684. DOI:10.1145/1963405.1963500
[9] Lei Chen, Zhongyu Wei, Jing Li, Baohua Zhou, Qi Zhang, and Xuanjing Huang. 2020. Modeling evolution of message
interaction for rumor resolution. In Proceedings of the 28th International Conference on Computational Linguistics
(COLING’20). 6377–6387. DOI:10.18653/v1/2020.coling-main.561
[10] Mingxi Cheng, Shahin Nazarian, and Paul Bogdan. 2020. VRoC: Variational autoencoder-aided multi-task rumor
classifier based on Text. In Proceedings of the International Conference of World Wide Web (WWW’20). 2892–2898.
DOI:10.1145/3366423.3380054
[11] Yohan Chon, Nicholas D. Lane, Fan Li, Hojung Cha, and Feng Zhao. 2012. Automatically characterizing places with
opportunistic crowdsensing using smartphones. In Proceedings of the ACM Conference on Ubiquitous Computing (Ubi-
Comp’12). 481–490. DOI:10.1145/2370216.2370288
[12] Delphine Christin, Andreas Reinhardt, Salil S. Kanhere, and Matthias Hollick. 2011. A survey on privacy in mobile participatory sensing applications. J. Syst. Softw. 84, 11 (2011), 1928–1946. DOI:10.1016/j.jss.2011.06.073
[13] Diane J. Cook and Lawrence B. Holder. 2011. Sensor selection to support practical use of health-monitoring smart
environments. Data Mining Knowl. Discov. 1, 4 (2011), 339–351. DOI:10.1002/widm.20
[14] Ronald Cramer, Ivan Damgård, and Jesper B. Nielsen. 2001. Multiparty computation from threshold homomorphic encryption. Lect. Notes Comput. Sci. 7, 14 (2001), 280–299. DOI:10.7146/brics.v7i14.20141
[15] A. P. Dempster, N. M. Laird, and D. B. Rubin. 1977. Maximum likelihood from incomplete data via the EM algorithm.
J. Roy. Statist. Soc. Series B (Methodol.) (1977), 1–38. DOI:10.1111/j.2517-6161.1977.tb01600.x
[16] Nicholas DiFonzo and Prashant Bordia. 2007. Rumor, gossip and urban legends. Diogenes 54, 1 (2007), 19–35. DOI:10.
1177/0392192107073433
[17] Diego Esteves, Aniketh Janardhan Reddy, Piyush Chawla, and Lehmann Jens. 2018. Belittling the source: Trustwor-
thiness indicators to obfuscate fake news on the web. In Proceedings of the 1st Workshop on Fact Extraction and
Verification. DOI:10.18653/v1/W18-5508
[18] Martin Fajcik, Lukáš Burget, and Pavel Smrž. 2019. BUT-FIT at SemEval-2019 task 7: Determining the rumour stance with pre-trained deep bidirectional transformers. In Proceedings of the 13th International Workshop on Semantic Evaluation. DOI:10.18653/v1/S19-2192
[19] Yang Fan, Xiaohui Yu, Liu Yang, and Yang Min. 2012. Automatic detection of rumor on Sina Weibo. In Proceedings of
the ACM SIGKDD Workshop on Mining Data Semantics. DOI:10.1145/2350190.2350203
[20] Miriam Fernandez and Harith Alani. 2018. Online misinformation: Challenges and future directions. In Proceedings
of the International Conference on World Wide Web Companion: the Web Conference Companion. 595–602. DOI:10.
1145/3184558.3188730
[21] William Ferreira and Andreas Vlachos. 2016. Emergent: A novel data-set for stance classification. In Proceedings
of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language
Technologies (NAACL-HLT’16). 1163–1168. DOI:10.18653/v1/N16-1138
[22] Alvaro Figueira, Nuno Guimaraes, and Luis Torgo. 2018. Current state of the art to detect fake news in social media:
Global trendings and next challenges. In Proceedings of the International Conference on Web Information Systems and
Technologies (ICWIST’18). 332–339. DOI:10.5220/0007188503320339
[23] Raghu K. Ganti, Nam Pham, Hossein Ahmadi, Saurabh Nangia, and Tarek F. Abdelzaher. 2010. GreenGPS: A partici-
patory sensing fuel-efficient maps application. In Proceedings of the MobiSys. 151–164. DOI:10.1145/1814433.1814450
[24] Wang Guan, Sihong Xie, Liu Bing, and Philip S. Yu. 2011. Review graph based online store review spammer detection.
In Proceedings of the IEEE International Conference on Data Mining (ICDM’11). 1242–1247. DOI:10.1109/ICDM.2011.
124
[25] Maike Guderlei and Aßenmacher Matthias. 2020. Evaluating unsupervised representation learning for detecting
stances of fake news. In Proceedings of the 28th International Conference on Computational Linguistics (COLING’20).
6339–6349. DOI:10.18653/v1/2020.coling-main.558
[26] David J. Hand and Robert J. Till. 2001. A simple generalisation of the area under the ROC curve for multiple class
classification problems. Mach. Learn. 45, 2 (2001), 171–186. DOI:10.1023/A:1010920819831
[27] Andreas Hanselowski, Avinesh P. V. S., Benjamin Schiller, Felix Caspelherr, Debanjan Chaudhuri, Christian M. Meyer, and Iryna Gurevych. 2018. A retrospective analysis of the fake news challenge stance detection task. In Proceedings of the International Conference on Computational Linguistics (COLING’18). 1859–1874.
[28] Peter Hernon. 1995. Disinformation and misinformation through the internet: Findings of an exploratory study.
Government Inform. Quart. 12, 2 (1995), 133–139. DOI:10.1016/0740-624X(95)90052-7
[29] Bryan Hooi, Neil Shah, Alex Beutel, Stephan Günnemann, Leman Akoglu, Mohit Kumar, Disha Makhija, and Christos Faloutsos. 2016. BIRDNEST: Bayesian inference for ratings-fraud detection. In Proceedings of the SIAM International Conference on Data Mining (SDM’16). DOI:10.1137/1.9781611974348.56
[30] Chao Huang, Dong Wang, and Nitesh Chawla. 2016. Towards time-sensitive truth discovery in social sensing appli-
cations. In Proceedings of the IEEE International Conference on Mobile Ad Hoc & Sensor Systems. DOI:10.1109/MASS.
2015.39
[31] Zhiwei Jin, Juan Cao, Yongdong Zhang, and Jiebo Luo. 2016. News verification by exploiting conflicting social view-
points in microblogs. In Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI’16). 2972–2978.
DOI:10.5555/3016100.3016318
[32] Zhiwei Jin, Juan Cao, Yongdong Zhang, Jianshe Zhou, and Qi Tian. 2017. Novel visual and statistical image features
for microblogs news verification. IEEE Trans. Multimedia 19, 3 (2017), 1–38. DOI:10.1109/TMM.2016.2617078
[33] Nitin Jindal and Bing Liu. 2007. Analyzing and detecting review spam. In Proceedings of the IEEE International Con-
ference on Data Mining (ICDM’07). 547–552. DOI:10.1109/ICDM.2007.68
[34] Ma Jing, Gao Wei, Zhongyu Wei, Yueming Lu, and Kam Fai Wong. 2015. Detect rumors using time series of social context information on microblogging websites. In Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM’15). 1751–1754. DOI:10.1145/2806416.2806607
[35] Shu Kai, Suhang Wang, Amy Sliva, Jiliang Tang, and Huan Liu. 2017. Fake news detection on social media: A data
mining perspective. ACM SIGKDD Explor. Newslett. 19, 1 (2017), 22–36. DOI:10.1145/3137597.3137600
[36] Hamid Karimi, Proteek Chandan Roy, Sari Saba Sadiya, and Jiliang Tang. 2018. Multi-source multi-class fake news
detection. In Proceedings of the International Conference on Computational Linguistics (COLING’18). 1546–1557.
[37] Hamid Karimi and Jiliang Tang. 2019. Learning hierarchical discourse-level structure for fake news detection. In
Proceedings of the International Conference on Annual Conference of the North American Chapter of the Association for
Computational Linguistics. DOI:10.18653/v1/N19-1347
[38] Wu Ke, Yang Song, and Kenny Q. Zhu. 2015. False rumors detection on Sina Weibo by propagation structures. In
Proceedings of the IEEE International Conference on Data Engineering. 651–662. DOI:10.1109/ICDE.2015.7113322
[39] Junaed Younus Khan, Md. Tawkat Islam Khondaker, Anindya Iqbal, and Sadia Afroz. 2019. A benchmark study on
machine learning methods for fake news detection. arXiv: 1905.04749v1 (2019).
[40] Dhruv Khattar, Jaipal Singh Goud, Manish Gupta, and Vasudeva Varma. 2019. MVAE: Multimodal variational au-
toencoder for fake news detection. In Proceedings of the International Conference of World Wide Web (WWW’19).
2915–2921. DOI:10.1145/3308558.3313552
[41] Ling Min Serena Khoo, Hai Leong Chieu, Zhong Qian, and Jing Jiang. 2020. Interpretable rumor detection in mi-
croblogs by attending to user interactions. In Proceedings of the 34th AAAI Conference on Artificial Intelligence
(AAAI’20). 8783–8790. DOI:10.1609/aaai.v34i05.6405
[42] Jooyeon Kim, Behzad Tabibian, Alice Oh, Bernhard Schoelkopf, and Manuel Gomez-Rodriguez. 2017. Leveraging the
crowd to detect and reduce the spread of fake news and misinformation. In Proceedings of the ACM International
Conference on Web Search and Data Mining (WSDM’17). DOI:10.1145/3159652.3159734
[43] Elena Kochkina and Maria Liakata. 2020. Estimating predictive uncertainty for rumour verification models. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL’20). 6964–6981. DOI:10.18653/v1/2020.acl-main.623
[44] Elena Kochkina, Maria Liakata, and Isabelle Augenstein. 2017. Turing at SemEval-2017 task 8: Sequential approach to rumour stance classification with branch-LSTM. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval’17). 475–480. DOI:10.18653/v1/S17-2083
[45] Nir Kshetri and Jeffrey Voas. 2017. The economics of “fake news.” IT Professional 6 (2017), 8–12. DOI:10.1109/MITP.
2017.4241459
[46] Sumeet Kumar and Kathleen Carley. 2019. Tree LSTMs with convolution units to predict stance and rumor veracity in
social media conversations. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
(ACL’19). 1173–1179. DOI:10.18653/v1/P19-1498
[47] Srijan Kumar, Bryan Hooi, Disha Makhija, Mohit Kumar, Christos Faloutsos, and V. S. Subrahamanian. 2018. Rev2:
Fraudulent user prediction in rating platforms. In Proceedings of the ACM International Conference on Web Search and
Data Mining (WSDM’18). 333–341. DOI:10.1145/3159652.3159729
[48] Srijan Kumar and Neil Shah. 2018. False information on web and social media: A survey. arXiv: 1804.08559v1 (2018).
[49] Srijan Kumar, Robert West, and Jure Leskovec. 2016. Disinformation on the web: Impact, characteristics, and detec-
tion of Wikipedia hoaxes. In Proceedings of the International Conference Companion on World Wide Web (WWW’16).
591–602. DOI:10.1145/2872427.2883085
[50] S. Kwon, M. Cha, and K. Jung. 2017. Rumor detection over varying time windows. PLoS One 12, 1 (2017), 1–19.
DOI:10.1371/journal.pone.0168344
[51] Sejeong Kwon, Meeyoung Cha, Kyomin Jung, Chen Wei, and Yajun Wang. 2013. Prominent features of rumor prop-
agation in online social media. In Proceedings of the IEEE International Conference on Data Mining (ICDM’13). 1103–
1108. DOI:10.1109/ICDM.2013.61
[52] Thai Le, Suhang Wang, and Dongwon Lee. 2020. MALCOM: Generating malicious comments to attack neural fake
news detection models. In Proceedings of the IEEE International Conference on Data Mining (ICDM’20). 282–291.
DOI:10.1109/ICDM50108.2020.00037
[53] Newton Lee. 2013. Misinformation and Disinformation. Springer. DOI:10.1007/978-1-4614-5308-6
[54] Fangtao Li, Minlie Huang, Yang Yi, and Xiaoyan Zhu. 2011. Learning to identify review spam. In Proceedings of the
International Joint Conference on Artificial Intelligence (IJCAI’11). 2488–2493. DOI:10.5555/2283696.2283811
[55] Huayi Li, Zhiyuan Chen, Arjun Mukherjee, Bing Liu, and Jidong Shao. 2015. Analyzing and detecting opinion spam
on a large-scale dataset via temporal and spatial patterns. In Proceedings of the International AAAI Conference on Web
and Social Media. 634–637.
[56] Jiwei Li, Myle Ott, Claire Cardie, and Eduard Hovy. 2014. Towards a general rule for identifying deceptive opinion
spam. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’14). 1566–1576.
DOI:10.3115/v1/P14-1147
[57] Jiawen Li, Yudianto Sujana, and Hung-Yu Kao. 2020. Exploiting microblog conversation structures to detect rumors.
In Proceedings of the 28th International Conference on Computational Linguistics (COLING’20). 5420–5429. DOI:10.
18653/v1/2020.coling-main.473
[58] Qi Li, Yaliang Li, Jing Gao, Bo Zhao, Wei Fan, and Jiawei Han. 2014. Resolving conflicts in heterogeneous data by
truth discovery and source reliability estimation. In Proceedings of the 28th International Conference on Computational
Linguistics (SIGMOD’14). 1187–1198. DOI:10.1145/2588555.2610509
[59] Quanzhi Li, Qiong Zhang, and Luo Si. 2019. Rumor detection by exploiting user credibility information, attention
and multi-task learning. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
(ACL’19). 5047–5058. DOI:10.18653/v1/P19-1113
[60] Xian Li, Xin Luna Dong, Kenneth Lyons, Weiyi Meng, and Divesh Srivastava. 2013. Truth finding on the deep web:
Is the problem solved? In Proceedings of the International Conference on Very Large Data Bases (VLDB’13). 97–102.
DOI:10.14778/2535568.2448943
[61] Yaliang Li, Gao Jing, Chuishi Meng, Li Qi, and Jiawei Han. 2015. A survey on truth discovery. ACM IGKDD Explor.
Newslett. 17, 2 (2015), 1–16. DOI:10.1145/2897350.2897352
[62] Yaliang Li, Chenglin Miao, Lu Su, Jing Gao, Qi Li, Bolin Ding, Zhan Qin, and Kui Ren. 2018. An efficient two-layer
mechanism for privacy-preserving truth discovery. In Proceedings of the 24th ACM SIGKDD International Conference
on Knowledge Discovery & Data Mining (KDD’18). 1705–1714. DOI:10.1145/3219819.3219998
[63] Yuming Lin, Zhu Tao, Xiaoling Wang, Jingwei Zhang, and Aoying Zhou. 2014. Towards online review spam detection.
In Proceedings of the International Conference Companion on World Wide Web (WWW’14). 341–342. DOI:10.1145/
2567948.2577293
[64] Xiaomo Liu, Armineh Nourbakhsh, Quanzhi Li, Rui Fang, and Sameena Shah. 2015. Real-time rumor debunking on Twitter. In Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM’15). 1867–1870. DOI:10.1145/2806416.2806651
[65] Yuxian Liu, Shaohua Tang, Hao-Tian Wu, and Xinglin Zhang. 2019. RTPT: A framework for real-time privacy-
preserving truth discovery on crowdsensed data streams. Comput. Netw. 148 (2019), 349–360. DOI:10.1016/j.comnet.
2018.11.018
[66] Yang Liu and Yi-Fang Brook Wu. 2018. Early detection of fake news on social media through propagation path classification with recurrent and convolutional networks. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI’18). 354–361.
[67] Yi-Ju Lu and Cheng-Te Li. 2020. GCAN: Graph-aware co-attention networks for explainable fake news detection
on social media. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL’20).
505–514. DOI:10.18653/v1/2020.acl-main.48
[68] Jing Ma and Wei Gao. 2020. Debunking rumors on Twitter with tree transformer. In Proceedings of the 28th Interna-
tional Conference on Computational Linguistics (COLING’20). 5455–5466. DOI:10.18653/v1/2020.coling-main.476
[69] Jing Ma, Wei Gao, and Wong Kam-Fai. 2018. Detect rumor and stance jointly by neural multi-task learning. In Pro-
ceedings of the International Conference Companion on World Wide Web (WWW’18). 585–593. DOI:10.1145/3184558.
3188729
[70] Jing Ma, Wei Gao, and Wong Kam-Fai. 2019. Detect rumors on Twitter by promoting information campaigns with gen-
erative adversarial learning. In Proceedings of the International Conference Companion on World Wide Web (WWW’19).
3049–3055. DOI:10.1145/3308558.3313741
[71] Jing Ma, Wei Gao, Prasenjit Mitra, Sejeong Kwon, Bernard J. Jansen, Kam-Fai Wong, and Meeyoung Cha. 2016. De-
tecting rumors from microblogs with recurrent neural networks. In Proceedings of the International Joint Conference
on Artificial Intelligence (IJCAI’16). 3818–3824. DOI:10.5555/3061053.3061153
[72] Jing Ma, Wei Gao, and Kam-Fai Wong. 2017. Detect rumors in microblog posts using propagation structure via kernel
learning. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’17). 708–717.
DOI:10.18653/v1/P17-1066
[73] Jing Ma, Wei Gao, and Kam-Fai Wong. 2018. Rumor detection on Twitter with tree-structured recursive neural net-
works. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’18). 1980–1989.
DOI:10.18653/v1/P17-1066
[74] Nicolas Maisonneuve, Matthias Stevens, Maria E. Niessen, Peter Hanappe, and Luc Steels. 2009. Citizen noise pollu-
tion monitoring. In Proceedings of the 10th Annual International Conference on Digital Government Research, Partner-
ships for Public Innovation. 96–103. DOI:10.1145/1556176.1556198
[75] Jermaine Marshall, Arturo Argueta, and Dong Wang. 2017. A neural network approach for truth discovery in social
sensing. In Proceedings of the IEEE International Conference on Mobile Ad Hoc & Sensor Systems. 343–347. DOI:10.
1109/MASS.2017.26
[76] Jermaine Marshall, Munira Syed, and Dong Wang. 2016. Hardness-aware truth discovery in social sensing ap-
plications. In Proceedings of the International Conference on Distributed Computing in Sensor Systems. 143–152.
DOI:10.1109/DCOSS.2016.9
[77] Chenglin Miao, Wenjun Jiang, Lu Su, Yaliang Li, Suxin Guo, Zhan Qin, Houping Xiao, Jing Gao, and Kui Ren. 2015.
Cloud-enabled privacy-preserving truth discovery in crowd sensing systems. In Proceedings of the ACM Conference
on Embedded Networked Sensor Systems. 183–196. DOI:10.1145/2809695.2809719
[78] Chenglin Miao, Lu Su, Wenjun Jiang, Yaliang Li, and Miaomiao Tian. 2017. A lightweight privacy-preserving truth discovery framework for mobile crowd
sensing systems. In Proceedings of the International Conference on Computer Communications (INFOCOM’17). 1–9.
DOI:10.1109/INFOCOM.2017.8057114
[79] Tanushree Mitra. 2015. CREDBANK: A large-scale social media corpus with associated credibility annotations. In
Proceedings of the 9th International AAAI Conference on Web and Social Media (ICWSM’15). 258–267.
[80] Tanushree Mitra, Graham P. Wright, and Eric Gilbert. 2017. A parsimonious language model of social media credibil-
ity across disparate events. In Proceedings of the ACM Conference on Computer Supported Cooperative Work & Social
Computing (CSCW’17). 126–145. DOI:10.1145/2998181.2998351
[81] Federico Monti, Fabrizio Frasca, Davide Eynard, Damon Mannion, and Michael M. Bronstein. 2019. Fake news detec-
tion on social media using geometric deep learning. arXiv: 1902.06673v1 (2019).
[82] Arjun Mukherjee, Liu Bing, and Natalie Glance. 2012. Spotting fake reviewer groups in consumer reviews. In Pro-
ceedings of the International Conference Companion on World Wide Web (WWW’12). 191–200. DOI:10.1145/2187836.
2187863
[83] Subhabrata Mukherjee and Gerhard Weikum. 2015. Leveraging joint interactions for credibility analysis in news com-
munities. In Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM’15).
353–362. DOI:10.1145/2806416.2806537
[84] Duc Minh Nguyen, Tien Huu Do, Robert Calderbank, and Nikos Deligiannis. 2019. Fake news detection using deep
Markov random fields. In Proceedings of the Conference of the North American Chapter of the Association for Compu-
tational Linguistics: Human Language Technologies (NAACL-HLT’19). 1391–1400. DOI:10.18653/v1/N19-1141
[85] Myle Ott, Claire Cardie, and Jeffrey T. Hancock. 2013. Negative deceptive opinion spam. In Proceedings of the An-
nual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language
Technologies (NAACL-HLT’13). 497–501.
[86] Myle Ott, Yejin Choi, Claire Cardie, and Jeffrey T. Hancock. 2011. Finding deceptive opinion spam by any stretch
of the imagination. In Proceedings of the Annual Meeting of the Association for Computational Linguistics: Human
Language Technologies (ACL-HLT’11). 309–319. DOI:10.5555/2002472.2002512
[87] Demetris Paschalides, Chrysovalantis Christodoulou, Rafael Andreou, George Pallis, Marios D. Dikaiakos, Alexan-
dros Kornilakis, and Evangelos Markatos. 2019. Check-It: A plugin for detecting and reducing the spread of fake
news and misinformation on the web. In IEEE/WIC/ACM International Conference on Web Intelligence. 298–302.
DOI:10.1145/3350546.3352534
[88] Kashyap Popat, Subhabrata Mukherjee, Andrew Yates, and Gerhard Weikum. 2018. DeClarE: Debunking fake news
and false claims using evidence-aware deep learning. In Proceedings of the Conference on Empirical Methods in Natural
Language Processing (EMNLP’18). 22–32. DOI:10.18653/v1/D18-1003
[89] Martin Potthast, Johannes Kiesel, Kevin Reinartz, Janek Bevendorff, and Benno Stein. 2018. A stylometric inquiry into
hyperpartisan and fake news. In Proceedings of the Annual Meeting of the Association for Computational Linguistics
(ACL’18). 231–240. DOI:10.18653/v1/P18-1022
[90] Piotr Przybyla. 2020. Capturing the style of fake news. In Proceedings of the 34th AAAI Conference on Artificial Intel-
ligence (AAAI’20). 490–497. DOI:10.1609/aaai.v34i01.5386
[91] Vahed Qazvinian, Emily Rosengren, Dragomir R. Radev, and Qiaozhu Mei. 2011. Rumor has it: Identifying mis-
information in microblogs. In Proceedings of the Conference on Empirical Methods in Natural Language Processing
(EMNLP’11). 1589–1599. DOI:10.5555/2145432.2145602
[92] Peng Qi, Juan Cao, Tianyun Yang, Junbo Guo, and Jintao Li. 2019. Exploiting multi-domain visual information
for fake news detection. In Proceedings of the IEEE International Conference on Data Mining (ICDM’19). 518–527.
DOI:10.1109/ICDM.2019.00062
[93] Feng Qian, Chengyue Gong, Karishma Sharma, and Yan Liu. 2018. Neural user response generator: Fake news
detection with collective user intelligence. In Proceedings of the 27th International Joint Conference on Artificial
Intelligence and the 23rd European Conference on Artificial Intelligence (IJCAI-ECAI’18). 3834–3840. DOI:10.24963/
ijcai.2018/533
[94] Hannah Rashkin, Eunsol Choi, and Jin Yea Jang. 2017. Truth of varying shades: Analyzing language in fake news
and political fact-checking. In Proceedings of the Conference on Empirical Methods in Natural Language Processing
(EMNLP’17). 2931–2937. DOI:10.18653/v1/D17-1317
6:30 F. Xu et al.
[95] Bhavtosh Rath, Wei Gao, Jing Ma, and Jaideep Srivastava. 2017. From retweet to believability: Utilizing trust to
identify rumor spreaders on Twitter. In Proceedings of the IEEE/ACM International Conference on Advances in Social
Networks Analysis & Mining. 179–186. DOI:10.1145/3110025.3110121
[96] Shebuti Rayana and Leman Akoglu. 2015. Collective opinion spam detection: Bridging review networks and meta-
data. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD’15).
985–994. DOI:10.1145/2783258.2783370
[97] Yuxiang Ren, Bo Wang, Jiawei Zhang, and Yi Chang. 2020. Adversarial active learning based heterogeneous graph
neural network for fake news detection. In Proceedings of the IEEE International Conference on Data Mining (ICDM’20).
452–461. DOI:10.1109/ICDM50108.2020.00054
[98] Verónica Pérez-Rosas, Bennett Kleinberg, Alexandra Lefevre, and Rada Mihalcea. 2018. Automatic detection of fake news. In Proceedings of the International Conference on Computational Linguistics (COLING’18). 3391–3401.
[99] Nir Rosenfeld, Aron Szanto, and David C. Parkes. 2020. A kernel of truth: Determining rumor veracity on Twitter
by diffusion pattern alone. In Proceedings of the International Conference of World Wide Web (WWW’20). 1018–1028.
DOI:10.1145/3366423.3380180
[100] Victoria L. Rubin. 2010. On deception and deception detection: Content analysis of computer-mediated stated be-
liefs. In Proceedings of the American Society for Information Science & Technology Conference (ASIST’10) 47, 1, 1–10.
DOI:10.1002/meet.14504701124
[101] Natali Ruchansky, Sungyong Seo, and Yan Liu. 2017. CSI: A hybrid deep model for fake news. In Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM’17). 797–806. DOI:10.1145/3132847.
3132877
[102] Vlad Sandulescu and Martin Ester. 2016. Detecting singleton review spammers using semantic similarity. In Pro-
ceedings of the International Conference Companion on World Wide Web (WWW’16). 971–976. DOI:10.1145/2740908.
2742570
[103] Sohan De Sarkar, Fan Yang, and Arjun Mukherjee. 2018. Attending sentences to detect satirical fake news. In Pro-
ceedings of the International Conference on Computational Linguistics (COLING’18). 3371–3380.
[104] Chengcheng Shao, Giovanni Luca Ciampaglia, Alessandro Flammini, and Filippo Menczer. 2016. Hoaxy: A platform
for tracking online misinformation. In Proceedings of the International Conference Companion on World Wide Web
(WWW’16). 745–750. DOI:10.1145/2872518.2890098
[105] Chengcheng Shao, Giovanni Luca Ciampaglia, Onur Varol, Alessandro Flammini, and Filippo Menczer. 2018. The
spread of fake news by social bots. Nat. Commun. 9 (2018), 1–9. DOI:10.1038/s41467-018-06930-7
[106] John Shawe-Taylor and Nello Cristianini. 2004. Kernel Methods for Pattern Analysis. Cambridge University Press.
DOI:10.1017/CBO9780511809682
[107] Victor S. Sheng, Foster J. Provost, and Panagiotis G. Ipeirotis. 2008. Get another label? Improving data quality and
data mining using multiple, noisy labelers. In Proceedings of the ACM SIGKDD International Conference on Knowledge
Discovery & Data Mining (KDD’08). 614–622. DOI:10.1145/1401890.1401965
[108] Aashish Sheshadri and Matthew Lease. 2013. SQUARE: A benchmark for research on computing crowd
consensus. In Proceedings of the 1st AAAI Conference on Human Computation (HCOMP’13). 156–164.
[109] Kai Shu, Limeng Cui, Suhang Wang, Dongwon Lee, and Huan Liu. 2019. dEFEND: Explainable fake news detection.
In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD’19).
395–405. DOI:10.1145/3292500.3330935
[110] Kai Shu, Deepak Mahudeswaran, Suhang Wang, Dongwon Lee, and Huan Liu. 2020. FakeNewsNet: A data repository
with news content, social context and dynamic information for studying fake news on social media. Big Data 8, 3
(2020), 171–188. DOI:10.1089/big.2020.0062
[111] Kai Shu, Suhang Wang, and Huan Liu. 2018. Exploiting tri-relationship for fake news detection. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI’18). 1–10.
[112] Kai Shu, Suhang Wang, and Huan Liu. 2019. Beyond news contents: The role of social context for fake news detection.
In Proceedings of the 12th ACM International Conference on Web Search and Data Mining (WSDM’19).
312–320. DOI:10.1145/3289600.3290994
[113] Shivangi Singhal, Anubha Kabra, Mohit Sharma, Rajiv Ratn Shah, Tanmoy Chakraborty, and Ponnurangam
Kumaraguru. 2020. SpotFake+: A multimodal framework for fake news detection via transfer learning. In Proceedings
of the 34th AAAI Conference on Artificial Intelligence (AAAI’20). 13915–13916. DOI:10.1609/aaai.v34i10.7230
[114] Eugenio Tacchini, Gabriele Ballarin, Marco L. Della Vedova, Stefano Moret, and Luca de Alfaro. 2017. Some like it
hoax: Automated fake news detection in social networks. In Proceedings of the 2nd Workshop on Data Science for
Social Good. 1–15.
[115] Reuben Tan, Bryan Plummer, and Kate Saenko. 2020. Detecting cross-modal inconsistency to defend against neu-
ral fake news. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’20).
2081–2106. DOI:10.18653/v1/2020.emnlp-main.163
[116] Xiaoting Tang, Cong Wang, Xingliang Yuan, and Qian Wang. 2018. Non-interactive privacy-preserving truth dis-
covery in crowd sensing applications. In Proceedings of the International Conference on Computer Communications
(INFOCOM’18). 1988–1996. DOI:10.1109/INFOCOM.2018.8486371
[117] Marco Del Tredici and Raquel Fernández. 2020. Words are the window to the soul: Language-based user represen-
tations for fake news detection. In Proceedings of the 28th International Conference on Computational Linguistics
(COLING’20). 5467–5479. DOI:10.18653/v1/2020.coling-main.477
[118] Rudra M. Tripathy, Amitabha Bagchi, and Sameep Mehta. 2010. A study of rumor control strategies on social
networks. In Proceedings of the ACM International Conference on Information & Knowledge Management. 1817–1820.
DOI:10.1145/1871437.1871737
[119] Sebastian Tschiatschek, Adish Singla, and Manuel Gomez Rodriguez. 2018. Fake news detection in social networks
via crowd signals. In Proceedings of the International Conference Companion on World Wide Web (WWW’18). 517–524.
DOI:10.1145/3184558.3188722
[120] Victoria L. Rubin, Niall J. Conroy, Yimin Chen, and Sarah Cornwell. 2016. Fake news or truth? Using satirical cues
to detect potentially misleading news. In Proceedings of the Annual Conference of the North American Chapter of the
Association for Computational Linguistics: Human Language Technologies (NAACL-HLT’16). 7–17. DOI:10.18653/v1/
W16-0802
[121] Nguyen Vo and Kyumin Lee. 2020. Where are the facts? Searching for fact-checked information to alleviate the spread
of fake news. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’20).
7717–7731. DOI:10.18653/v1/2020.emnlp-main.621
[122] Svitlana Volkova and Jin Yea Jang. 2018. Misleading or falsification? Inferring deceptive strategies and types in online
news and social media. In Proceedings of the International Conference Companion on World Wide Web (WWW’18).
575–583. DOI:10.1145/3184558.3188728
[123] Svitlana Volkova, Kyle Shaffer, Jin Yea Jang, and Nathan Hodas. 2017. Separating facts from fiction: Linguistic models
to classify suspicious and trusted news posts on Twitter. In Proceedings of the Annual Meeting of the Association for
Computational Linguistics (ACL’17). 647–653. DOI:10.18653/v1/P17-2102
[124] Soroush Vosoughi, Deb Roy, and Sinan Aral. 2018. The spread of true and false news online. Science 359, 6380 (2018),
1146–1151. DOI:10.1126/science.aap9559
[125] Dalia Attia Waguih and Laure Berti-Equille. 2014. Truth discovery algorithms: An experimental evaluation.
arXiv: 1409.6428 (2014).
[126] Biao Wang, Chen Ge, Luoyi Fu, Song Li, and Xinbing Wang. 2017. DRIMUX: Dynamic rumor influence minimiza-
tion with user experience in social networks. IEEE Transactions on Knowledge and Data Engineering 29, 10 (2017),
2168–2181. DOI:10.1109/TKDE.2017.2728064
[127] Dong Wang, Tarek Abdelzaher, and Lance Kaplan. 2015a. Social Sensing: Building Reliable Systems on Unreliable Data.
Morgan Kaufmann.
[128] Dong Wang, Tarek Abdelzaher, Lance Kaplan, Raghu Ganti, Shaohan Hu, and Hengchang Liu. 2013. Exploitation
of physical constraints for reliable social sensing. In Proceedings of the IEEE Real-time Systems Symposium. 212–223.
DOI:10.1109/RTSS.2013.29
[129] Dong Wang, Md Tanvir Al Amin, Shen Li, Lance Kaplan, Siyu Gu, Chenji Pan, Hengchang Liu, Charu Aggarwal,
Raghu K. Ganti, and Xinlei Wang. 2014. Using humans as sensors: An estimation-theoretic perspective. In Proceedings
of the International Symposium on Information Processing in Sensor Networks (IPSN’14). 35–46. DOI:10.5555/2602339.
2602344
[130] Dong Wang and Chao Huang. 2015. Confidence-aware truth estimation in social sensing applications. In Proceedings
of the 12th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON’15). 336–344.
DOI:10.1109/SAHCN.2015.7338333
[131] Dong Wang, Lance Kaplan, and Tarek F. Abdelzaher. 2014. Maximum likelihood analysis of conflicting observations
in social sensing. ACM Trans. Sensor Netw. 10, 2 (2014), 1–27. DOI:10.1145/2530289
[132] Dong Wang, Lance Kaplan, Hieu Le, and Tarek Abdelzaher. 2012. On truth discovery in social sensing: A maximum
likelihood estimation approach. In Proceedings of the ACM/IEEE International Conference on Information Processing
in Sensor Networks (IPSN’12). 233–244. DOI:10.1145/2185677.2185737
[133] Dong Wang, Jermaine Marshall, and Chao Huang. 2016. Theme-relevant truth discovery on Twitter: An estima-
tion theoretic approach. In Proceedings of the International AAAI Conference on Web & Social Media (ICWSM’16).
408–416.
[134] William Yang Wang. 2017. “Liar, liar pants on fire”: A new benchmark dataset for fake news detection. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL’17). 422–426. DOI:10.18653/v1/P17-2067
[135] Xuezhi Wang, Yu Cong, Simon Baumgartner, and Flip Korn. 2018. Relevant document discovery for fact-checking
articles. In Proceedings of the International Conference Companion on World Wide Web (WWW’18). 525–533.
DOI:10.1145/3184558.3188723
[136] Yaqing Wang, Fenglong Ma, Zhiwei Jin, Ye Yuan, Guangxu Xun, Kishlay Jha, Lu Su, and Jing Gao. 2018. EANN:
Event adversarial neural networks for multi-modal fake news detection. In Proceedings of the ACM SIGKDD Interna-
tional Conference on Knowledge Discovery & Data Mining (KDD’18). 849–857. DOI:10.1145/3219819.3219903
[137] Yaqing Wang, Weifeng Yang, Fenglong Ma, Jin Xu, Bin Zhong, Qiang Deng, and Jing Gao. 2020. Weak supervision for
fake news detection via reinforcement learning. In Proceedings of the 34th AAAI Conference on Artificial Intelligence
(AAAI’20). 516–523.
[138] Feng Wei, Yan Zheng, Hengrun Zhang, Kai Zeng, and Y. Thomas Hou. 2018. A survey on security, privacy and trust
in mobile crowdsourcing. IEEE Internet Things J. 5, 4 (2018), 2971–2992. DOI:10.1109/JIOT.2017.2765699
[139] Penghui Wei, Nan Xu, and Wenji Mao. 2019. Modeling conversation structure and temporal dynamics for jointly
predicting rumor stance and veracity. In Proceedings of the Conference on Empirical Methods in Natural Language
Processing and the International Joint Conference on Natural Language Processing (EMNLP-IJCNLP’19). 4787–4798.
DOI:10.18653/v1/D19-1485
[140] Weiming Wen, Songwen Su, and Yu Zhou. 2018. Cross-lingual cross-platform rumor verification pivoting on mul-
timedia content. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’18).
3487–3496. DOI:10.18653/v1/D18-1385
[141] Liang Wu and Huan Liu. 2018. Tracing fake-news footprints: Characterizing social media messages by how they
propagate. In Proceedings of the 11th ACM International Conference on Web Search and Data Mining (WSDM’18). 1–9. DOI:10.1145/3159652.3159677
[142] Lianwei Wu, Yuan Rao, Haolin Jin, Ambreen Nazir, and Ling Sun. 2019. Different absorption from the same sharing:
Sifted multi-task learning for fake news detection. In Proceedings of the Conference on Empirical Methods in Natural
Language Processing and the International Joint Conference on Natural Language Processing (EMNLP-IJCNLP’19). 4644–
4653. DOI:10.18653/v1/D19-1471
[143] Rui Xia, Kaizhou Xuan, and Jianfei Yu. 2020. A state-independent and time-evolving network for early rumor detec-
tion in social media. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’20).
9042–9051. DOI:10.18653/v1/2020.emnlp-main.727
[144] Sihong Xie, Guan Wang, Shuyang Lin, and Philip S. Yu. 2012. Review spam detection via temporal pattern discovery.
In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD’12). 823–
831. DOI:10.1145/2339530.2339662
[145] Fan Xu, Victor S. Sheng, and Mingwen Wang. 2020. Near real-time topic-driven rumor detection in source microblogs.
Knowl.-Based Syst. 207 (2020), 1–9. DOI:10.1016/j.knosys.2020.106391
[146] Guowen Xu, Hongwei Li, Tan Chen, Dongxiao Liu, Yuanshun Dai, and Kan Yang. 2017. Achieving efficient and privacy-preserving truth discovery in crowd sensing systems. Comput. Secur. 69 (2017), 114–126. DOI:10.1016/j.
cose.2016.11.014
[147] Guowen Xu, Hongwei Li, Dongxiao Liu, Hao Ren, Yuanshun Dai, and Xiaohui Liang. 2016. Towards efficient privacy-
preserving truth discovery in crowd sensing systems. In Proceedings of the IEEE Global Communications Conference.
1–6. DOI:10.1109/GLOCOM.2016.7842343
[148] Guowen Xu, Hongwei Li, Sen Liu, Mi Wen, and Rongxing Lu. 2019. Efficient and privacy-preserving truth discovery
in mobile crowd sensing systems. IEEE Trans. Vehic. Technol. 68, 4 (2019), 3854–3865. DOI:10.1109/TVT.2019.2895834
[149] Kai-Chou Yang, Timothy Niven, and Hung-Yu Kao. 2019. Fake news detection as natural language inference. arXiv:
1907.07347v1 (2019).
[150] Shuo Yang, Kai Shu, Suhang Wang, Renjie Gu, Fan Wu, and Huan Liu. 2019. Unsupervised fake news detection on
social media: A generative approach. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI’19). 5644–5651.
[151] Xiaoyu Yang, Yuefei Lyu, Tian Tian, Yifei Liu, Yudong Liu, and Xi Zhang. 2020. Rumor detection on social media with
graph structured adversarial learning. In Proceedings of the 29th International Joint Conference on Artificial Intelligence
(IJCAI’20). 1417–1423. DOI:10.24963/ijcai.2020/197
[152] Yuanshun Yao, Bimal Viswanath, Jenna Cryan, Haitao Zheng, and Ben Y. Zhao. 2017. Automated crowdturfing attacks
and defenses in online review systems. In Proceedings of the ACM Conference on Computer and Communications
Security. 1143–1158. DOI:10.1145/3133956.3133990
[153] Junting Ye, Santhosh Kumar, and Leman Akoglu. 2016. Temporal opinion spam detection by multivariate indicative
signals. In Proceedings of the International AAAI Conference on Web & Social Media (ICWSM’16). 743–746.
[154] Jie Yin, Andrew Lampert, Mark Cameron, Bella Robinson, and Robert Power. 2012. Using social media to enhance
emergency situation awareness. IEEE Intell. Syst. 27, 6 (2012), 52–59. DOI:10.1109/MIS.2012.6
[155] Seunghyun Yoon, Kunwoo Park, Joongbo Shin, Hongjun Lim, Seungpil Won, Meeyoung Cha, and Kyomin Jung. 2019.
Detecting incongruity between news headline and body text via a deep hierarchical encoder. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI’19). 791–800.
[156] Jianfei Yu, Jing Jiang, Ling Min Serena Khoo, Hai Leong Chieu, and Rui Xia. 2020. Coupled hierarchical transformer
for stance-aware rumor verification in social media conversations. In Proceedings of the Conference on Empirical
Methods in Natural Language Processing (EMNLP’20). 1392–1401. DOI:10.18653/v1/2020.emnlp-main.108
[157] Chunyuan Yuan, Qianwen Ma, Wei Zhou, Jizhong Han, and Songlin Hu. 2019. Jointly embedding the local and global
relations of heterogeneous graph for rumor detection. In Proceedings of the IEEE International Conference on Data
Mining (ICDM’19). 796–805. DOI:10.1109/ICDM.2019.00090
[158] Chunyuan Yuan, Qianwen Ma, Wei Zhou, Jizhong Han, and Songlin Hu. 2020. Early detection of fake news
by utilizing the credibility of news, publishers, and users based on weakly supervised learning. In Proceedings
of the 28th International Conference on Computational Linguistics (COLING’20). 5444–5454. DOI:10.18653/v1/2020.
coling-main.475
[159] Savvas Zannettou, Michael Sirivianos, Jeremy Blackburn, and Nicolas Kourtellis. 2019. The web of false information:
Rumors, fake news, hoaxes, clickbait, and various other shenanigans. J. Data Inf. Qual. 3 (2019), 1–37. DOI:10.1145/
3309699
[160] Rowan Zellers, Ari Holtzman, Hannah Rashkin, Yonatan Bisk, Ali Farhadi, Franziska Roesner, and Yejin Choi. 2019.
Defending against neural fake news. arXiv: 1905.12616v1 (2019).
[161] Li Zeng, Kate Starbird, and Emma S. Spiro. 2016. Rumors at the speed of light? Modeling the rate of rumor trans-
mission during crisis. In Proceedings of the Hawaii International Conference on System Sciences (HICSS’16). 1969–1978.
DOI:10.1109/HICSS.2016.248
[162] Chuan Zhang, Liehuang Zhu, Chang Xu, Kashif Sharif, Xiaojiang Du, and Mohsen Guizani. 2019. LPTD: Achieving
lightweight and privacy-preserving truth discovery in CIoT. Fut. Gen. Comput. Syst. 90 (2019), 175–184. DOI:10.1016/
j.future.2018.07.064
[163] Chuan Zhang, Liehuang Zhu, Chang Xu, Kashif Sharif, and Ximeng Liu. 2019. PPTDS: A privacy-preserving truth
discovery scheme in crowd sensing systems. Inf. Sci. 484 (2019), 183–196. DOI:10.1016/j.ins.2019.01.068
[164] Daniel Zhang, Dong Wang, Nathan Vance, Yang Zhang, and Steven Mike. 2018. On scalable and robust truth discov-
ery in big data social media sensing applications. IEEE Trans. Big Data (2018), 195–208. DOI:10.1109/TBDATA.2018.
2824812
[165] Jing Zhang, Victor S. Sheng, and Xindong Wu. 2015. CEKA: A tool for mining the wisdom of crowds. J. Mach. Learn.
Res. 16 (2015), 2853–2858. DOI:10.5555/2789272.2912090
[166] Jing Zhang, Xindong Wu, and Victor S. Sheng. 2016. Learning from crowdsourced labeled data: A survey. Artif. Intell.
Rev. 46, 4 (2016), 543–576. DOI:10.1007/s10462-016-9491-9
[167] Zhe Zhao, Paul Resnick, and Qiaozhu Mei. 2015. Enquiring minds: Early detection of rumors in social media from
enquiry posts. In Proceedings of the International Conference Companion on World Wide Web (WWW’15). 1395–1405.
DOI:10.1145/2736277.2741637
[168] Yifeng Zheng, Huayi Duan, and Cong Wang. 2018. Learning the truth privately and confidently: Encrypted confidence-aware truth discovery in mobile crowdsensing. IEEE Trans. Inf. Forens. Secur. 13 (2018), 2475–2489. DOI:10.1109/TIFS.2018.2819134
[169] Yifeng Zheng, Huayi Duan, Xingliang Yuan, and Cong Wang. 2017. Privacy-aware and efficient mobile crowdsensing
with truth discovery. IEEE Trans. Depend. Sec. Comput. 17, 1 (2017), 121–133. DOI:10.1109/TDSC.2017.2753245
[170] Xinyi Zhou and Reza Zafarani. 2018. Fake news: A survey of research, detection methods, and opportunities.
arXiv: 1812.00315v1 (2018).
[171] Dimitrina Zlatkova, Preslav Nakov, and Ivan Koychev. 2019. Fact-checking meets fauxtography: Verifying claims
about images. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and
the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP’19). 2099–2108. DOI:10.18653/
v1/D19-1216
[172] Arkaitz Zubiaga, Ahmet Aker, Kalina Bontcheva, Maria Liakata, and Rob Procter. 2017. Detection and resolution of
rumours in social media: A survey. Comput. Surv. 51, 2 (2017). DOI:10.1145/3161603
[173] Arkaitz Zubiaga and Heng Ji. 2014. Tweet, but verify: Epistemic study of information verification on Twitter. Social
Netw. Anal. Mining 4, 1 (2014), 1–12. DOI:10.1007/s13278-014-0163-y
[174] Arkaitz Zubiaga, Maria Liakata, Rob Procter, Geraldine Wong Sak Hoi, and Peter Tolmie. 2016. Analysing how
people orient to and spread rumours in social media by looking at conversational threads. PLoS One 11, 3 (2016).
DOI:10.1371/journal.pone.0150989