
Journal of Ambient Intelligence and Humanized Computing (2022) 13:4663–4679

https://doi.org/10.1007/s12652-021-03512-2

ORIGINAL RESEARCH

Design of a NLP-empowered finance fraud awareness model: the anti-fraud chatbot for fraud detection and fraud classification as an instance
Jia-Wei Chang1 · Neil Yen2,3 · Jason C. Hung1

1 Department of Computer Science and Information Engineering, National Taichung University of Science and Technology, Taichung City, Taiwan
2 School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China
3 School of Computer Science and Engineering, University of Aizu, Aizuwakamatsu, Japan

* Correspondence: Neil Yen, neil219@gmail.com

Received: 30 December 2020 / Accepted: 9 September 2021 / Published online: 21 March 2022
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022

Abstract
Advanced technologies, the Internet of Things and fundamental information and communication technology frameworks in particular, facilitate information sharing. A single click on an end device can make every tool accessible to users; however, whether correct information is received remains an open question. Incorrect information, whether fake, malicious, or fraudulent, and whether spread deliberately or not, may worsen misunderstandings. To prevent such cases from escalating to the level of crime, a universal financial fraud-awareness model was designed in this study. The model first targets accurate fraud detection and classification using natural language processing techniques. An anti-fraud chatbot is then implemented as an instance of the model and deployed on a widely used social network service, LINE. This implementation aims to manage finance-fraud cases and provide anti-fraud suggestions for foreseeable fraud events. Statistics comparing Word2vec, ELMO, BERT, and DistilBERT over five conventional machine-learning models and an artificial neural network indicate that the proposed model can achieve an accuracy of over 98% when detecting potential finance-fraud cases. In addition, the more efficient models, DistilBERT with a support vector machine or a random forest, have lower computation cost and faster execution time in real applications.

Keywords Natural language processing · Fraud detection · Fraud classification · Context awareness · Machine learning ·
Smart city service

1 Introduction

With the proliferation of mobile devices and social media, people have a large number of channels through which to receive information; likewise, criminals and criminal syndicates gain illegal profits through anonymity, forgery, and untraceability. Technologies for spam filtering, intrusion detection, computer-virus and malware detection, and anti-spoofing are important for the real-time identification of malicious messages and the specific classification of fraudulent events, which provide information for decision analysis and for the operational actions performed by users in different roles.

Statistics from 2019 indicate that cybercrime resulted in financial losses of $3.5 billion, with spoofing fraud causing losses of approximately $300 million (Federal Bureau of Investigation 2020); credit card fraud, for which the popularity of electronic transactions has created new opportunities, costs companies and individuals millions of dollars (Adewumi and Akinyelu 2017). Internet and financial fraud have also caused serious financial losses and crime in Taiwan, swamping society with problems and crises. According to Public Security Ministry statistics cited by state media, China experienced over 590,000 fraudulent incidents and losses of approximately $3.4 billion in 2015 (Martina and Wu 2016). Therefore, it is very important to prevent the ever-changing financial frauds and to raise public awareness regarding protection.

Chen et al. (2017) reported that common malicious social engineering, including fake claims, credit card transactions, online purchases, and mobile payments, often occurs on smartphones. To be effective and instantaneous, it is therefore necessary to use the smartphone itself to make users aware of fraud. To detect financial fraud in social networks effectively and in real time, data-analytics techniques are important; they can be divided into two categories: statistics-based methods and machine learning-based techniques.

A large amount of resources worldwide, in many forms, has been utilized to prevent financial fraud. For instance, Taiwan has developed a number of systems to support the public's identification of fake news and incorrect information, such as Cofacts,1 MyGoPen,2 and the 165 National Fraud Prevention Network.3 In addition, people use real-time inquiry services such as Auntie Mei Yu.4 However, most of these systems focus only on fake messages or fake news and rely on manual queries or correlation with known past events to identify whether a message is fake. This approach leads to poor query efficiency and can only prevent techniques already seen in the past. Once criminals change their presentation style, such as their methods of expression and talking skills, the experience learned from past cases may not fit well or help solve the issue (Chen et al. 2017). The following open issues remain unsolved:

• Accuracy and impartiality of fraud information from limited sources are yet to be achieved. A convincing mechanism to automatically judge known frauds is still pending development, which may lead to more unexpected fraud in the gray zone.
• Fraud awareness is low because only passive search over a static information service is provided. Active push notifications are expected in the right situations.
• Imbalanced requirements on information security (for fraud prevention) and user experience (for simplicity) still exist, leading to the continuation of fraud cases.

Therefore, this study focused on preventing finance-related fraud and on developing models to identify known frauds and processes to respond to new forms of fraud. A chatbot, implemented on the social network service LINE, was used to passively handle user inquiries and give feedback with anti-fraud recommendations. Two main issues were targeted in this study: (1) designing a universal model to form fraud patterns and perform detection, and (2) implementing a chatbot to actively support specific users who are not aware of foreseeable fraud events in certain scenarios.

The rest of the paper is organized as follows. Section 2 summarizes previous works related to this study; Sect. 3 presents the proposed method and its implementation; Sect. 4 discusses the experimental design and the obtained results; and Sect. 5 provides a conclusion and reports potential research directions in the field.

1 https://cofacts.g0v.tw/
2 https://www.mygopen.com/
3 https://165.npa.gov.tw/
4 https://www.checkcheck.me/

2 Related work

The idea of word-to-vector (Word2vec) was first proposed by Mikolov et al. (2013). Word2vec is a type of unsupervised learning that follows the distributional hypothesis and is the first well-known approach to word embedding using deep neural networks. The core concept is that a target word is typically related to its adjacent words; this may be regarded as the words having similar semantic distances. Word2vec can be trained in Continuous Bag of Words (CBOW) and Skip-gram modes: (1) CBOW predicts the current word from the previous and subsequent words, and (2) Skip-gram predicts the previous and subsequent words from the current word.
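As a concrete illustration of the two training modes, consider the minimal gensim sketch below. Only the vector size (300) and min_count (1) come from this paper's Table 2; the toy corpus is illustrative, and the snippet uses the gensim 4.x parameter name `vector_size` (Table 2 lists the older `size` name).

```python
from gensim.models import Word2Vec

# Toy pre-tokenized corpus; the paper's real corpus is Chinese fraud-event
# text segmented with Jieba (illustrative data here).
sentences = [["transfer", "money", "via", "atm"],
             ["cancel", "installment", "payment", "via", "atm"],
             ["buy", "game", "points", "at", "store"]]

# sg=0 selects CBOW (predict the current word from its context);
# sg=1 selects Skip-gram (predict the context from the current word).
cbow = Word2Vec(sentences, vector_size=300, min_count=1, sg=0)
skipgram = Word2Vec(sentences, vector_size=300, min_count=1, sg=1)

print(cbow.wv["atm"].shape)                 # (300,) -> one vector per word
print(skipgram.wv.most_similar("atm", topn=2))
```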
Previous approaches have proved that Word2vec outperforms traditional textual representations. For example, Lilleberg et al. (2015) noted that text classification is widely regarded as a supervised learning task, defined as the identification of the categories of new texts based on a model trained on tagged texts. As the number of available text messages increases, it becomes more difficult to classify them correctly, and traditional textual representations (e.g., term frequency-inverse document frequency, TF-IDF) ignore semantics and word order, which may be detrimental to subsequent classification efforts. They therefore proposed weighting TF-IDF with Word2vec to generate semantic features. A support vector machine (SVM) model with the proposed semantic features achieved an accuracy of 89.72%, better than that of traditional TF-IDF with only stop words removed (89.45%). In addition, Wensen et al. (2016) combined Wikipedia and Word2vec to solve the sparsity problem of short-text classification. The results showed that the average precision, recall, and F1 score were 78% without the Word2vec extension but increased to 80% with text extension; all three measures were better than those of the traditional statistics-based Wikipedia link-based measure (WLM).

Following the idea of Word2vec, embeddings from language models (ELMO) were proposed by Peters et al. (2018). ELMO addresses a weakness of Word2vec: Word2vec returns the same vector representation for a target word in every context, although the target word may have different meanings in different contexts. Therefore, ELMO


can input the entire text into the model at once, and the model infers the semantic vector of the target word for the corresponding context. Previous studies prove that ELMO has considerable potential for textual representation. For example, Ling et al. (2020) proposed a hybrid neural network model to classify microblog posts into negative or positive sentiments. Their model first builds an input layer from the semantic vector of a pre-trained ELMO model together with POS, N-gram, and emoticon features, followed by a long short-term memory (LSTM) network and a convolutional neural network (CNN) to extract the textual representation; finally, a fully connected layer predicts the sentiment. The experiment was evaluated on a manually labeled dataset, and the best results were 85.69%, 86.33%, and 86.71% for precision, recall, and F1 score, respectively. In addition, Sun et al. (2019) proposed a deep context stacking model, which uses ELMO to generate features for the input sentences and then uses a keyword attention mechanism to determine the word weights. The experimental results show that their double-stacked long short-term memory (DS-LSTM) with ELMO achieved the best F1 score (69.44%) on the CHEMPROT database, proving that ELMO can provide very rich contextual information. Compared with ELMO and earlier word embedding models, bidirectional encoder representations from transformers (BERT) were proposed to capture longer-distance dependency relationships and comprehensive contextual information through the masked language model (Devlin et al. 2018).

The BERT model has been widely used in recent years and has been shown to be a state-of-the-art model for the downstream tasks of natural language processing (NLP). For example, Aggarwal et al. (2020) proposed a binary classification model based on BERT for fake-news recognition. The experimental results show that the BERT model achieves the best accuracy (97.021%) on the NewsFN dataset. In addition, Rexha et al. (2020) proposed a BERT-based classification model for several small datasets, including customer reviews (CR), MPQA, short movie reviews (Rt10k), and subjectivity (Subj). Compared with multinomial naïve Bayes (MNB), naïve Bayes SVM (NBSVM), and a recursive auto-encoder (RAE), the BERT-based model achieved improvements of approximately 5% in accuracy on the MPQA and Subj datasets and approximately 10% on the CR and Rt10k datasets. However, the BERT model has nearly 100 million parameters and a long training time, so it must be optimized for lower computing resources. As an improvement of the BERT model, DistilBERT applies knowledge distillation to obtain a smaller model and achieves an accuracy similar to that of BERT (Sanh et al. 2019). The architecture of DistilBERT is similar to that of BERT but requires 40% fewer parameters and 50% fewer hidden layers; the results show that DistilBERT achieves a 60% faster processing speed while retaining 97% of the performance of BERT.

From the perspective of finance-fraud prevention using NLP, several instances can be found. Hajek and Henriques (2017) mentioned that financial statement fraud has always been a concern for investors and the government; fraudulent commentary in financial statements has been observed in recent studies, and it is desirable to develop a financial fraud detection system. Their experimental process starts by assembling a dataset of nine financial variables that can identify financial irregularities, together with an MD&A textual dashboard from the manager's perspective, followed by a correlation-based feature selection method and a BestFirst search method to reduce the dimensionality, and finally fourteen learning methods for fraud classification; the highest accuracy (90.32%) was obtained by the Bayesian belief network, while the lowest accuracies were obtained by logistic regression and naïve Bayes. The study by Jurgovsky et al. (2018) mentions that credit card fraud has become a major challenge for financial institutions because of the popularity of electronic payments; fraudulent transactions are regarded as anomalous purchase behaviors, and the fraud detection problem is treated as a sequence classification task. To better distinguish between the two behaviors, LSTM is used to model the dynamic transitions between transactions, recovering the sequential structure of the customer's transaction history, alongside traditional feature engineering to characterize cardholder purchase activity. The experiment uses two months of credit card transaction records, with experts marking whether each transaction is fraudulent. The evaluation was based on the area under the precision-recall (PR) curve: random forest and LSTM obtained 0.242 and 0.236, respectively, for face-to-face transactions, and 0.404 and 0.402 for e-commerce. Thus, LSTM can improve detection accuracy for cardholders' offline purchases in stores but is not better than random forest for online transactions. Therefore, this study considers the above-mentioned issues and concentrates on the design of a fraud-awareness model to counter the spread of frauds through social network services; in contrast to the previous cases, the focus is on Chinese text.

3 Design of finance fraud-awareness model and anti-fraud chatbot

This study focused on two major parts: the finance fraud-awareness model and the Rasa chatbot module. The relationship between them, as well as the system scenario, is illustrated in Fig. 1. Bocklisch et al. (2017) developed an open-source


Fig. 1 Framework of the anti-fraud chatbot

Fig. 2 Use case of interaction with the anti-fraud chatbot

natural language dialog framework for building contextual dialogs, namely Rasa. Rasa NLU and Rasa Core are the two main components of Rasa: Rasa NLU is responsible for understanding the semantic message of a sentence, and Rasa Core is responsible for the task and strategy of the conversation. Customized functions can therefore be developed and integrated into this open-source framework.

In this study, the finance fraud-awareness model was integrated into the Rasa chatbot; the model can be divided into two components: fraud detection and fraud classification. The first component, fraud detection, classifies the user's query as legal, illegal, or pending confirmation. In terms of use case, the user inputs a query to Rasa NLU through the chatbot, as shown in Fig. 2. Rasa NLU parses the text and passes it to the fraud-detection model to determine its intent. If there is not enough information, the chatbot conducts an instructional conversation with the user to collect the required information. If the intent is legal, the chatbot replies for a legitimate situation. If the intent is illegal, the second component, fraud classification, classifies the query into one of seven known illegal events and provides a corresponding solution. If the intent is pending confirmation, the chatbot notifies the user that the event cannot be confirmed and will send further messages after human confirmation.
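This three-way routing can be summarized in a short sketch (illustrative only; `fraud_detection` and `fraud_classification` are hypothetical stand-ins for the two trained models of Sect. 3.1, and the reply strings are simplified):

```python
def fraud_detection(text: str) -> str:
    """Stub for the detection model (returns one of the three intents)."""
    return "illegal" if "ATM" in text else "pending confirmation"

def fraud_classification(text: str) -> int:
    """Stub for the classification model (returns one of the 7 event types)."""
    return 1

def handle_query(text: str) -> str:
    """Route a user query: detect the scenario, then classify illegal events."""
    intent = fraud_detection(text)
    if intent == "legal":
        return "The described situation looks legitimate."
    if intent == "illegal":
        event_type = fraud_classification(text)
        return f"Judged to be fraud (type {event_type}); please call the 165 hotline."
    return "The event cannot be confirmed; we will reply after human confirmation."

print(handle_query("Someone asked me to cancel an installment via ATM"))
```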

3.1 Model settings of fraud detection/classification

In this study, the fraud detection and fraud classification models share the same architecture, but the outputs of the two models are different. The flowchart for training the fraud detection and fraud classification models is shown in Fig. 3; the process can be divided into four steps:

Step 1. For the fraud-detection model, we defined three scenarios—legitimate, illegitimate, and pending confirmation—as the output; under the illegitimate scenario, the common fraud events are categorized into seven types for the fraud-classification model in this study.
Step 2. Delete stop words and meaningless phrases from the database to reduce the bias of unimportant words.
Step 3. Extract semantic features using word-embedding methods, including Word2vec, ELMO, BERT, and DistilBERT.
Step 4. Input the semantic features into six classification models, namely random forest, naïve Bayes, SVM, adaptive boosting (Adaboost), K-nearest neighbor (KNN), and artificial neural network (ANN).

Fig. 3 Flowchart for the training of the fraud detection and classification models

3.1.1 Fraud events collection

To build a model that can identify fraudulent behaviors, it is necessary to collect and analyze a large number of fraud events. From the collected data, we inferred that the seven most common fraud events in Taiwan are related to ATMs, shopping websites, mobile payments, and counterfeiting. Owing to the wide variety of communication channels and suspected fraudulent reasons, it is impossible to know in advance which channels and reasons fraudsters will use to commit fraud. Therefore, 12 people were assigned to vary the channels of communication and the reasons for the fraudulent behaviors, and a database of the seven types of fraud was created using spoken language. This aims to increase the generalizability of the model by having different people describe the same fraudulent behavior in different ways.

3.1.2 NLP

The first step after building the database is word slicing, which cuts each sentence into single words for subsequent analysis and processing. In this study, we used the Jieba module, an open-source Python package for Chinese based on a Trie tree structure, to segment the words in the database. The second step is stop-word processing: because the language corpus contains many unimportant words, such as tone words, numbers, and symbols, removing them reduces their influence on the model and increases the model's accuracy. Therefore, we created a stop-word list, as shown in Table 1; all stop words on this list are eliminated before proceeding to the next stage of extracting the features of an utterance.

Table 1 A sample list of Chinese stop words

$、0、1、2、3、4、5、6、7、8、9、?、_、一、一些、一何、一切、一則、一方面、一旦、一來、一樣、一般、一轉眼、不僅、不但、不光、不單、不只、不外乎、不成、不拘、不料、不是、不比、不然、不特、不獨、不管、不至於
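As an illustration of this two-step preprocessing, consider the minimal sketch below (the stop-word set is a tiny excerpt of Table 1, and the example utterance is illustrative):

```python
import jieba

# A tiny excerpt of the stop-word list in Table 1 (illustrative subset).
stop_words = set("$0123456789?_") | {"一", "一些", "不是", "不然"}

def preprocess(sentence: str) -> list[str]:
    """Word slicing with Jieba, then stop-word removal."""
    tokens = jieba.lcut(sentence)  # step 1: cut the sentence into single words
    return [t for t in tokens if t not in stop_words and t.strip()]  # step 2

# Example: "Someone asked me to cancel the installment via ATM."
print(preprocess("有人要求我使用ATM解除分期付款"))
```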
3.1.3 Semantic feature extraction

When training the subsequent classification models, the accuracy of the prediction results is typically affected by the features selected in the preceding step. In recent years, owing to the growth of deep learning, deep representation learning has become the mainstream method for generating effective features. Therefore, in this study, we begin with deep representation learning to find and compare relevant feature-extraction methods.

To learn deep representations from sentences, we extracted features from the database of fraudulent scenarios after removing stop words from each content item. Four representative embedding models were used, and experiments were conducted to determine the parameter settings of the different embedding models and to obtain the best feature-extraction method in the financial fraud domain (see Table 2).


Table 2 Parameter settings of semantic feature extraction

Word2vec: pre-processing: 1. tokenize, 2. stop-word removal; input: 985 tokenized texts; library: Gensim; settings: size: 300, min_count: 1; output: 300 dimensions
ELMO: pre-processing: 1. tokenize, 2. stop-word removal; input: 985 tokenized texts; library: TensorFlow Hub; settings: trainable: True, signature: default, as_dict: True; output: 1024 dimensions
BERT: pre-processing: none; input: 985 non-tokenized texts; library: Transformers; settings: model: Bert, tokenizer: BertTokenizer, pretrained_weights: bert-base-uncased, dropout: 0.1, dim: 768, hidden_dim: 3072; output: 768 dimensions
DistilBERT: pre-processing: none; input: 985 non-tokenized texts; library: Transformers; settings: model: DistilBert, tokenizer: DistilBertTokenizer, pretrained_weights: distilbert-base-uncased, dropout: 0.1, dim: 768, hidden_dim: 3072; output: 768 dimensions
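A minimal sketch of the DistilBERT column of Table 2 with the Transformers library follows. The pooling strategy, taking the first-token hidden state as the 768-dimensional sentence feature, is an assumption, since the paper does not state how token vectors are reduced to one sentence vector:

```python
import torch
from transformers import DistilBertModel, DistilBertTokenizer

# Settings from Table 2: distilbert-base-uncased, 768-dim hidden states.
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
model = DistilBertModel.from_pretrained("distilbert-base-uncased")
model.eval()

def sentence_features(texts):
    """Encode raw (non-tokenized) texts into 768-dim feature vectors."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (batch, seq_len, 768)
    # Assumption: use the first token's hidden state as the sentence vector.
    return hidden[:, 0, :]                         # (batch, 768)

feats = sentence_features(["someone asked me to pay a water bill with game points"])
print(feats.shape)  # torch.Size([1, 768])
```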

Table 3 Parameter settings of classification models with conventional machine learning

Input (all models): 985 texts under each of the four embeddings (Word2vec, ELMO, BERT, DistilBERT)
Random forest: sklearn RandomForestClassifier; n_estimators: [10, 25, 50, 100]
Naïve Bayes: sklearn GaussianNB
SVM: sklearn SVC; kernel: [rbf, poly, linear]; C: [1, 100]; gamma: [0.5, 1e-3, 1e-4]; degree: –
Adaboost: sklearn AdaBoostClassifier; n_estimators: [25, 50, 100]; random_state: None
KNN: sklearn KNeighborsClassifier; n_neighbors: [3, 25, 100]
Output: predict(X_test)
Evaluation: accuracy_score(y_true, y_pred)
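A minimal scikit-learn sketch of this grid search for one combination (random forest over fixed embedding features) is given below. The placeholder data, the 3-class labels, and cv=5 are assumptions; only the parameter grid of Table 3 and the 8:2 split of Sect. 4.1 come from the paper:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Placeholder features standing in for the 985 embedding vectors of
# Sect. 3.1.3 (e.g., 768-dim DistilBERT features) and scenario labels.
X = np.random.rand(985, 768)
y = np.random.randint(3, size=985)

# The paper's 8:2 training/testing split (Sect. 4.1).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)

# Grid from Table 3 for the random forest; other models are searched likewise.
grid = GridSearchCV(RandomForestClassifier(),
                    param_grid={"n_estimators": [10, 25, 50, 100]},
                    scoring="accuracy", cv=5)
grid.fit(X_train, y_train)

y_pred = grid.best_estimator_.predict(X_test)   # "Output" row of Table 3
print(grid.best_params_, accuracy_score(y_test, y_pred))  # "Evaluation" row
```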

3.1.4 Classification models

After word embedding extracts the features of the content, this study uses six classification models (random forest, naïve Bayes, SVM, Adaboost, KNN, and ANN) to perform experiments with the four different features generated by word embedding, using a grid search over the machine-learning settings. The best parameters, which determine the combination of word embedding and classifier with the highest classification accuracy, will be used as the model to identify new fraudulent behaviors in the future; the searched settings are shown in Tables 3 and 4.

Table 4 Parameter settings of classification models with deep learning

ANN: Keras; input: 985 texts under each of the four embeddings (Word2vec, ELMO, BERT, DistilBERT); input_shape: the word-embedding output dimension; activation: [relu, linear]; layers: [2, 3]; batch_size: 128; epochs: 500; output: val_loss, val_accuracy; evaluation: validation_data = (X_test, y_test)
epochs: 500
3.2 Case study Output val_loss, val_accuracy
Evaluation validation_data = (X_test, y_test)
3.2.1 Selection of data source from fraud open platforms

In addition to financial fraud information reported by users, we collect data from various public websites to accurately determine whether users have been victims of fraud. The following channels are used to collect fraud events.


Fig. 4 Rasa NLU and Rasa Core flowchart

Fig. 5 Intent and entity identification from conversations by Rasa NLU

1. Users report fraud events through the LINE chatbot: A reporting mechanism is established to allow users to proactively provide information about the fraud events they encounter. Once submitted, a report is manually verified; if it is determined to be fraudulent, the event is included in the database for model training and is scored based on the frequency of queries. Once an event's score rises above a predetermined threshold, a proactive push notification is made to prevent further users from being scammed.
2. Cofacts database: The Cofacts5 database classifies all events into four categories. Each event contains the original message and the existing responses, with the reason for the fraud mark and its source URL. The system's editors also conduct checks to increase the credibility of the content. Because the database contains rumors and fake news, information that is not related to financial fraud is excluded before data collection.
3. National Police Agency, Ministry of the Interior: The National Police Agency mainly provides information such as news, rumor dispelling, high-risk stores, fraudulent phone calls, and anti-fraud propaganda on the 165 National Fraud Prevention Network.6 The website mainly serves the public with fraud-prevention propaganda; thus, it uses easy-to-understand text with comic-style descriptions. As the website is run by a government agency, it is more reliable than other websites.
3.2.2 Implementation of the anti-financial fraud robot

To build an anti-financial fraud chatbot, it is necessary to select a backend model that meets the requirements of this study and to connect it to the LINE chatbot as the frontend channel, because the user's conversations must be processed in natural language. In this study, the open-source dialog framework "Rasa" is applied as the backend basis for the chatbot (Bocklisch et al. 2017).

Rasa consists of two modules. Rasa NLU (the blue part in Fig. 4) is used to understand the user's semantics, identify the intent and entities in sentences, and convert them into structured data for Rasa Core. Rasa Core (the green part in Fig. 4) is a conversation-management platform that predicts the most appropriate next actions based on the conversation history.

Rasa NLU represents sentences as vectors using pre-trained word embeddings so that the trained intent classifier remains robust. After organizing the aggregation platform, all the mappings and corresponding keyword results obtained in this study were created in Rasa NLU to help the chatbot identify and categorize user narratives. Figure 5 shows the most trusted intents and keywords selected by Rasa NLU when it receives user messages. Therefore, the proposed fraud detection and classification models can be integrated into Rasa NLU.

In the language-generation part, responses are generated from multiple templates, which is an easier and more reliable method than using neural networks to generate responses (Wen et al. 2015).

A story is the conversation flow between the chatbot and a user under a specific scenario, and story definitions are processed in Rasa Core. A variety of scenarios are required to guide the chatbot. For example, the design of a dialogue flow that greets the user, where the response may be positive or negative, is visualized in Fig. 6 (blue: intent; white: reply message).

In addition, a "domain" refers to the knowledge base of a chatbot; it defines the intent and entity categories, the messages to respond with, the information that needs to be collected, and the fallback action.

5 https://cofacts.g0v.tw/
6 https://165.npa.gov.tw/


Fig. 6 Conversation flowchart in a story definition

To classify accurately and respond to different user inputs, it is necessary to conceive the conversation scenarios and flows between the chatbot and its users. In this study, we classified the user's story into the following categories: beginning and ending greetings, legal contexts, illegal contexts, pending-confirmation contexts, new-event contexts, and non-interrogative-intention contexts. When the user query is identified as a legal intent, the following contextual flow is designed to respond to the legitimate situation, as shown in Fig. 7.

Fig. 7 Examples of the replies to legal intent

From the collected data, there are many keywords considered to indicate financial fraud in Taiwan; once these keywords appear in the content, or if certain unreasonable keyword combinations occur, the content can be determined to be fraudulent, as shown in Fig. 8. The linguistic content of illegal situations is listed in the introduction to the dataset.

3.2.3 Implementation of fraud detection

To develop an intention-discrimination model for the financial fraud-prevention chatbot, this study organized and analyzed the fraud event-aggregation platform and classified the financial-related intentions entered by users into three categories: legal, illegal, and pending confirmation. The sub-categories of each category are listed, and these cases are used to train the fraud-detection model. Each of the three categories is defined below, and detailed information is provided in the introduction to the dataset.

1. Legal: By organizing the collected fraud data to list the legitimate financial content that people often ask about, the fraud-detection model determines a situation as legitimate only if the user explicitly mentions a legitimate type of language. Since some frauds use the name of a government agency or a corporation to create content that is not within the scope of the entity's business or that cannot co-occur with the entity, this study defines a legitimate situation as one in which a legitimate entity is doing something legitimate within its purview. It is important to note that the suspected fraudulent reasons are those that have actually occurred in Taiwan, but if they occur with the corresponding legal entity, they will still be considered legal. Exceptions are explained in detail below under the illegal category.
2. Illegal: The illegal category contains, in addition to the common financial frauds in Taiwan, a list of multiple-intent combinations involving game points. Typically, normal financial-related intentions belong to the legal category, but when two normal intentions appear at the same time, they may turn into an illegal intention. For example, water fees and game points each belong to the legal category, but the situation of "using game points to pay a water fee" is unreasonable, so it changes from the legal category to the illegal category. Therefore, this study treats game points as a special behavior that needs to be judged separately and defines illegal situations as


Fig. 8 Examples of the replies for illegal contexts

illegal keywords collated from the fraud-aggregation platform and as legitimate entities doing illegal business or business outside their scope.
3. Pending confirmation: By analyzing Taiwan's recent fraud profile from the fraud event-aggregation platform, we know that the reasons for financial fraud are varied and unpredictable; thus, it is not possible to train a model on the fraud reasons alone. However, this study found that when a fraudster finally wants to transfer the victim's money, the most common channels are ATMs, counters, shopping sites, mobile payments, etc.; thus, if these intentions can be detected first, it is possible to trace the event back to its beginning. In this study, a pending-confirmation situation is defined as one in which the user mentions only financial keywords, channels, and reasons but does not specify the person, object, or reason behind those keywords; such a situation requires further questioning by the fraud-detection model.

4 Experiment and statistics

The experimental results of this study first present a self-created dataset, which is used to evaluate the accuracy of the proposed method. The first part reports experiments on the detection of user-entered events by the fraud-detection model, and the second part reports experiments on the effectiveness of the fraud-classification model when classifying the seven known fraudulent behaviors.

4.1 Dataset

Since no literature or research provides a Chinese-language collection of financial fraud events, this study built such a dataset through a pooled platform. There are three major categories: 625 legal, 674 pending-confirmation, and 985 illegal (including the seven types of fraud events) situations; the data were divided into training and testing sets at a ratio of 8:2 to provide a basis for the subsequent validation of each model. Samples of the seven types of fraud events are presented in Table 15 in the Appendix for further reference.

4.2 Fraud-detection model

This study uses a grid search to achieve the best accuracy for each combination and to find the best parameters over all combinations, as shown in Table 5.

A user starts by describing a situation to the chatbot, and the fraud-detection model extracts features and classifies the content provided by the user into one of the three scenarios. In this study, we experimented with the four word-embedding methods and the six classification methods and cross-examined which combination has the highest classification accuracy. As shown in Table 6, DistilBERT with random forest had the highest accuracy (98.7%), and DistilBERT had the highest average accuracy among the feature-extraction methods.

The execution time was also calculated and averaged for all combinations of fraud-detection models, as shown in Table 7. It was found that the average execution time of DistilBERT was nearly half that of BERT.

This study further analyzes the classifications of the best fraud-detection model (DistilBERT embedding with the random forest classifier) using a confusion matrix, as shown in Table 8. It can be observed that a few legal and illegal cases are categorized as pending confirmation, illustrating that legal and illegal situations are easily confounded with pending situations, which lack explicit keywords or directions. The final classification report for the fraud-detection model is presented in Table 9.

Table 5 Parameter optimization for fraud detection by grid search according to accuracy

Word embedding Classifier Parameters Accuracy
Word2vec Random Forest n_estimators: 10 0.881 (±0.141)
n_estimators: 25 0.910 (±0.072)
n_estimators: 50 0.899 (±0.104)
n_estimators: 100 0.911 (±0.100)
Naïve Bayes none 0.742 (±0.105)
SVM kernel: linear 0.757 (±0.167)
kernel: poly 0.813 (±0.152)
C: 1, gamma: 0.001, kernel: rbf 0.152 (±0.006)
C: 1, gamma: 0.0001, kernel: rbf 0.152 (±0.006)
C: 100, gamma: 0.001, kernel: rbf 0.311 (±0.260)
C: 100, gamma: 0.0001, kernel: rbf 0.152 (±0.006)
Adaboost n_estimators: 25 0.824 (±0.078)
n_estimators: 50 0.776 (±0.152)
n_estimators: 100 0.767 (±0.148)
KNN n_neighbors: 3 0.805 (±0.091)
n_neighbors: 25 0.774 (±0.130)
n_neighbors: 100 0.757 (±0.125)
ANN layer: 2, activation: relu 0.909
layer: 2, activation: linear 0.895
layer: 3, activation: relu 0.908
layer: 3, activation: linear 0.838
ELMO Random Forest n_estimators: 10 0.926 (±0.039)
n_estimators: 25 0.957 (±0.022)
n_estimators: 50 0.962 (±0.026)
n_estimators: 100 0.965 (±0.025)
Naïve Bayes none 0.814 (±0.069)
SVM kernel: linear 0.968 (±0.016)
kernel: poly 0.938 (±0.023)
C: 1, gamma: 0.001, kernel: rbf 0.939 (±0.030)
C: 1, gamma: 0.0001, kernel: rbf 0.831 (±0.031)
C: 100, gamma: 0.001, kernel: rbf 0.979 (±0.018)
C: 100, gamma: 0.0001, kernel: rbf 0.935 (±0.033)
Adaboost n_estimators: 25 0.815 (±0.054)
n_estimators: 50 0.867 (±0.049)
n_estimators: 100 0.895 (±0.043)
KNN n_neighbors: 3 0.955 (±0.036)
n_neighbors: 25 0.890 (±0.057)
n_neighbors: 100 0.822 (±0.073)
ANN layer: 2, activation: relu 0.979
layer: 2, activation: linear 0.963
layer: 3, activation: relu 0.975
layer: 3, activation: linear 0.965


Table 5 (continued)

Word embedding Classifier Parameters Accuracy

BERT Random Forest n_estimators: 10 0.968 (±0.022)


n_estimators: 25 0.975 (±0.025)
n_estimators: 50 0.982 (±0.014)
n_estimators: 100 0.981 (±0.020)
Naïve Bayes default 0.749 (±0.046)
SVM kernel: linear 0.984 (±0.015)
kernel: poly 0.952 (±0.041)
C: 1, gamma: 0.001, kernel: rbf 0.741 (±0.056)
C: 1, gamma: 0.0001, kernel: rbf 0.143 (±0.002)
C: 100, gamma: 0.001, kernel: rbf 0.986 (±0.016)
C: 100, gamma: 0.0001, kernel: rbf 0.900 (±0.053)
Adaboost n_estimators: 25 0.890 (±0.071)
n_estimators: 50 0.910 (±0.041)
n_estimators: 100 0.921 (±0.031)
KNN n_neighbors: 3 0.980 (±0.022)
n_neighbors: 25 0.901 (±0.068)
n_neighbors: 100 0.801 (±0.049)
ANN layer: 2, activation: relu 0.985
layer: 2, activation: linear 0.982
layer: 3, activation: relu 0.980
layer: 3, activation: linear 0.422
DistilBERT Random Forest n_estimators: 10 0.970 (±0.027)
n_estimators: 25 0.977 (±0.019)
n_estimators: 50 0.987 (±0.013)
n_estimators: 100 0.984 (±0.019)
Naïve Bayes default 0.827 (±0.050)
SVM kernel: linear 0.969 (±0.019)
kernel: poly 0.912 (±0.059)
C: 1, gamma: 0.001, kernel: rbf 0.143 (±0.002)
C: 1, gamma: 0.0001, kernel: rbf 0.143 (±0.002)
C: 100, gamma: 0.001, kernel: rbf 0.976 (±0.020)
C: 100, gamma: 0.0001, kernel: rbf 0.834 (±0.056)
Adaboost n_estimators: 25 0.874 (±0.058)
n_estimators: 50 0.916 (±0.026)
n_estimators: 100 0.924 (±0.027)
KNN n_neighbors: 3 0.983 (±0.021)
n_neighbors: 25 0.931 (±0.033)
n_neighbors: 100 0.836 (±0.047)
ANN layer: 2, activation: relu 0.983
layer: 2, activation: linear 0.981
layer: 3, activation: relu 0.979
layer: 3, activation: linear 0.975


Table 6 Classification accuracy of fraud detection (rows: embeddings; columns: classifiers)

Embeddings Random forest Naïve Bayes SVM Adaboost KNN ANN Average
Word2vec 91.1% 74.2% 81.3% 82.4% 80.5% 90.9% 83.4%
(±0.1) (±0.1) (±0.15) (±0.07) (±0.09)
ELMO 96.5% 81.4% 97.9% 89.5% 95.5% 97.9% 93.1%
(±0.02) (±0.06) (±0.01) (±0.04) (±0.03)
BERT 98.2% 74.9% 98.6% 92.1% 98% 98.5% 93.3%
(±0.01) (±0.04) (±0.01) (±0.03) (±0.02)
DistilBERT 98.7% 82.7% 97.6% 92.4% 98.3% 98.3% 94.6%
(±0.01) (±0.05) (±0.02) (±0.02) (±0.02)
Average 96.1% 78.3% 93.8% 89.1% 93% 96.4% –

Table 7 Execution time of the fraud-detection models (unit: second)

Embeddings Random Forest Naïve Bayes SVM Adaboost KNN ANN Average

Word2vec 5.24 3.64 4.72 7.19 3.78 34.91 9.91


ELMO 19.12 15.95 20.71 29.42 17.45 277.8 63.4
BERT 401.19 403.14 408.19 408.93 411.03 492.32 420.8
DistilBERT 199.82 201.14 203.81 199.23 194.97 324.91 220.64
Average 156.34 155.96 159.35 161.19 156.8 282.48 –
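Timings of this kind can be collected with a simple wall-clock wrapper (an illustrative helper; `build_features` and `classifier` are hypothetical stand-ins for each embedding/classifier pair, since the paper does not show its measurement code):

```python
import time

def timed_fit(build_features, classifier, texts, labels):
    """Wall-clock one embedding+classifier combination, as in Tables 7 and 12."""
    start = time.perf_counter()
    X = build_features(texts)   # feature-extraction step (e.g., DistilBERT)
    classifier.fit(X, labels)   # training step
    return time.perf_counter() - start  # elapsed seconds
```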

Table 8 Confusion matrix for the best fraud-detection model (DistilBERT + Random Forest)

Prediction/actual Legal Illegal Pending confirmation
Legal 125 0 3
Illegal 0 187 3
Pending confirmation 0 1 138

Table 9 Classification results for the best fraud-detection model (DistilBERT + Random Forest)

Precision Recall F1-score
Legal 1.00 0.98 0.99
Pending confirmation 0.96 0.99 0.98
Illegal 0.99 0.98 0.99
Accuracy 0.98

The overall model had an accuracy of 98%. Although the proposed model achieved excellent performance in fraud detection, we further performed an error analysis using Table 8. The number of cases falsely identified as "Pending Confirmation" is relatively high, so the precision of "Pending Confirmation" is low. However, the sensitivity of the fraud-detection model is more important, so error cases that become false positives are preferable to false negatives, because more false negatives lower the sensitivity of the fraud-detection model. In the false-positive cases, another round of dialogue is activated to collect more detailed information, but users suffer no financial damage; in contrast, false-negative cases can cause users to be deceived and suffer financial losses. In Table 8, most error cases of the proposed model are false positives, and not a single illegal event was classified as a legal event. The results show that the proposed model addresses the most important research problem in fraud detection.
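Tables 8 and 9 can be reproduced with standard scikit-learn utilities (a minimal sketch; the four-sample `y_test`/`y_pred` lists are illustrative stand-ins for the real test-set labels and the DistilBERT + random forest predictions):

```python
from sklearn.metrics import classification_report, confusion_matrix

labels = ["Legal", "Illegal", "Pending confirmation"]
# Illustrative stand-ins for the ground truth and model predictions.
y_test = ["Legal", "Illegal", "Pending confirmation", "Legal"]
y_pred = ["Legal", "Illegal", "Pending confirmation", "Pending confirmation"]

print(confusion_matrix(y_test, y_pred, labels=labels))       # cf. Table 8
print(classification_report(y_test, y_pred, labels=labels))  # cf. Table 9
```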
4.3 Fraud-event classification

This study uses a grid search to optimize the accuracy of each combination and finds the best parameters with the highest accuracy, as shown in Table 10.

When the fraud-detection model classifies the user's query context as illegal, the system should accurately provide the most appropriate response to the user; the fraud event-classification model must therefore be able to identify the type of illegal event from the user query. The four semantic extraction methods and the six classification models were again cross-examined to train the best fraud event-classification model in this study. As shown in Table 11, BERT with SVM has the highest accuracy (98.4%).

An experiment was conducted to verify the execution time of all classification combinations. The results are presented in Table 12; the average time for DistilBERT was again significantly lower than that for BERT.

An evaluation of BERT paired with an SVM was conducted for the fraud-classification model; the confusion matrix of its classifications is shown in Table 13.


Table 10 Parameter optimization for fraud-event classification by grid search according to accuracy

Word embedding Classifier Parameters Accuracy
Word2vec Random Forest n_estimators: 10 0.892 (±0.077)
n_estimators: 25 0.931 (±0.068)
n_estimators: 50 0.937 (±0.072)
n_estimators: 100 0.938 (±0.077)
Naïve Bayes none 0.468 (±0.165)
SVM kernel: linear 0.305 (±0.059)
kernel: poly 0.378 (±0.217)
C: 1, gamma: 0.001, kernel: rbf 0.143 (±0.000)
C: 1, gamma: 0.0001, kernel: rbf 0.143 (±0.000)
C: 100, gamma: 0.001, kernel: rbf 0.227 (±0.041)
C: 100, gamma: 0.0001, kernel: rbf 0.143 (±0.000)
Adaboost n_estimators: 25 0.541 (±0.268)
n_estimators: 50 0.718 (±0.249)
n_estimators: 100 0.841 (±0.109)
KNN n_neighbors: 3 0.766 (±0.129)
n_neighbors: 25 0.515 (±0.153)
n_neighbors: 100 0.244 (±0.115)
ANN layer: 2, activation: relu 0.891
layer: 2, activation: linear 0.897
layer: 3, activation: relu 0.934
layer: 3, activation: linear 0.959
ELMO Random Forest n_estimators: 10 0.799 (±0.097)
n_estimators: 25 0.836 (±0.069)
n_estimators: 50 0.878 (±0.064)
n_estimators: 100 0.898 (±0.049)
Naïve Bayes none 0.808 (±0.089)
SVM kernel: linear 0.911 (±0.069)
kernel: poly 0.862 (±0.063)
C: 1, gamma: 0.001, kernel: rbf 0.843 (±0.068)
C: 1, gamma: 0.0001, kernel: rbf 0.246 (±0.100)
C: 100, gamma: 0.001, kernel: rbf 0.917 (±0.066)
C: 100, gamma: 0.0001, kernel: rbf 0.908 (±0.072)
Adaboost n_estimators: 25 0.595 (±0.228)
n_estimators: 50 0.708 (±0.123)
n_estimators: 100 0.703 (±0.178)
KNN n_neighbors: 3 0.882 (±0.065)
n_neighbors: 25 0.777 (±0.108)
n_neighbors: 100 0.609 (±0.171)
ANN layer: 2, activation: relu 0.925 (±0.046)
layer: 2, activation: linear 0.912 (±0.058)
layer: 3, activation: relu 0.916 (±0.044)
layer: 3, activation: linear 0.905 (±0.038)


Table 10 (continued)

Word embedding Classifier Parameters Accuracy

BERT Random Forest n_estimators: 10 0.929 (±0.048)


n_estimators: 25 0.949 (±0.048)
n_estimators: 50 0.957 (±0.058)
n_estimators: 100 0.962 (±0.044)
Naïve Bayes none 0.815 (±0.123)
SVM kernel: linear 0.984 (±0.035)
kernel: poly 0.930 (±0.066)
C: 1, gamma: 0.001, kernel: rbf 0.239 (±0.017)
C: 1, gamma: 0.0001, kernel: rbf 0.032 (±0.002)
C: 100, gamma: 0.001, kernel: rbf 0.980 (±0.256)
C: 100, gamma: 0.0001, kernel: rbf 0.956 (±0.046)
Adaboost n_estimators: 25 0.662 (±0.156)
n_estimators: 50 0.751 (±0.184)
n_estimators: 100 0.815 (±0.090)
KNN n_neighbors: 3 0.952 (±0.043)
n_neighbors: 25 0.843 (±0.098)
n_neighbors: 100 0.689 (±0.209)
ANN layer: 2, activation: relu 0.970
layer: 2, activation: linear 0.968
layer: 3, activation: relu 0.979
layer: 3, activation: linear 0.972
DistilBERT Random Forest n_estimators: 10 0.945 (±0.053)
n_estimators: 25 0.965 (±0.030)
n_estimators: 50 0.967 (±0.038)
n_estimators: 100 0.974 (±0.042)
Naïve Bayes none 0.833 (±0.072)
SVM kernel: linear 0.954 (±0.057)
kernel: poly 0.760 (±0.120)
C: 1, gamma: 0.001, kernel: rbf 0.143 (±0.000)
C: 1, gamma: 0.0001, kernel: rbf 0.143 (±0.000)
C: 100, gamma: 0.001, kernel: rbf 0.972 (±0.051)
C: 100, gamma: 0.0001, kernel: rbf 0.513 (±0.045)
Adaboost n_estimators: 25 0.688 (±0.035)
n_estimators: 50 0.804 (±0.055)
n_estimators: 100 0.832 (±0.045)
KNN n_neighbors: 3 0.957 (±0.032)
n_neighbors: 25 0.894 (±0.055)
n_neighbors: 100 0.759 (±0.091)
ANN layer: 2, activation: relu 0.975
layer: 2, activation: linear 0.968
layer: 3, activation: relu 0.974
layer: 3, activation: linear 0.981


Table 11 Classification accuracy of the fraud event-classification models (rows: embeddings; columns: classifiers)

Embeddings Random Forest Naïve Bayes SVM Adaboost KNN ANN Average
Word2vec 93.8% 46.8% 37.8% 84.1% 76.6% 95.9% 72.5%
(±0.07) (±0.16) (±0.21) (±0.1) (±0.12)
ELMO 89.8% 80.8% 91.7% 70.8% 88.2% 92.5% 85.6%
(±0.04) (±0.08) (±0.06) (±0.12) (±0.06)
BERT 96.2% 81.5% 98.4% 81.5% 95.2% 97.9% 91.3%
(±0.04) (±0.12) (±0.03) (±0.09) (±0.04)
DistilBERT 97.4% 83.3% 97.2% 83.2% 95.7% 98.1% 92.4%
(±0.04) (±0.07) (±0.05) (±0.04) (±0.03)
Average 94.3% 73.1% 81.2% 79.9% 88.9% 95.4% –

Table 12 Execution time of the fraud event-classification models (unit: second)

Embeddings Random forest Naïve Bayes SVM Adaboost KNN ANN Average
Word2vec 2.9 4.19 4.72 3.53 3.78 36.31 9.23
ELMO 20.35 16.44 15.25 19.89 14.89 274.48 60.21
BERT 218.76 203.06 206.89 208.81 205.26 363.14 234.32
DistilBERT 135.94 111.98 112.74 115.87 111.42 251.3 139.87
Average 94.48 83.91 84.9 87.02 83.83 231.3 –

Table 13 Confusion matrix for the best fraud event-classification model (BERT+SVM)

Prediction/actual Type 1 Type 2 Type 3 Type 4 Type 5 Type 6 Type 7
Type 1 20 0 0 0 1 0 0
Type 2 0 20 0 0 0 0 0
Type 3 0 0 10 0 0 0 0
Type 4 0 0 0 19 0 0 9
Type 5 0 1 0 0 35 0 0
Type 6 0 0 0 0 0 47 0
Type 7 0 0 0 0 0 0 44

Table 14 Classification results of the best fraud event-classification model (BERT+SVM)

Precision Recall F1-score
Type 1 1.00 0.95 0.98
Type 2 0.95 1.00 0.98
Type 3 1.00 1.00 1.00
Type 4 1.00 1.00 1.00
Type 5 0.97 0.97 0.97
Type 6 1.00 1.00 1.00
Type 7 1.00 1.00 1.00
Accuracy 0.99

It was found that only two of the seven fraudulent behaviors were misclassified, which falls within the average tolerance range of misclassification.

This study also evaluates the classification results of the fraud event-classification model, as shown in Table 14. The overall accuracy of the model is 99%. The fraud event-classification model further classifies a known fraud behavior only after it has already been determined to be fraud, and then provides different ways of responding to it; regardless of the type of fraudulent behavior, the response is "Judged to be fraud" and "Recommended to use the hotline of the 165 National Fraud Prevention Network." Therefore, this study concludes that misclassification is not likely to cause a serious financial loss once the user has been explicitly informed of the fraud. In addition, it can be seen that the precision and recall of the fraud-classification model are equally important, and it is necessary to increase both.

5 Conclusion

The proposed finance fraud-awareness model applies feature extraction and classification techniques to process user-generated content and supports the identification of currently known finance-fraud cases.


The experimental results show that the accuracy of the fraud-detection model reaches 98.7% using DistilBERT with random forest, and that of the fraud-classification model reaches 98.4% using BERT with an SVM.

Interactions between users and chatbots are instantaneous, so shortening the response time while maintaining a certain accuracy becomes an important issue. In this study, with respect to the real-world use scenario, users' network connection status and hardware device situation were not considered; hence, the question is how high the accuracy of the proposed model can reach when these interaction factors are set aside. According to the experimental results, the accuracies of BERT and DistilBERT were comparable; however, DistilBERT is more suitable for text-feature extraction owing to efficiency concerns. In addition, Sanh et al. (2019) indicated that DistilBERT not only retains approximately 97% of BERT's performance but also requires approximately 40% fewer parameters; moreover, DistilBERT achieves an approximately 60% higher processing speed than BERT and is 30% smaller and 83% faster than ELMO. Therefore, DistilBERT could achieve both high performance and efficiency, with fewer parameters and faster speed, while maintaining the expected performance in this study. The results verify that DistilBERT was more suitable as a text feature-extraction method than BERT in this study and support the choice of DistilBERT as the feature-extraction method for both models in real applications.

In terms of classifiers, it was found that the more categories there are, the longer the classification takes, because random forest is a tree-based classification method. Therefore, when considering efficiency, it was concluded that the number of categories may decide whether the random forest method or the SVM method should be applied. According to the experimental results, the detection model for classifying three scenarios and the classification model for classifying seven fraud scenarios use random forest and SVM, respectively, indicating that the efficiencies can be excellent in both fraud detection and classification. Therefore, the entire research framework can use an efficient and effective method from text-feature extraction to text classification.

Appendix

See Table 15.

Table 15  Sample Fraud Events and Categories


Major category | Secondary category | Sample

Legal | legal organization + regular event of this organization | 我昨天收到一個交通部送來的裁罰通知 (I received a fine ticket from the Ministry of Transport yesterday.)
Legal | legal organization + regular behavior to this organization | 朋友昨天去附近小七買遊戲點數 (My friend bought game points at a nearby 7-ELEVEN yesterday.)
Illegal | suspected person + regular behavior | 昨天電話中有人要求我使用ATM解除分期付款 (Someone asked me in a phone call yesterday to cancel an installment payment via ATM.)
Illegal | suspected person + legal organization | 有自稱是OO銀行的稽核員要求我將伍萬新臺幣匯給他, 他就不會凍結我的銀行帳戶。 (Someone claiming to be an auditor from OO bank asked me to remit NT$50,000 to him; then he would not freeze my bank account.)
Illegal | suspected organization + regular behavior | 我沒有在OO商家買過東西, 但該店店員說我買了某商品所以得到了額外的獎金。 (I haven't bought anything at the OO store, but the staff of this store said that I bought something and won an extra prize.)
Illegal | suspected organization + legal person | 我從未聽過OO保險公司, 但是他們的業務主任卻說我這邊有一張過期的保單。 (I have never heard of OO insurance company, but the business director of this company said that I have an expired insurance policy.)
Illegal | suspected behavior to this organization + legal organization | 我朋友收到一個交通部裁罰但是要用遊戲點數繳交唉 (My friend received a traffic fine notification saying that she can pay it with game points.)
Illegal | suspected behavior to this person + legal person | 今天早上有個警察到我家來, 要求我將信用卡交給他保管 (A policeman came to my house this morning and asked me to hand over my credit card to him for safekeeping.)
Illegal | other fraud activities | 之前看到抽獎活動, 填完個人資料後, 客服就通知我中獎, 只要匯款繳交保證金就好, 然後我就聯絡不上賣家 (After I saw a lottery event and filled in my personal data, customer service told me I had won and only needed to remit a guarantee fee; afterwards, I could no longer contact the seller.)
Pending confirmation | without critical information | 我今天早上收到一則簡訊通知 (I received a text message notification this morning.)


Acknowledgements Special thanks to Mr. Hou-Hsun Wang for his assistance in the development of the programming for this study.

Funding This work was partially supported by the Ministry of Science and Technology, Taiwan, R.O.C. [grant number MOST 108-2218-E-025-002-MY3].

Availability of data and material Data, including the experiment dataset, is accessible upon reasonable request.

Code availability Code will be available upon reasonable request after the patent is filed.

Declarations

Conflicts of interest The authors declare that there are no conflicts of interest or competing interests.

Consent to participate The authors are aware of everything related to this submitted work.

Consent for publication The authors are aware of the submission of this work for publication in the Journal of Ambient Intelligence and Humanized Computing.

References

Adewumi AO, Akinyelu AA (2017) A survey of machine-learning and nature-inspired based credit card fraud detection techniques. Int J Syst Assur Eng Manage 8:937–953
Aggarwal A, Chauhan A, Kumar D, Mittal M, Verma S (2020) Classification of fake news by fine-tuning deep bidirectional transformers based language model. EAI Endorsed Trans Scalable Inf Syst 7(27):1–12
Bocklisch T, Faulkner J, Pawlowski N, Nichol A (2017) Rasa: Open source language understanding and dialogue management. arXiv preprint arXiv:1712.05181
Chen LC, Hsu CL, Lo NW, Yeh KH, Lin PH (2017) Fraud analysis and detection for real-time messaging communications on social networks. IEICE Trans Inf Syst 100:2267–2274
Devlin J, Chang MW, Lee K, Toutanova K (2018) BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805
Hajek P, Henriques R (2017) Mining corporate annual reports for intelligent detection of financial statement fraud – A comparative study of machine learning methods. Knowledge-Based Syst 128:139–152
2019 Internet Crime Report Released (2019) https://www.fbi.gov/news/stories/2019-internet-crime-report-released-021120. Accessed 28 May 2020
Jurgovsky J, Granitzer M, Ziegler K, Calabretto S, Portier PE, He-Guelton L, Caelen O (2018) Sequence classification for credit-card fraud detection. Expert Syst Appl 100:234–245
Lilleberg J, Zhu Y, Zhang Y (2015) Support vector machines and Word2vec for text classification with semantic features. In: 2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC), pp 136–140, 6–8 Jul 2015, Beijing, China
Ling M, Chen Q, Sun Q, Jia Y (2020) Hybrid neural network for Sina Weibo sentiment analysis. IEEE Trans Comput Social Syst 7(4):983–990
Martina M, Wu JR (2016) China blames Taiwan criminals for surge in telephone scams. Reuters. https://www.reuters.com/article/us-china-telecoms-fraud-idUSKCN0XJ022. Accessed 22 April 2016
Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781
Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. arXiv preprint arXiv:1802.05365
Rexha A, Dragoni M, Kern R (2020) A neural-based architecture for small datasets classification. In: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL '20), pp 319–327, 1–5 Aug 2020, China
Sanh V, Debut L, Chaumond J, Wolf T (2019) DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108
Sun C, Yang Z, Luo L, Wang L, Zhang Y, Lin H, Wang J (2019) A deep learning approach with deep contextualized word representations for chemical-protein interaction extraction from biomedical literature. IEEE Access 7:151034–151046
Wen TH, Gasic M, Mrksic N, Su PH, Vandyke D, Young S (2015) Semantically conditioned LSTM-based natural language generation for spoken dialogue systems. arXiv preprint arXiv:1508.01745
Wensen L, Zewen C, Jun W, Xiaoyi W (2016) Short text classification based on Wikipedia and Word2vec. In: 2016 2nd IEEE International Conference on Computer and Communications (ICCC), pp 1195–1200, 14–17 Oct 2016, Chengdu, China

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.