Credit Card Fraud Detection Using Deep Learning
Abstract—Frauds have no constant patterns; fraudsters continually change their behavior, so we need to use unsupervised learning. Fraudsters learn about new technology that allows them to execute frauds through online transactions. Fraudsters assume the regular behavior of consumers, and fraud patterns change fast. Fraud detection systems therefore need to detect online transactions using unsupervised learning, because some fraudsters commit fraud once through online mediums and then switch to other techniques. This paper aims to 1) focus on fraud cases that cannot be detected based on previous history or supervised learning, and 2) create a model of a deep auto-encoder and restricted Boltzmann machine (RBM) that can reconstruct normal transactions to find anomalies in normal patterns. The proposed deep learning model based on the auto-encoder (AE) is an unsupervised learning algorithm that applies backpropagation by setting the inputs equal to the outputs. The RBM has two layers: the input (visible) layer and the hidden layer. In this research, we use the Tensorflow library from Google to implement the AE and RBM, and the H2O framework for deep learning. The results report the mean squared error, root mean squared error, and area under the curve.

Keywords—Credit card; fraud detection; deep learning; unsupervised learning; auto-encoder; restricted Boltzmann machine; Tensorflow

I. INTRODUCTION

Fraud detection in online shopping systems is a very active topic nowadays. Fraud investigators, banking systems, and electronic payment systems such as PayPal must have efficient and complex fraud detection systems to prevent fraud activities that change rapidly. According to a CyberSource report from 2017, fraud loss by order channel, that is, the percentage of fraud loss, was 74 percent in web stores and 49 percent in mobile channels [1]. Based on this information, the lesson is to detect anomalies across patterns of fraud behavior that have changed relative to the past.

A good fraud detection system should be able to identify fraudulent transactions accurately and should make detection possible in real-time transactions. Fraud detection can be divided into two groups: anomaly detection and misuse detection [2]. Anomaly detection systems are trained on normal transactions and use techniques to determine novel frauds. Conversely, a misuse fraud detection system is trained on transactions labeled as normal or fraudulent in the database history. Thus, misuse detection entails a system of supervised learning, while anomaly detection entails a system of unsupervised learning. What is the difference between supervised learning and unsupervised learning? The answer is that supervised learning studies labeled datasets: the labeled data are used to train a model and to render it accurate by tuning parameters such as the learning rate. Techniques that implement supervised learning, such as the multilayer perceptron (MLP), then apply those parameters to build a model based on the history in the database. Supervised learning has a disadvantage: if a new fraudulent transaction occurs that does not match the records in the database, the transaction will be considered genuine. Unsupervised learning, in contrast, acquires information from new transactions and finds anomalous patterns among them. Unsupervised learning is more difficult than supervised learning, because we have to use appropriate techniques to detect anomalous behavior.

Neural networks were introduced to detect credit card fraud in the past. Now, we focus on deep learning, a subfield of machine learning (ML). In its first period, deep learning was used mainly for image processing; for example, Facebook uses deep learning to tag people and to recognize who a person is for subsequent reference. Deep learning in neural networks offers many algorithms for use in fraud detection, but in this paper we selected the AE and RBM to detect whether normal transactions in the datasets qualify as novel frauds. We believe that some transactions labeled as normal in the datasets also show suspicious transaction behavior. So, in this paper we focus on unsupervised learning.

In this paper, we use three datasets in the experiments: the German, Australian, and European datasets [4], [3], [18]. The first dataset is the German dataset, provided by Professor Dr. Hans Hofmann [4]. It has twenty attributes that describe creditworthiness, such as credit history, purpose of using the credit card, credit amount, and job, among others; it contains 1000 instances. The second dataset is from Australia [3]. The attribute names and values in this dataset have been changed to meaningless symbols to protect the confidentiality of the data; there are 690 instances. The last dataset is from European cardholders in September 2013. It shows the transactions that occurred over two days, 284,807 in total, with 31 features. Twenty-eight features, V1 through V28, are numerical input variables resulting from a PCA transformation. The other three features that do
18 | P a g e
www.ijacsa.thesai.org
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 9, No. 1, 2018
not bind with PCA are “Time”, “Amount”, and “Class”. This experiment brings the three datasets together to compare receiver operating characteristic (ROC) curves and understand the performance of the binary classifiers.

II. RELATED WORK

Since their introduction in the financial segment, credit cards have become a popular payment method for online shopping for goods and services. From the beginning, fraudsters have tried to falsely adopt the normal behavior of users to make their own payments. Because of this, most research on credit card fraud detection has focused on pattern matching, in which abnormal patterns are identified as distinct from normal transactions. Many techniques for credit card fraud detection have been presented in the last few years; we briefly review some of them below.

The K-nearest neighbor (KNN) algorithm has been used to detect credit card fraud. It is a supervised learning technique: KNN classifies a new transaction by finding its nearest points, and if the new transaction's point is near a fraudulent transaction, KNN identifies the transaction as fraud [5]. Many people confuse KNN with K-means clustering, but the two techniques are different. K-means is an unsupervised learning technique used for clustering; it tries to determine new patterns by grouping the data into clusters. Conversely, KNN compares the k nearest neighbors to classify or predict a new transaction based on previous history. The distance between two data instances in KNN can be calculated with different methods, but mostly the Euclidean distance is used.

Outlier detection is another method, used in both supervised and unsupervised settings. A supervised outlier detection method studies and classifies outliers using a labeled training dataset. Conversely, unsupervised outlier detection is similar to clustering data into multiple groups based on their attributes. N. Malini and Dr. M. Pushpa note that outlier detection based on unsupervised learning is preferred for detecting credit card fraud over supervised outlier detection, because an unsupervised outlier method does not require prior information to label data as fraudulent; it only needs to be trained on normal transactions to discriminate between legal and illegal transactions [5].

Some credit card fraud transaction datasets suffer from class imbalance. Anusorn Charleonnan mentions that imbalanced datasets have many characteristics that emerge during classification. He uses RUS, a data sampling technique, to relieve the class imbalance problem by editing the class distribution of the training datasets. There are two major methods of adjusting imbalance in datasets: undersampling and oversampling. In his research, he also uses the MRN algorithm for the credit card fraud classification problem [6].

The main idea of the artificial neural network (ANN) is to mimic the learning algorithm of the human brain. The smallest unit of an ANN, called a perceptron, is represented as a node, and several perceptrons are connected as a network like the human brain. Each node has a weighted connection to several other nodes in the adjacent layer. A weight is simply a floating-point number, and it is adjusted when input eventually comes in to train the network. Inputs are passed from input nodes through hidden layers to output nodes, and each node can learn and adjust itself to become more accurate and appropriate.

The problem of credit card fraud detection has also been analyzed with the Chebyshev Functional Link Artificial Neural Network (CFANN), which consists of two components, functional expansion and learning. Mukesh Kumar Mishra and Rajashree Dash used CFANN to detect credit card fraud, comparing it with the MLP and the Decision Tree [7]. In an MLP, the topology is structured into a number of layers: the first layer is the input layer, the middle layers are the hidden layers (there can be more than one), and the last layer is the output layer. Feed-forward means that all information flows in the same direction, left to right, without recurrent links. A Decision Tree is a structured tree that has a root node and a number of internal and leaf nodes. Their paper compares the performance of CFANN, MLP, and Decision Tree. The results of their study suggest that MLP outperforms CFANN and Decision Tree in fraud detection, whereas CFANN makes more accurate predictions than the other two techniques [7].

Deep learning is a state-of-the-art technology in the present day, and most people in IT should follow it. First the ANN was introduced; deep learning later emerged as a subfield of ML that builds on neural networks. Deep learning has been used in many fields, such as image recognition at Facebook, speech recognition in Apple's Siri, and natural language processing in Google Translate. Yamini Pandey used deep learning with the H2O framework to discover complex patterns in a dataset. H2O is an open-source platform for predictive data analytics on Big Data, and supervised learning is based on predictive analytics. The author used an H2O-based multi-layered, feed-forward neural network to find credit card fraud patterns. H2O's deep learning model shows low error in the mean squared error, root mean squared error, mean absolute error, and root mean squared log error; because these errors are low, accuracy is enhanced, and the model's accuracy is correspondingly high [8].

Another concern, before issuing credit cards, is credit card analysis and judgement. Ayahiko Niimi used deep learning to judge whether a credit card should be issued to a user who satisfies particular criteria. Transaction judgement refers to validating a transaction's attributes before making decisions. To verify transactions, the author ran benchmark experiments based on deep learning and confirmed that deep learning achieves accuracy similar to the Gaussian kernel SVM. For the comparison, the author used five typical algorithms and changed the parameters of deep learning five times, including the activation function and dropout parameter [9].
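The KNN scheme described above (classify a new transaction by the labels of its nearest points under the Euclidean distance) can be sketched in a few lines of plain Python/NumPy. The toy transactions and the two features (amount, hour) are made up for illustration, not taken from the paper's datasets:

```python
import numpy as np

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote of its k nearest training points,
    using the Euclidean distance, as in the KNN description above."""
    dists = np.linalg.norm(train_X - x, axis=1)  # Euclidean distance to each row
    nearest = np.argsort(dists)[:k]              # indices of the k closest points
    votes = train_y[nearest]
    return int(np.round(votes.mean()))           # majority vote for 0/1 labels

# Toy transactions: [amount, hour]; label 1 = fraud, 0 = normal (made-up data).
X = np.array([[10.0, 9], [12.0, 10], [11.0, 11], [900.0, 3], [950.0, 4]])
y = np.array([0, 0, 0, 1, 1])

print(knn_predict(X, y, np.array([920.0, 3]), k=3))  # lands near the fraud cluster
```

With k = 3 the vote is decided by the two fraudulent points plus one normal point, so the new transaction is labeled fraud; a larger k would smooth the decision at the cost of sensitivity to small fraud clusters.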
Fraud Detection Techniques | Advantage | Disadvantage
Outlier Detection Method | Outlier detection detects credit card fraud with lower memory and computation requirements. This method works fast and well for large online datasets. | Outlier detection cannot find anomalies as accurately as other methods.
Deep Learning | A key advantage of deep learning is the analysis and learning of a massive amount of unsupervised data. It can extract complex patterns [13]. | Deep learning is currently used widely in image recognition; little information is available for other domains, and deep learning libraries do not cover all algorithms.

V. PROPOSED METHOD

In this paper, we use Keras [15], a high-level neural network API implemented in Python. The other package we use to implement the AE is H2O [16]. We use the H2O package to find the MSE, RMSE, and variable importance across each attribute of the datasets; conversely, we use Keras in parallel processing to get the AUC and the confusion matrix. Both frameworks were coded in Python on JupyterLab.

Before we could develop the AE program with the Keras API and code the AE program with H2O, the datasets needed to be cleansed. As we know, the German and Australian credit card datasets use classified characteristics for each attribute; the details of these attributes can be seen in [3], [4].

These are the steps of cleansing the data:

1) Classify the data into a number of classes, for example attribute 4 (qualitative), purpose:

A40: Car (new)
A41: Car (used)
A410: others

We transform these into class numbers, such as A40 = 1, A41 = 2, …, A410 = 10, and so on.

2) After obtaining the classification for each attribute, we transform those classifications with PCA using XLSTAT [14].

TABLE II. AUTOENCODER MODEL USING KERAS

Input_Dimension = Training.shape[1]
Hidden_Layer = 16
Input_layer = Input(shape=(Input_Dimension,))
Encoder1 = Dense(Hidden_Layer, activation="tanh")(Input_layer)
Encoder2 = Dense(Hidden_Layer // 2, activation="tanh")(Encoder1)
Encoder3 = Dense(Hidden_Layer // 4, activation="tanh")(Encoder2)
Decoder1 = Dense(Hidden_Layer // 4, activation="tanh")(Encoder3)
Decoder2 = Dense(Hidden_Layer // 2, activation="tanh")(Decoder1)
Decoder3 = Dense(Input_Dimension, activation="tanh")(Decoder2)
AutoEncoderModel = Model(inputs=Input_layer, outputs=Decoder3)

TABLE III. AUTOENCODER MODEL USING H2O

Autoencoder = h2o.estimators.deeplearning.H2OAutoEncoderEstimator(
    hidden=hidden_structure, epochs=200, activation='tanh', autoencoder=True)
Autoencoder.train(x=Input, training_frame=data_set)
print(Autoencoder)

In the Keras method, we designed six hidden layers, three encoders and three decoders, with the following units in each layer:

Input : 21 attributes (21 inputs)
Encoder1 (H1) : 16
Encoder2 (H2) : 8
Encoder3 (H3) : 4
Decoder1 (H4) : 4
Decoder2 (H5) : 8
Decoder3 (H6) : 21
Output : 21

As mentioned above, every hidden layer uses the "tanh" activation function. Keras provides many activation functions to choose from; based on our experiments, we used "tanh" because it achieves a high AUC. We divide the data into training and test sets of 80 and 20 percent, using normal transactions to predict fraudulent transactions.

An example of the Python code in Keras is shown in Table II. As you can see, with the Keras API we need to build the model by writing the commands ourselves. Conversely, in the H2O package we use the AE command shown in Table III.

Based on the methodology of our research, we coded in Python and then used the area under the curve (AUC) to measure the success rate of the model. If the AUC percentage is high, it means the unsupervised model achieves a high true positive rate. Conversely, datasets with a small amount of data get a higher false positive rate, because there is not much data to train on.

VI. EVALUATE THE RESULT

These are the results of the German dataset, shown in Fig. 6, 7 and 8. As mentioned above, the dataset was divided into training and test sets in a ratio of 80:20, using the transactions labeled normal in the column "Creditability" to find anomalous patterns; these form the AUC and confusion matrix.

The following is the MSE and RMSE of the German dataset from the H2O package.
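The detection step that follows AE training — score each transaction by its reconstruction error and flag the largest errors — can be sketched without Keras or H2O. The data below are synthetic and the one-component linear reconstruction fitted on normal rows is only a stand-in for the trained auto-encoder (the paper likewise trains on normal transactions only); it is an illustrative sketch, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 200 "normal" rows plus 5 injected anomalies
# with a very different profile (hypothetical, not the paper's datasets).
normal = rng.normal(0.0, 1.0, size=(200, 4))
anomalies = rng.normal(6.0, 1.0, size=(5, 4))
X = np.vstack([normal, anomalies])

# Stand-in for the trained AE: a 1-component linear reconstruction fitted
# on the normal rows only, so normal rows reconstruct well and anomalies poorly.
mu = normal.mean(axis=0)
_, _, Vt = np.linalg.svd(normal - mu, full_matrices=False)

def reconstruct(A):
    """Project rows onto the learned component and map back."""
    return (A - mu) @ Vt[:1].T @ Vt[:1] + mu

# Per-row mean squared reconstruction error, the quantity H2O/Keras report.
mse = ((X - reconstruct(X)) ** 2).mean(axis=1)

# Flag rows whose error is far above what the normal rows produce.
threshold = np.percentile(mse[:200], 99)
flagged = np.where(mse > threshold)[0]
print(len(flagged), "rows flagged")
```

The percentile threshold is one simple choice; in practice the cut-off is tuned on validation data, trading false positives against missed frauds.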
Fig. 6. AUC of German Dataset by using AE.

Fig. 8. AE Model of deep learning report of German Dataset on H2O framework.

Let us move on to another dataset, the Australian dataset. The AUC result and the confusion matrix from Keras are given; the results are shown in Fig. 9, 10 and 11.

Fig. 9. AUC of Australian Dataset by using AE.

Fig. 11. Auto Encoder Model deep learning report of Australian Dataset based on H2O framework.

This is the Australian dataset's MSE and RMSE obtained by running the H2O package.

Here, we move on to the large dataset, the European dataset with 284,807 transactions. The results are shown in Fig. 12, 13 and 14.

Fig. 14. Auto Encoder Model deep learning report of European Dataset based on H2O framework.

Summarizing the three datasets: there is less data in the German and Australian datasets, so when we look for anomalies in fraud detection we obtain a lower AUC, because we trained the systems on a small amount of data and validated on an even smaller test set. Conversely, when we apply this Keras-based AE model to a large amount of data, the European dataset, we get an AUC of 0.9603. The AE is suitable for large datasets.

Next, the RBM results on the three datasets are presented. We begin with the German dataset in Fig. 15: as you can see, the AUC of the German dataset is 0.4562. Fig. 16 shows the result of the Australian dataset using the RBM algorithm; the AUC score is 0.5238. The biggest dataset, the European dataset, produced an AUC value greater than the other two datasets shown above (the Australian and German datasets): its AUC score is 0.9505, as seen in Fig. 17.

This is the summary of the AUC scores of AE and RBM on the three different datasets.

From this research, we can conclude that AE and RBM produce a high AUC score and accuracy for bigger datasets,
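The AUC values compared above can be reproduced for any anomaly scorer with a small rank-based computation: the AUC equals the probability that a randomly chosen fraud is scored above a randomly chosen normal transaction (ties count half). The labels and scores below are made up to illustrate the computation:

```python
def auc_score(labels, scores):
    """ROC AUC: probability a random positive outscores a random negative,
    counting ties as half a win (equivalent to the Mann-Whitney U statistic)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up anomaly scores: frauds (label 1) mostly, but not always, score higher.
labels = [0, 0, 0, 0, 1, 1, 0, 1]
scores = [0.1, 0.2, 0.3, 0.35, 0.9, 0.8, 0.7, 0.4]
print(auc_score(labels, scores))  # 14 of 15 positive/negative pairs ordered correctly
```

An AUC near 0.5, like the RBM's 0.4562 on the German dataset, means the scorer orders fraud/normal pairs no better than chance, while values near 1.0, like 0.9505 on the European dataset, indicate almost perfect ranking.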