
Predicting Money Laundering Sanctions using Machine Learning Algorithms and Artificial Neural Networks

Mark Lokanan (mark.lokanan@royalroads.ca)
Royal Roads University

Research Article

Keywords: Sanctions, Anti-Money Laundering, Machine Learning, Artificial Neural Network, Basel

DOI: https://doi.org/10.21203/rs.3.rs-2511798/v1

License: This work is licensed under a Creative Commons Attribution 4.0 International License.
1. Introduction
The Basel Anti-Money Laundering (AML) Index combines Financial Action Task Force (FATF) and World Bank statistics to analyze country-specific money laundering risks. The Basel Institute on Governance ("Basel Institute") created the AML Index to combat money laundering and terrorist financing (ML/TF). This paper aims to predict Basel money laundering

sanctions using machine learning (ML) and artificial intelligence (AI) algorithms. This aim is

decomposed into two objectives: (1) predict the likelihood of AML sanctions using risk

indicators from Basel and the World Bank, and (2) find the best predictors for forecasting the

likelihood of sanctions using ML and AI algorithms.

This paper makes two contributions to the literature on money laundering in finance.

Firstly, this paper shows how ML and AI models can be employed to fight money laundering by

identifying the important features in a set of financial and economic indicators. Secondly, many countries have AML laws to detect and prevent ML/TF, but these regulations are often weakly enforced, which limits their value. AML teams have traditionally fine-tuned parameters and risk models by hand to improve detection, an approach that makes compliance less agile and enforcement more difficult. ML and AI can address these problems by using data to identify risk factors, rank ML/TF risks, and make predictions.

While there is a vast literature on the application of ML and AI algorithms to address

classification problems in financial crime detection, the literature on the use of these algorithms

to detect money laundering is still limited (Canhoto, 2021; Chen et al., 2018; Jullum et al., 2020;

Shou et al., 2021). The literature that has been assembled mostly focuses on statistical data

mining techniques to conduct fraud risk profiling and flag suspicious money laundering

transactions (Ferwerda et al., 2019; King et al., 2018; Lokanan, 2019; Sudjianto et al., 2010).

Even though there is a rich scholarship on the application of ML and AI to predict money

laundering transactions, this stream of research has mostly concentrated on data from banks and

financial institutions (Lokanan, 2022; Jullum et al., 2020; Zhang and Trubey, 2019) and is often descriptive and suggestive, with few details on the models and their performance metrics (Bao et al., 2022; Zhang and Trubey, 2019).

This paper is different for several reasons. First, it uses the same ML and AI-based

techniques to predict countries on the Basel AML sanction blocklist. Second, this project

measures the risks of ML/TF using indicators of countries' adherence to anti-money laundering

and combating the financing of terrorism (AML/CFT) regulations that were collated from the FATF, Transparency International, the World Bank, and the World Economic Forum.

2. Research Methodology

2.1. Sample and Data Collection

The Basel AML Index is a prominent source of data for this project. Established by the

Basel Institute on Governance in 2012, it is designed to measure the effectiveness of AML/CFT

systems in place across 203 countries. Global economic and development performance data were

retrieved from the World Bank portal. The World Bank data provided more financial and

economic information about the countries and improved the predictive power of the ML and AI

algorithms. The data cover the year 2021.

2.2. Variables and Measurements

Table 1 presents a list of independent variables (i.e., features) representing various

economic and risk metrics for evaluating a country's AML performance. These variables were

selected for two reasons. Firstly, they comprise the risk scores contributing to the Basel AML

index of high-risk ML/TF across countries (see Basel, 2021). Secondly, the World Bank data

consists of standalone risk evaluation solutions and can be employed for independent

benchmarking to validate in-house ML/TF risk assessments (see Jullum et al., 2020; Lokanan,

2022). Taken together, these variables contribute to the factors that predict the probability of a

country being included on the Basel AML sanction list.

Table 1: Independent Variables

Variable | Data Source | Measure
Country | Basel | Continuous
Annual_Inflation | World Bank | Continuous
Unemployment_rate | World Bank | Continuous
Total_reserves | World Bank | Continuous
Bank_nonperforming_loans | World Bank | Continuous
Bank_liquid_reserves_to_assets_ratio | World Bank | Continuous
ML_TF_risk | Basel | Continuous
Corruption_Risk | Basel | Continuous
Bribery_risk | Basel | Continuous
Public_transparency_accountability | Basel | Continuous
Political_legal_risk | Basel | Continuous
Income_group | World Bank | Categorical
Jurisdiction | Basel | Categorical
Lending_category | World Bank | Categorical

The dependent variable is whether a country was sanctioned, that is, whether it ended up on the Basel AML blocklist. Sanctions were coded as 0 (no) when the country was not sanctioned and 1 (yes) when it was.
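For illustration, a minimal sketch of how the data could be assembled and the target and categorical features encoded is shown below. The file names, column labels, imputation step, and train/test split are assumptions for illustration only, not the study's actual pipeline.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical file names; the study's source files are not published.
basel = pd.read_csv("basel_aml_index_2021.csv")
world_bank = pd.read_csv("world_bank_indicators_2021.csv")

# Merge the two sources on the country identifier.
df = basel.merge(world_bank, on="Country", how="left")

# Simple median imputation for missing indicator values (an assumption; the
# paper does not describe how missing World Bank data were handled).
numeric = df.select_dtypes("number").columns
df[numeric] = df[numeric].fillna(df[numeric].median())

# Binary target: 1 = country appears on the Basel AML sanction list, 0 = it does not.
df["Sanctioned"] = df["Sanctioned"].map({"no": 0, "yes": 1})

# One-hot encode the categorical features listed in Table 1.
df = pd.get_dummies(df, columns=["Income_group", "Jurisdiction", "Lending_category"])

X = df.drop(columns=["Sanctioned", "Country"])  # drop the target and the country label
y = df["Sanctioned"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)
```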

2.3. SMOTE+ENN

This project involves working with an imbalanced dataset. Class imbalance occurs when

one label of the target variable has far more instances than the other, leading to an imbalance in

the model's ability to generalize on unseen data (Lokanan, 2022). Countries sanctioned account

for 27% of the data, while countries not sanctioned account for 73%. To address the class

imbalance problem, the synthetic minority over-sampling (SMOTe) and edited nearest neighbour

(ENN) algorithms were used to up-sample the data. SMOTe is a technique that helps to reduce

bias towards the minority class in a dataset by allowing the algorithm to better learn from and

classify elements from the data. The ENN algorithm aims to maximize accuracy and diversity in

the data by generating synthetic instances from existing observations while removing, if

necessary, those that could contaminate the minority class sample (Lokanan, 2022).
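A minimal sketch of this resampling step with the imbalanced-learn library is given below; the training split and random seed are assumptions carried over from the sketch in Section 2.2.

```python
from collections import Counter

from imblearn.combine import SMOTEENN

# Apply SMOTE over-sampling followed by ENN cleaning to the training split only,
# so the test split keeps the original 27%/73% class distribution.
resampler = SMOTEENN(random_state=42)
X_resampled, y_resampled = resampler.fit_resample(X_train, y_train)

print("Before:", Counter(y_train))
print("After: ", Counter(y_resampled))
```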

2.4. Algorithms Considered

Some of the more representative classifiers used in money laundering prediction studies

are linear and probabilistic algorithms such as logistic regression, support vector machine

(SVM), and Naive Bayes (Cerulli, 2021; Chen et al., 2018; Zhang and Trubey, 2019). Others

have employed non-linear and more complex algorithms, namely ANN and tree-based

classifiers, to provide new insights into classification and predictive money laundering problems

(Chen et al., 2018; Jullum et al., 2020; Zhang and Trubey, 2019). ML and AI-based algorithms

can apply pre-set risk criteria to train, test, and validate the models on unseen data (Lokanan,

2022). These advanced ML and AI techniques can recognize patterns in large volumes of data

across various formats, making them highly effective for uncovering risks and abnormal

transactions (Bao et al., 2022; Jullum et al., 2020). Their ability to continually train and improve

through reinforcement learning allows these algorithms to become even more reliable over time

(Chen et al., 2018). Consequently, ML and AI methods are crucial for consolidating and

advancing risk profiling and money laundering detection research.

This paper implemented various algorithms to assess performance. Logistic regression,

SVM, and Naive Bayes classifiers were employed to capture linear relationships in the data. Random

forest, gradient descent, and ANN algorithms were used to approximate non-linear patterns

within the data (Breiman, 2001; Chen et al., 2018; Zhang and Trubey, 2019). These algorithms

and their variations allow for robust data analysis while limiting potential bias and unreasonable assumptions (Cerulli, 2021). Appendix 1 shows the parameters used to optimize and tune the

models.
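The sketch below illustrates how the candidate classifiers could be instantiated in scikit-learn. The hyperparameter values follow Appendix 1 where stated; everything else, including the training-data names, is an assumption.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

models = {
    "Logistic regression": LogisticRegression(C=1.0, class_weight={0: 1, 1: 1},
                                              solver="lbfgs", max_iter=100),
    "Random Forest": RandomForestClassifier(n_estimators=200, max_features="log2",
                                            max_depth=7, criterion="gini"),
    "SVM": SVC(kernel="linear", C=1.0, probability=True),
    "Naive Bayes": GaussianNB(),
    # "Gradient descent" in this paper corresponds to an SGD-trained linear
    # classifier with hinge loss (see Appendix 1).
    "Gradient descent": SGDClassifier(loss="hinge", penalty="l2", alpha=0.01,
                                      class_weight="balanced"),
}

# Fit each classifier on the resampled training data from Section 2.3.
for name, model in models.items():
    model.fit(X_resampled, y_resampled)
```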

2.5. Classification Metrics and Performance

The confusion matrix is used to calculate the performance of ML and ANN models

(Lokanan, 2022). The confusion matrix for binary classification models comprises four classes:

True Positive (TP): The algorithm predicts the country will be sanctioned, and the country is sanctioned.
True Negative (TN): The algorithm predicts the country will not be sanctioned, and the country is not sanctioned.
False Positive (FP): The algorithm predicts the country will be sanctioned, but the country is not sanctioned.
False Negative (FN): The algorithm predicts the country will not be sanctioned, but the country is sanctioned.

Table 2 presents information on the evaluation metrics used in this study. The most

important criterion for benchmarking the performance of ML and ANN models is accuracy;

however, because of the imbalanced dataset used in this study, balanced accuracy (BAC)

provides a more comprehensive evaluation of model performance. Other approaches to comparing forecast accuracy allow for non-Gaussian, nonzero-mean, and serially and contemporaneously correlated errors (Diebold and Mariano, 2002). In more recent applications, population estimates and model coefficients have been used to evaluate and improve performance predictions (Clark and McCracken, 2013). While it has limitations, such as sensitivity to sample size and to the length of data used to estimate the model, the Diebold–Mariano test offers a viable way to compare the forecasting accuracy of two predictive models without making assumptions about the underlying data distribution (Diebold, 2015).

With imbalanced datasets, accuracy is not the best metric for classification problems because it ignores FPs and FNs (Cerulli, 2021; Lokanan, 2022). Sensitivity, specificity, precision, F-scores, and the receiver operating characteristic (ROC) curve are better metrics for imbalanced datasets (Shou et al., 2021). Sensitivity, or the True Positive Rate (TPR), measures the proportion of actual positive cases the model identifies correctly. Specificity, or the True Negative Rate (TNR), evaluates the proportion of actual negative cases accurately categorized as TN. The precision ratio measures the model's accuracy in predicting positive classes, and the F-measure combines precision and sensitivity/recall. The ROC curve, summarized by the area under the curve (AUC) as the AUROC score, is more robust to imbalanced data.

Table 2: Evaluation Metrics

Metric | Formula
Accuracy | (TP + TN) / (TP + FP + FN + TN)
Balanced Accuracy | (TP / (TP + FN) + TN / (TN + FP)) / 2
ROC Area Under the Curve (AUC) | Plots the TPR against the FPR and summarizes the curve as a single score
Sensitivity | TP / (TP + FN)
Specificity | TN / (TN + FP)
Precision | TP / (TP + FP)
F-measure | 2 * (Precision * Recall) / (Precision + Recall)
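A minimal sketch of how the Table 2 metrics can be computed for one fitted classifier follows; the fitted model and the held-out test split are assumptions carried over from the earlier sketches.

```python
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             confusion_matrix, f1_score, precision_score,
                             recall_score, roc_auc_score)

model = models["Logistic regression"]
y_pred = model.predict(X_test)

# confusion_matrix returns [[TN, FP], [FN, TP]] for labels {0, 1}.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

print("Accuracy:          ", accuracy_score(y_test, y_pred))
print("Balanced accuracy: ", balanced_accuracy_score(y_test, y_pred))
print("Sensitivity/recall:", recall_score(y_test, y_pred))
print("Specificity:       ", tn / (tn + fp))
print("Precision:         ", precision_score(y_test, y_pred))
print("F-measure:         ", f1_score(y_test, y_pred))
# AUROC needs probability estimates, which this model exposes via predict_proba.
print("AUROC:             ", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```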

3. Empirical Results

3.1. Summary Statistics

Table 3 presents the summary statistics of the numerical features. Countries with high unemployment rates and low total reserves are likelier to be on the Basel sanctions list than others. Basel assigns a risk score ranging from 0 (low) to 10 (high) (see Basel, 2021). The average overall risk score across countries was 5.4. The ML/TF risk scores had a mean of 5.7 and a standard deviation of 1.4, indicating significant variation between countries. Over 75% of the sanctioned countries' scores exceed this mean value. Regarding financial transparency, about 75% of the sanctioned nations scored higher on the risk scale (6.95) than the overall average of 5.5.

Table 3: Summary Results

Variable | count | mean | std | min | 25% | 50% | 75% | max
Annual_Inflation | 188 | 18.88 | 121.58 | -0.80 | 2.00 | 3.55 | 6.04 | 1588.50
Unemployment_rate | 203 | 8.52 | 6.39 | 0.26 | 4.11 | 6.43 | 11.06 | 36.00
Total_reserves_in_Billion | 180 | 84.12 | 304.93 | 0.02 | 1.50 | 7.37 | 41.96 | 3427.93
Bank_nonperforming_loans | 42 | 5.42 | 5.54 | 0.64 | 2.08 | 3.86 | 6.66 | 31.72
Bank_liquid_reserves_to_assets_ratio | 131 | 25.82 | 21.95 | 0.23 | 11.94 | 19.78 | 30.21 | 121.09
Overall_Score | 203 | 5.44 | 1.29 | 2.35 | 4.55 | 5.24 | 6.53 | 8.62
ML_TF_Risk | 203 | 5.68 | 1.36 | 2.44 | 4.71 | 5.50 | 6.72 | 8.81
Corruption_Risk | 193 | 5.14 | 1.90 | 0.60 | 3.64 | 5.47 | 6.53 | 9.20
Bribery_risk | 192 | 4.79 | 1.90 | 0.51 | 3.42 | 5.03 | 6.09 | 10.00
Financial_Transparency_Standards | 189 | 5.47 | 2.10 | 1.85 | 3.83 | 5.13 | 6.95 | 10.00
Public_Transparency_Accountability | 176 | 4.80 | 2.31 | 0.65 | 2.85 | 4.72 | 6.43 | 10.00
Political_Legal_Risk | 194 | 4.62 | 1.92 | 0.67 | 3.17 | 4.90 | 5.91 | 9.37
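Summary statistics of this form can be reproduced with pandas, as in the sketch below; the dataframe and column labels are the assumptions introduced in Section 2.2.

```python
numeric_cols = [
    "Annual_Inflation", "Unemployment_rate", "Total_reserves_in_Billion",
    "Bank_nonperforming_loans", "Bank_liquid_reserves_to_assets_ratio",
    "Overall_Score", "ML_TF_Risk", "Corruption_Risk", "Bribery_risk",
    "Financial_Transparency_Standards", "Public_Transparency_Accountability",
    "Political_Legal_Risk",
]

# describe() returns count, mean, std, min, 25%, 50%, 75%, and max per column;
# run it on the raw merged dataframe (before any imputation) to reproduce the counts.
summary = df[numeric_cols].describe().T
print(summary.round(2))
```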

3.2. Model Performance

3.2.1 Accuracy

One of the hallmarks of an overfitted model is a large discrepancy between the training and testing accuracy scores. As seen in Table 4, except for the ANN

model, there are no significant discrepancies in the training and testing scores for the other

algorithms. The logistic regression (84%) and SVM (84%) models scored highest in accuracy,

indicating they both performed exceptionally well in predicting sanctions. Additionally, the

logistic regression classifier had a slightly higher BAC (85%) than the SVM (84%), further

solidifying its superiority among these models. Together, these results suggest that logistic

regression and SVM can be effective tools for predicting money laundering sanctions.

Table 4: Performance Accuracy

Algorithm | Training Accuracy | Testing Accuracy | Training BAC | Testing BAC
Logistic regression | 0.81 | 0.84 | 0.80 | 0.85
Random Forest GridSearchCV | 0.72 | 0.69 | 0.72 | 0.69
SVM | 0.81 | 0.84 | 0.81 | 0.84
Naïve Bayes | 0.75 | 0.71 | 0.75 | 0.71
Gradient descent | 0.82 | 0.81 | 0.97 | 0.82
ANN | 0.97 | 0.80 | 0.97 | 0.81
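The overfitting check described above amounts to comparing training and testing scores for each fitted model. A minimal sketch, assuming the fitted models and data splits from the earlier sketches, is:

```python
from sklearn.metrics import accuracy_score, balanced_accuracy_score

for name, model in models.items():
    train_acc = accuracy_score(y_resampled, model.predict(X_resampled))
    test_acc = accuracy_score(y_test, model.predict(X_test))
    test_bac = balanced_accuracy_score(y_test, model.predict(X_test))
    # A large positive train-test gap is a sign of overfitting.
    print(f"{name}: train={train_acc:.2f}  test={test_acc:.2f}  "
          f"test BAC={test_bac:.2f}  gap={train_acc - test_acc:+.2f}")
```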

3.2.2 Model Performance with Classification Metrics

Table 5 presents the evaluation metrics for money laundering sanctions. The ANN (84%),

followed by the logistic regression (81%) models, achieved the highest sensitivity scores,

meaning they performed well at identifying the countries that Basel sanctioned. Logistic regression

and SVM had the highest specificity score (88%) when classifying countries that were not

sanctioned (TN). Furthermore, when identifying which countries were sanctioned, the logistic

regression (88%) and the SVM (87%) models had the highest precision scores. Likewise, logistic

regression (84%) and SVM (83%) had the best F-scores for predicting sanctions.

Table 5: Performance Accuracy of the Classification Algorithms

Algorithm | Precision | Recall/Sensitivity | Specificity | F-measure
Logistic regression | 0.88 | 0.81 | 0.88 | 0.84
Random Forest GridSearchCV | 0.65 | 0.79 | 0.58 | 0.71
SVM | 0.87 | 0.79 | 0.88 | 0.83
Naïve Bayes | 0.70 | 0.72 | 0.70 | 0.71
Gradient descent | 0.83 | 0.79 | 0.84 | 0.81
ANN | 0.78 | 0.84 | 0.77 | 0.81

3.2.3. ROC Curve

The ROC curve is robust to class distribution (Chen et al., 2018; Lokanan, 2022). As shown in Figure 1, the AI sequential ANN model outperformed all other classifiers with an AUROC score of 90%. Of the ML classifiers, the SVM (87%), gradient descent (86%), and logistic regression (85%) models had the highest AUROC scores in predicting sanctions. In this case, the AUROC provides insight into the effectiveness of each classifier in ranking countries by their risk of being sanctioned for AML violations.
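An AUROC comparison of this kind can be produced with scikit-learn and matplotlib, as in the sketch below; the fitted models and test split are assumptions carried over from the earlier sketches.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_auc_score, roc_curve

plt.figure(figsize=(7, 6))
for name, model in models.items():
    if not hasattr(model, "predict_proba"):
        continue  # e.g., the hinge-loss SGD classifier has no probability estimates
    scores = model.predict_proba(X_test)[:, 1]
    fpr, tpr, _ = roc_curve(y_test, scores)
    plt.plot(fpr, tpr, label=f"{name} (AUC = {roc_auc_score(y_test, scores):.2f})")

plt.plot([0, 1], [0, 1], linestyle="--", color="grey", label="Chance")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate (sensitivity)")
plt.legend(loc="lower right")
plt.show()
```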

Figure 1: AUROC Scores

3.2.4. Feature importance

Figure 2 indicates which features of a country are most relevant for predicting sanctions. Financial transparency, political and legal risks, unemployment rate, and ML/TF risks are the top indicators that a nation will end up on the Basel AML sanction list. While inflation, public transparency and accountability, and total reserves are significant predictors, they play lesser roles than the top four. These features offer invaluable insights for detecting potential hot spots and forecasting countries that face an increased chance of receiving sanctions.
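A ranking of this kind can be approximated with permutation importance, which works for any fitted classifier; the choice of model, scoring metric, and repeat count in the sketch below are assumptions rather than the study's exact procedure.

```python
from sklearn.inspection import permutation_importance

result = permutation_importance(models["Logistic regression"], X_test, y_test,
                                scoring="balanced_accuracy", n_repeats=30,
                                random_state=42)

# Sort features by mean importance (the drop in balanced accuracy when permuted).
ranking = sorted(zip(X_test.columns, result.importances_mean),
                 key=lambda pair: pair[1], reverse=True)
for feature, importance in ranking:
    print(f"{feature}: {importance:.3f}")
```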

Figure 2: Variable Importance

4. Conclusion
Altogether, these findings demonstrate that money laundering sanctions can be accurately

predicted using ML and AI algorithms, with logistic regression and SVM being the best-

performing models. Notably, the AI sequential model performed very well in this regard; the

sensitivity scores and AUROC measures showed that the ANN classifier successfully predicted

countries at risk of money laundering sanctions from Basel. Financial transparency, political and

legal risks, the unemployment rate, and factors associated with ML/TF risks were among the

strongest predictors of sanctions. Future research could investigate other algorithms using

different datasets to identify money laundering risks and allow for timely interventions.

Appendix 1: Hyperparameters used to tune the ML and AI algorithms

Algorithm | Hyperparameters
Logistic Regression | C = 1.0 (default regularisation strength); class_weight = {0: 1, 1: 1} (all classes have weight 1); solver = lbfgs; max_iter = 100
Random Forest | n_estimators = 200; max_features = log2; max_depth = 7; criterion = gini
SVM | kernel = linear; C = 1.0 (default); probability = True
Naive Bayes | defaults (no tuning)
Gradient Descent | loss = hinge; penalty = l2; alpha = 0.01; class_weight = balanced
ANN | activation = relu; loss = cross-entropy; optimizer = adam
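For reference, a minimal Keras sketch consistent with the ANN settings above (relu activations, cross-entropy loss, Adam optimizer) is shown below; the layer sizes, epochs, and batch size are assumptions, as the paper does not report the full architecture.

```python
from tensorflow import keras
from tensorflow.keras import layers

ann = keras.Sequential([
    layers.Input(shape=(X_resampled.shape[1],)),
    layers.Dense(32, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # probability of being sanctioned
])

# Binary cross-entropy loss with the Adam optimizer, per Appendix 1.
ann.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
ann.fit(X_resampled, y_resampled, validation_split=0.2, epochs=100,
        batch_size=16, verbose=0)
```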

References

Basel Institute on Governance. (2021). Basel AML Index. https://baselgovernance.org/taxonomy/term/483

Bao, Y., Hilary, G., and Ke, B. (2022). Artificial intelligence and fraud detection. In Innovative
technology at the interface of finance and operations (pp. 223-247). Springer, Cham.

Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/a:1010933404324

Canhoto, A. I. (2021). Leveraging machine learning in the global fight against money laundering
and terrorism financing: An affordances perspective. Journal of business research, 131,
441-452, https://doi.org/10.1016/j.jbusres.2020.10.012

Cerulli, G. (2021). Improving econometric prediction by machine learning. Applied Economics Letters, 28(16), 1419-1425. https://doi.org/10.1080/13504851.2020.1820939

Clark, T., & McCracken, M. (2013). Advances in forecast evaluation. Handbook of economic
forecasting, 2, 1107-1201.

Chen, Z., Van Khoa, L. D., Teoh, E. N., Nazir, A., Karuppiah, E. K., and Lam, K. S. (2018).
Machine learning techniques for anti-money laundering (AML) solutions in suspicious
transaction detection: a review. Knowledge and Information Systems, 57(2), 245-285.

Diebold, F. X., & Mariano, R. S. (2002). Comparing predictive accuracy. Journal of Business & Economic Statistics, 20(1), 134-144. https://doi.org/10.1198/073500102753410444

Diebold, F. X. (2015). Comparing predictive accuracy, twenty years later: A personal perspective on the use and abuse of Diebold–Mariano tests. Journal of Business & Economic Statistics, 33(1), 1-1.

Ferwerda, J., Deleanu, I. S., & Unger, B. (2019). Strategies to avoid blacklisting: The case of
statistics on money laundering. PloS one, 14(6), e0218532.

Jullum, M., Løland, A., Huseby, R. B., Ånonsen, G., and Lorentzen, J. (2020). Detecting money
laundering transactions with machine learning. Journal of Money Laundering Control,
23(1), 173-186.

King, C., Walker, C., & Gurulé, J. (Eds.). (2018). The Palgrave handbook of criminal and
terrorism financing law. Cham: Palgrave Macmillan.

Lokanan, M. E. (2022). Predicting Money Laundering Using Machine Learning and Artificial
Neural Networks Algorithms in Banks. Journal of Applied Security Research, 1-25.
https://doi.org/10.1080/19361610.2022.2114744

Lokanan, M. E. (2019). Data mining for statistical analysis of money laundering transactions.
Journal of Money Laundering Control, 22(4), 753-763.

Shou, M., Bao, X., and Yu, J. (2021). An optimal weighted machine learning model for detecting financial fraud. Applied Economics Letters, 1-6. https://doi.org/10.1080/13504851.2021.1989367

Sudjianto, A., Nair, S., Yuan, M., Zhang, A., Kern, D., & Cela-Díaz, F. (2010). Statistical
methods for fighting financial crimes. Technometrics, 52(1), 5-19.

Zhang, Y., & Trubey, P. (2019). Machine learning and sampling scheme: An empirical study of
money laundering detection. Computational Economics, 54(3), 1043-1063.

