
Predicting Car Insurance Frauds

Project Proposal

Francesco Buzzi Poongkundran Thamaraiselvan Roshan Velpula

b00804356@essec.edu b00798732@essec.edu b00802760@essec.edu

Introduction

Vehicle safety is a significant field of research, not only for the automotive industry but also for insurance companies. Insurers have a strong interest in avoiding undue compensation, especially when dealing with fraudulent claims. False claims are a vast and costly problem for these firms, and preventing their payment is crucial to avoid financial losses amounting to billions of dollars every year.

Thus, being able to predict whether a claim is legitimate is a fundamental task that can be tackled with machine learning and data. A quick, automated process for detecting fraud could give insurance companies an essential advantage against fraudsters, replacing traditional and complex methods that often lead to inaccurate results. Moreover, similar approaches can be applied to other kinds of fraud detection, for example in bank transactions or online shopping.

Methodology

The dataset we have decided to use is publicly available on Kaggle. Each observation represents a claim, with information about the day, the vehicle, the people involved, the policy and the accident itself. In addition, each observation carries a binary label, fraud or not fraud (1 and 0 respectively), which will be our target variable.

We plan to divide our project into three main parts. The first part will be exploratory analysis, to visualize the data and better understand the features. The second will use simple machine learning models and techniques (Logistic Regression, Regularization, Feature Selection, Classification Trees) to understand which features are most influential for prediction and how they relate to the target variable. Finally, we will develop our final model, using Random Forest or Gradient Boosted Trees, to make the predictions and evaluate the best fit.

Moreover, since the dataset has a skewed class distribution (far more observations are labelled as not fraud than as fraud), we might consider techniques designed for imbalanced datasets, such as under-sampling, which shrinks the training set by removing some not-fraud observations, and assess whether they are useful.

Evaluation

The evaluation will be carried out on a few final models, using a test set and considering confusion matrices and their associated measures (accuracy, F-measure, etc.) as well as ROC curves. Furthermore, we will consider decision thresholds other than the standard 0.5 used for classification tasks. Applied to the predicted probability of fraud, a lower threshold should correctly classify more frauds, at the cost of misclassifying some not-fraud observations.
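As an illustration of the under-sampling idea mentioned in the Methodology, here is a minimal sketch in plain Python. The function name `undersample`, the `ratio` parameter and the toy data are assumptions made for this example only; they do not come from the Kaggle dataset or from any specific library, and the project may well use a different implementation.

```python
import random


def undersample(rows, labels, ratio=1.0, seed=0):
    """Randomly drop not-fraud rows (label 0), keeping every fraud row (label 1).

    `ratio` is the desired number of not-fraud rows per fraud row
    (hypothetical parameter, chosen for illustration).
    """
    rng = random.Random(seed)
    fraud_idx = [i for i, y in enumerate(labels) if y == 1]
    legit_idx = [i for i, y in enumerate(labels) if y == 0]
    # Never ask for more majority rows than actually exist.
    n_keep = min(len(legit_idx), int(round(ratio * len(fraud_idx))))
    kept = sorted(fraud_idx + rng.sample(legit_idx, n_keep))
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy training set (made up): 5 fraud claims among 100 observations.
labels = [1] * 5 + [0] * 95
rows = [{"claim_id": i} for i in range(100)]

X_bal, y_bal = undersample(rows, labels, ratio=1.0)
print(sum(y_bal), len(y_bal) - sum(y_bal))  # fraud count, not-fraud count
```

Under-sampling of this kind should be applied to the training set only, so that the test set keeps the original class distribution and the evaluation remains representative.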

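The effect of the decision threshold discussed under Evaluation can be illustrated with a short sketch that recomputes the confusion matrix and its associated measures at two thresholds. The helper names `confusion_counts` and `metrics`, the scores and the labels are invented for this example; the scores are assumed to be predicted probabilities of fraud.

```python
def confusion_counts(y_true, scores, threshold):
    """Return (tp, fp, tn, fn) when a claim is flagged as fraud if score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F-measure derived from the confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up test set: 3 frauds out of 10 claims, with predicted fraud probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.35, 0.4, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]

for threshold in (0.5, 0.3):
    counts = confusion_counts(y_true, scores, threshold)
    print(threshold, counts, metrics(*counts))
```

On this toy data, moving the threshold from 0.5 to 0.3 recovers the fraud that was missed but flags two legitimate claims, which is the trade-off the evaluation will need to quantify on the real test set.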