
MEDICAL INSURANCE COST PREDICTION SYSTEM

Dharesh Bahety
EN18EL301057
Under the Guidance of
Mr. Parag Ravekar Sir
Table of Contents
• Introduction
• Literature Review
• Objectives
• Methodology
• Tools Used in the Project
• Work Done Till Date
• Summary
• Reference
INTRODUCTION
Many people assume that medical insurance is expensive and therefore avoid buying it
for themselves and their families, even though it has become a necessity. This project
designs an automated system that predicts what a person's medical insurance cost
will be.

In classification, a learning algorithm maps the input data to a discrete output such
as True or False. In regression, a learning algorithm maps the input data to a
continuous output such as weight or cost.
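
The difference can be illustrated with a minimal Python sketch. Both functions and all numbers in them (the age threshold, the slope and the intercept) are made up for illustration and do not come from the project's data.

```python
def is_high_risk(age, threshold=50):
    """Classification: map the input to a discrete label (True or False)."""
    return age >= threshold


def estimate_cost(age, slope=250.0, intercept=1000.0):
    """Regression: map the input to a continuous value such as a cost."""
    return slope * age + intercept


print(is_high_risk(30))   # a discrete answer
print(estimate_cost(30))  # a continuous answer
```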

Since insurance cost is a continuous value, in this project I design a machine
learning system based on a linear regression model that learns from the data and
predicts what the cost will be.
LITERATURE REVIEW
In “Supervised Learning Methods for Predicting Healthcare Costs”, Morid, Kawamoto and co-authors identify
five methods of predicting medical insurance costs and evaluate the performance of each. Their data set
covered approximately 90,000 individuals, 6.3 million medical claims and 1.2 million pharmacy claims. In this
comparison, gradient boosting had the best predictive performance overall and was best suited to low- to
medium-cost individuals. For high-cost individuals, the highest performance was reported for the Artificial
Neural Network (ANN) and the Ridge regression model. The authors broadly classify the methods reported for
cost prediction into three kinds: rule-based, statistical and supervised learning. They identify as a
limitation that the study used a single data set from one particular region, and they propose as further work
exploring more advanced supervised learning methods such as deep learning and structure analysis.
OBJECTIVES
• To develop a one-stop system that predicts medical insurance cost quickly.
• To help achieve transparency in cost decisions, with no hidden or commission costs.
• To make it easy for individuals to plan their savings and investments for taking
medical insurance for themselves and their families.
Proposed Solution
• Using the system designed, a person can predict the medical insurance cost of
individuals of all ages.

• This will help motivate parents to insure their children from an early age, as
insurance prices are lower for infants and children than for adults.

• People are free to decide which policy they wish to buy and can estimate its cost,
which is important when deciding the coverage they wish to have.


Linear Regression Model
• Linear regression attempts to model the relationship between two variables by
fitting a linear equation to observed data. One variable is considered to be the
independent variable, and the other the dependent variable.
• Let us understand the algorithm behind linear regression through an example. Say
we have a data set of years of experience and salary per year, as shown in the
figure. We plot the data on a graph and try to determine the best fit (the blue
line in fig. 2).
• Since the graph is linear, the best fit will have the form Y = mX + c. The slope
is determined from two points on the line through the formula
m = (y2 – y1)/(x2 – x1). Here the slope is m = 200000 and the intercept is
c = 200000, so our equation for the best fit comes out to be
Y = 200000X + 200000 = 200000 (X + 1).
• Once we determine the slope and intercept, we need to design a function that
replicates the best fit and hence helps in prediction.
Fig: Example of linear regression curve
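
The example above can be sketched in Python. The data points below are hypothetical: the original figure is not reproduced here, so they were chosen only to lie on the line Y = 200000 (X + 1). The sketch computes the slope from two points, as in the formula above, and also by ordinary least squares, which is the usual way to fit a line when the points do not lie exactly on it. The function names are illustrative.

```python
# Hypothetical data: years of experience (X) and salary per year (Y).
years = [0, 1, 2, 3, 4]
salary = [200000, 400000, 600000, 800000, 1000000]


def two_point_slope(x1, y1, x2, y2):
    """Slope of the line through two points: m = (y2 - y1) / (x2 - x1)."""
    return (y2 - y1) / (x2 - x1)


def least_squares_fit(xs, ys):
    """Return (m, c) of the least-squares line Y = mX + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


def predict(x, m, c):
    """Replicate the best-fit line so it can be used for prediction."""
    return m * x + c


m, c = least_squares_fit(years, salary)
print(two_point_slope(years[0], salary[0], years[1], salary[1]))
print(m, c)
print(predict(5, m, c))  # predicted salary for 5 years of experience
```

Because the hypothetical points lie exactly on one line, both approaches give the same slope; on real, noisy data only the least-squares fit is well defined.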
METHODOLOGY

Insurance cost data → Data Analysis → Data Pre-processing → Train-Test Split → Model Design

Input data → Testing of trained model → Prediction of insurance cost
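
The flow above can be sketched end to end in plain Python. This is a dependency-free illustration under stated assumptions, not the project's code: the records are synthetic and noise-free, only a single feature (age) is used, and the helper names split_train_test, fit_line and r_squared are hypothetical.

```python
import random

# Synthetic (age, cost) records; a real run would load the insurance data instead.
records = [(age, 250.0 * age + 1000.0) for age in range(18, 65)]


def split_train_test(rows, test_fraction=0.2, seed=2):
    """Shuffle the rows and split them into (train, test) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


def fit_line(rows):
    """Least-squares fit for one feature; returns (slope, intercept)."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(rows, slope, intercept):
    """Coefficient of determination of the fitted line on the given rows."""
    mean_y = sum(y for _, y in rows) / len(rows)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in rows)
    ss_tot = sum((y - mean_y) ** 2 for _, y in rows)
    return 1.0 - ss_res / ss_tot


train, test = split_train_test(records)         # train-test split
slope, intercept = fit_line(train)              # model design and training
score = r_squared(test, slope, intercept)       # testing of the trained model
new_age = 30                                    # new input data
predicted_cost = slope * new_age + intercept    # prediction of insurance cost
print(len(train), len(test), score, predicted_cost)
```

Evaluating on rows the model has not seen during training is the reason for the split: a score computed on the training rows alone would overstate how well the model predicts new inputs.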
Tools Used in the Project
• Python 3: Python is an interpreted, high-level, general-purpose programming language.
Its language constructs and object-oriented approach aim to help programmers write
clear, logical code for small and large-scale projects.

• Google Colab: Colaboratory, or “Colab”, is a product from Google Research.
Colab allows anybody to write and execute arbitrary Python code through the
browser, and is especially well suited to machine learning, data analysis and
education.
Experimentation
SUMMARY

I collected the raw data set and uploaded the .csv file to the IDE. I then imported the dependencies,
that is, the libraries and functions needed: the libraries are numpy, pandas, matplotlib and seaborn,
and the scikit-learn utilities used are train_test_split and LinearRegression. The collected data was
then analysed and plotted, for example as the distributions of age, gender and BMI mentioned above.

Next I carried out data pre-processing, which makes the raw data compatible with the machine learning
algorithm, and split the data into training and testing data. The model was trained on the training
data, which makes it a trained model, and its performance was evaluated on the test data. The trained
model now gives an estimated insurance cost as output based on the input data.
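
The pre-processing step can be illustrated with a small sketch. It assumes a file with columns named age, gender, bmi and cost; the column names, the rows and the 0/1 encoding are illustrative assumptions. The project lists pandas among its libraries, whereas this sketch uses only the standard csv module so that it stays self-contained.

```python
import csv
import io

# Made-up rows standing in for the uploaded .csv file.
RAW_CSV = """age,gender,bmi,cost
25,female,22.5,3000.0
40,male,30.1,7500.0
33,female,27.0,5200.0
"""

# Assumed encoding of the text column into numbers.
GENDER_CODES = {"male": 0, "female": 1}


def preprocess(csv_text):
    """Convert raw CSV text into numeric feature rows and a target list."""
    features, targets = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        features.append([
            float(row["age"]),
            float(GENDER_CODES[row["gender"]]),
            float(row["bmi"]),
        ])
        targets.append(float(row["cost"]))
    return features, targets


X, y = preprocess(RAW_CSV)
print(X[0], y[0])
```

A regression model works on numbers only, which is why the text column has to be mapped to numeric codes before training.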
Conclusion

The designed system can be used to predict the medical insurance cost of an
individual from their input parameters. The model gave high accuracy in this
project and is therefore a good candidate for adoption in the health care and
insurance sectors.
REFERENCE
• Demsar J. “Statistical comparisons of classifiers over multiple data sets”. The Journal of Machine
Learning Research. 2006;7:1–30.
• Mohammad Amin Morid, Kensaku Kawamoto, Travis Ault, Josette Dorius, Samir Abdelrahman.
“Supervised Learning Methods for Predicting Healthcare Costs”. David Eccles School of Business,
University of Utah. PMCID: PMC5977561. 2020.
• Duncan I., Loginov M., Ludkovski M. “Testing Alternative Regression Frameworks for Predictive
Modeling of Health Care Costs”. North American Actuarial Journal. 2019.
• Pradeep kr, Naveen Aradhya. “A Collective Study of Machine Learning (ML) Algorithms with Big
Data Analytics (BDA) for Healthcare Analytics (HcA)”. International Journal of Emerging Trends.
2018.
THANK YOU
