Professional Documents
Culture Documents
4-R Code and PPT - Predicting Medical Expenses Using Linear Regression - New Without Prerequsit
4-R Code and PPT - Predicting Medical Expenses Using Linear Regression - New Without Prerequsit
4-R Code and PPT - Predicting Medical Expenses Using Linear Regression - New Without Prerequsit
Predictio
n
Line Medi
ar cal
Reg. Exp.
https://drive.google.com/file/d/
1QN5TSG3pJHo8v05gVsPKbvLwOyyaDy8t/view?usp=sharing
https://www.rstudio.com/products/rstudio/download/#download
getwd ()
(see the current working directory)
setwd("E:/Dell laptop back up/ laptop /Invited Talk ")
(set your own working directory where your all stuff is saved)
str(insurance)
table(insurance$region)
summary(insurance)
age sex bmi children smoker region
Min. :18.00 female:662 Min. :16.00 Min. :0.000 no :1064 northeast:324
1st Qu.:27.00 male :676 1st Qu.:26.30 1st Qu.:0.000 yes: 274 northwest:325
Median :39.00 Median :30.40 Median :1.000 southeast:364
Mean :39.21 Mean :30.67 Mean :1.095 southwest:325
3rd Qu.:51.00 3rd Qu.:34.70 3rd Qu.:2.000
Max. :64.00 Max. :53.10 Max. :5.000
expenses
Min. : 1122
1st Qu.: 4740
Median : 9382
Mean :13270
3rd Qu.:16640
Max. :63770
summary(insurance$expenses)
Min. 1st Qu. Median Mean 3rd Qu. Max.
1122 4740 9382 13270 16640 63770
Coefficients:
(Intercept) age children bmi sexmale
-11941.6 256.8 475.7 339.3 -131.4
smokeryes regionnorthwest regionsoutheast regionsouthwest
23847.5 -352.8 -1035.6 -959.3
summary (lr_model)
Call:
lm(formula = expenses ~ age + children + bmi + sex + smoker +
region, data = insurance)
Residuals:
Min 1Q Median 3Q Max
-11302.7 -2850.9 -979.6 1383.9 29981.7
err = residuals(lr_model)
> hist(err) // should be Gaussian curve
summary (lr_model1)
Call:
lm(formula = expenses ~ age2 + children + bmi + sex + +smoker +
region, data = insurance)
Residuals:
Min 1Q Median 3Q Max
-11426.9 -2857.9 -935.7 1307.2 30662.4
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -7551.8397 925.4338 -8.160 7.67e-16 ***
age2 3.2535 0.1476 22.039 < 2e-16 ***
children 613.0908 136.9329 4.477 8.21e-06 ***
bmi 335.6486 28.4557 11.795 < 2e-16 ***
sexmale -136.4240 331.1067 -0.412 0.6804
summary (lr_model5)
Call:
lm(formula = expenses ~ age + age2 + children + bmi + sex + +bmi *
smoker + region, data = insurance)
Residuals:
Min 1Q Median 3Q Max
-17297.1 -1656.0 -1262.7 -727.8 24161.6
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 139.0053 1363.1359 0.102 0.918792
age -32.6181 59.8250 -0.545 0.585690
age2 3.7307 0.7463 4.999 6.54e-07 ***
children 678.6017 105.8855 6.409 2.03e-10 ***