Model Building in RStudio: KNN, NB and GLM Models
CA-2
Registration no:11615614
> library(caret)
> heart=read.csv(file.choose(), header=TRUE)
> View(heart)
> data=(heart)
> heart$sex=as.factor(heart$sex)
> set.seed(1234)
> intrain=createDataPartition(y=heart$sex, p=0.75, list=F)
> training=heart[intrain,]
> testing=heart[-intrain,]
> dim(training)
[1] 228 14
> dim(testing)
[1] 75 14
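The split above relies on createDataPartition from caret, which samples within each level of the outcome so that the class proportions carry over to both subsets. A minimal base-R sketch of that idea follows; it illustrates the technique and is not caret's own implementation. The class counts (96 and 207) are assumptions chosen to be consistent with the 228/75 split and the 24/51 test-set class totals reported in this document, and `stratified_split` is a hypothetical helper.

```r
# Hypothetical helper: pick a fraction p of the row indices within each class.
stratified_split <- function(y, p) {
  idx_by_class <- split(seq_along(y), y)
  picked <- lapply(idx_by_class, function(idx) {
    n_take <- ceiling(length(idx) * p)        # round up within each class
    idx[sample.int(length(idx), n_take)]      # sample positions, then map to row indices
  })
  sort(unlist(picked, use.names = FALSE))
}

set.seed(1234)
# Assumed outcome vector: 96 rows of class 0 and 207 rows of class 1.
sex <- factor(rep(c(0, 1), times = c(96, 207)))
in_train <- stratified_split(sex, p = 0.75)

train_sex <- sex[in_train]
test_sex  <- sex[-in_train]

length(train_sex)   # 228 training rows
length(test_sex)    # 75 testing rows
table(test_sex)     # 24 rows of class 0, 51 rows of class 1
```

Because the sampling is done per class, the subset sizes are fixed by the class counts and do not depend on the seed; only which rows land in each subset changes.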
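> # Note: "metdod" is a misspelling of "method", so the method argument is never
> # set and train() uses its default. That is why the summary below is headed
> # "Random Forest" and reports mtry. The intended call is
> # train(sex~., data=training, method="knn").
> modelfit=train(sex~.,data=training,metdod="knn")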
> modelfit
Random Forest
228 samples
13 predictor
2 classes: '0', '1'
No pre-processing
Resampling: Bootstrapped (25 reps)
Summary of sample sizes: 228, 228, 228, 228, 228, 228, ...
Resampling results across tuning parameters:
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mtry = 2.
          Reference
Prediction  0  1
         0 10  2
         1 14 49
Accuracy : 0.7867
95% CI : (0.6768, 0.8729)
No Information Rate : 0.68
P-Value [Acc > NIR] : 0.02855
Kappa : 0.435
Sensitivity : 0.4167
Specificity : 0.9608
Pos Pred Value : 0.8333
Neg Pred Value : 0.7778
Prevalence : 0.3200
Detection Rate : 0.1333
Detection Prevalence : 0.1600
Balanced Accuracy : 0.6887
'Positive' Class : 0
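The statistics listed above can be reproduced from the four counts in the confusion matrix. The sketch below uses only base R and assumes, as the output states, that class 0 is the 'positive' class; the variable names are illustrative.

```r
# Counts copied from the confusion matrix above
# (rows = prediction, columns = reference; filled column by column).
cm <- matrix(c(10, 14, 2, 49), nrow = 2,
             dimnames = list(Prediction = c("0", "1"), Reference = c("0", "1")))

tp <- cm["0", "0"]   # predicted 0, reference 0 (class 0 is 'positive')
fn <- cm["1", "0"]   # predicted 1, reference 0
fp <- cm["0", "1"]   # predicted 0, reference 1
tn <- cm["1", "1"]   # predicted 1, reference 1
n  <- sum(cm)

accuracy             <- (tp + tn) / n
sensitivity          <- tp / (tp + fn)
specificity          <- tn / (tn + fp)
pos_pred_value       <- tp / (tp + fp)
neg_pred_value       <- tn / (tn + fn)
prevalence           <- (tp + fn) / n
detection_rate       <- tp / n
detection_prevalence <- (tp + fp) / n
balanced_accuracy    <- (sensitivity + specificity) / 2

round(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity, pos_pred_value = pos_pred_value,
        neg_pred_value = neg_pred_value, prevalence = prevalence,
        detection_rate = detection_rate,
        detection_prevalence = detection_prevalence,
        balanced_accuracy = balanced_accuracy), 4)
```

The low sensitivity (10 of 24 class-0 cases found) alongside the high specificity (49 of 51 class-1 cases found) shows that this model leans toward predicting the majority class.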
228 samples
13 predictor
2 classes: '0', '1'
No pre-processing
Resampling: Bootstrapped (25 reps)
Summary of sample sizes: 228, 228, 228, 228, 228, 228, ...
Resampling results:
  Accuracy   Kappa
  0.6684758  0.1855419
> predictions1=predict(modelfit1,newdata=testing)
> predictions1
 [1] 0 1 1 1 1 0 1 1 0 1 1 0 1 1 1 1 1 1 0 0 1 1 1 1 1 1 0 1 1 1 0 0 0 1 0 0 1 0 0 1 1 1 1 1 1 1 1
[48] 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Levels: 0 1
> confusionMatrix(predictions1,testing$sex)
Confusion Matrix and Statistics
          Reference
Prediction  0  1
         0  9  6
         1 15 45
Accuracy : 0.72
95% CI : (0.6044, 0.8176)
No Information Rate : 0.68
P-Value [Acc > NIR] : 0.27122
Kappa : 0.2857
Sensitivity : 0.3750
Specificity : 0.8824
Pos Pred Value : 0.6000
Neg Pred Value : 0.7500
Prevalence : 0.3200
Detection Rate : 0.1200
Detection Prevalence : 0.2000
Balanced Accuracy : 0.6287
'Positive' Class : 0
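The Kappa and No Information Rate lines above put the raw accuracy in context. The base-R sketch below recomputes both from the counts. The final step, a one-sided exact binomial test of accuracy against the no-information rate, is an assumption about how a "P-Value [Acc > NIR]" figure of this kind is typically obtained, so its result is not asserted here.

```r
# Counts copied from the confusion matrix above
# (rows = prediction, columns = reference; filled column by column).
cm <- matrix(c(9, 15, 6, 45), nrow = 2,
             dimnames = list(Prediction = c("0", "1"), Reference = c("0", "1")))
n <- sum(cm)

# Cohen's kappa: agreement beyond what the row/column totals give by chance.
observed    <- sum(diag(cm)) / n                      # equals the accuracy
expected    <- sum(rowSums(cm) * colSums(cm)) / n^2   # chance agreement
cohen_kappa <- (observed - expected) / (1 - expected)

# No-information rate: accuracy of always predicting the larger reference class.
nir <- max(colSums(cm)) / n

# Assumed approach: one-sided exact binomial test of accuracy against the NIR.
p_value <- binom.test(sum(diag(cm)), n, p = nir, alternative = "greater")$p.value

round(c(kappa = cohen_kappa, nir = nir), 4)
p_value
```

With an accuracy of 0.72 against a no-information rate of 0.68, this model improves only slightly on always predicting class 1, which is what the small Kappa reflects.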
228 samples
13 predictor
2 classes: '0', '1'
No pre-processing
Resampling: Bootstrapped (25 reps)
Summary of sample sizes: 228, 228, 228, 228, 228, 228, ...
Resampling results across tuning parameters:
          Reference
Prediction  0  1
         0 16  6
         1  8 45
Accuracy : 0.8133
95% CI : (0.7067, 0.894)
No Information Rate : 0.68
P-Value [Acc > NIR] : 0.0073
Kappa : 0.5614
Sensitivity : 0.6667
Specificity : 0.8824
Pos Pred Value : 0.7273
Neg Pred Value : 0.8491
Prevalence : 0.3200
Detection Rate : 0.2133
Detection Prevalence : 0.2933
Balanced Accuracy : 0.7745
'Positive' Class : 0
INTERPRETATION
1. From the first confusion matrix above, the accuracy is 0.7867, specificity is 0.9608 and
sensitivity is 0.4167 for the model intended as KNN (note that its summary is headed "Random Forest").
2. From the third confusion matrix above, the accuracy is 0.8133, specificity is
0.8824 and sensitivity is 0.6667 for the NB model.
3. From the second confusion matrix above, the accuracy is 0.72, specificity is 0.8824
and sensitivity is 0.3750 for the GLM model.
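❖ After comparing the three models, I can see that the NB model is the best: it has the highest accuracy (0.8133), the highest sensitivity (0.6667) and the highest balanced accuracy (0.7745) on the test set.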
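The comparison can be laid out side by side from the three confusion matrices. The sketch below is one way to do it in base R; `summarise_cm` is a hypothetical helper, and the model labels follow the INTERPRETATION section.

```r
# Hypothetical helper: summary statistics from a 2x2 table whose first
# row and column are the 'positive' class (class 0 here).
summarise_cm <- function(cm) {
  sens <- cm[1, 1] / sum(cm[, 1])
  spec <- cm[2, 2] / sum(cm[, 2])
  c(accuracy          = sum(diag(cm)) / sum(cm),
    sensitivity       = sens,
    specificity       = spec,
    balanced_accuracy = (sens + spec) / 2)
}

# Counts copied from the three confusion matrices in this document
# (filled column by column: reference 0 first, then reference 1).
models <- list(
  KNN = matrix(c(10, 14, 2, 49), nrow = 2),
  NB  = matrix(c(16, 8, 6, 45), nrow = 2),
  GLM = matrix(c(9, 15, 6, 45), nrow = 2)
)

# One row per model, one column per statistic.
comparison <- round(t(sapply(models, summarise_cm)), 4)
comparison

# Model with the highest test-set accuracy.
best <- rownames(comparison)[which.max(comparison[, "accuracy"])]
best   # "NB"
```

Accuracy alone can flatter a model on an imbalanced test set (24 versus 51 cases here), so balanced accuracy is included as a second check; NB ranks first on both.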
Reference
https://www.kaggle.com/ronitf/heart-disease-uci