Assignment 1
INTRODUCTION
This competition is inspired by the Bank Marketing dataset available at the UCI Repository.
We had to use KNIME and Python to predict whether the client will subscribe to a term
deposit (variable y).
Submissions were evaluated using AUC.
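To make the metric concrete, here is a minimal sketch of what AUC measures: the fraction of (positive, negative) pairs in which the positive example receives the higher predicted score, with ties counted as half. The function name auc_score and the four-row toy data are illustrative assumptions, not part of the competition; sklearn.metrics.roc_auc_score computes the same quantity.

```python
def auc_score(y_true, y_score):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy labels and scores (hypothetical, not competition data).
example_auc = auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(example_auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is useful context when reading the scores in this deck.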
SUMMARY
Model                                          Initial   Best      Difference
Random Forest (KNIME)                          0.91850   0.92003   0.00153
KNN (KNIME)                                    0.92319   0.92773   0.00454
Tree Ensemble (KNIME)                          0.75983   0.75983   0
Naïve Bayes (KNIME)                            0.59427   0.59427   0
Tree Ensemble with data manipulation (KNIME)   0.78910   0.92793   0.13883
Random Forest (Python)                         0.87445   0.91968   0.04523
Stacking (Python)                              0.92625   0.92856   0.00231
K NEAREST NEIGHBORS (KNIME)
• I also tried K nearest neighbors.
• I got a score of 0.75983 (knearsetneighbor.csv), which was much lower than
all the other methods, so I was a bit disappointed.
• I tried changing the trees to 2, since we are doing binary classification, and
my result dropped even further, to 0.73965 (knearsetneighbor2.csv).
• This prompted me to stop using KNN.
STACKING (PYTHON)
• I then used sklearn.ensemble.StackingClassifier with logistic regression as the
final estimator.
• I stacked RandomForestClassifier and ExtraTreesClassifier first,
which gave me a score of 0.92525.
• I then added GradientBoostingClassifier, which raised the score to
0.92625.
• I then changed some parameters (bootstrap, warm_start) and got up to
0.92856 (hello20.csv).
• This is the best accuracy I got.
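The stacking setup described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data is a synthetic stand-in generated with make_classification rather than the bank marketing data, and the hyperparameter values (n_estimators, cv, the bootstrap settings) are placeholders, not the values behind the scores reported here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): replace with the real features and variable y.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners named in the slides; parameter values are placeholders.
base_estimators = [
    ("rf", RandomForestClassifier(n_estimators=50, bootstrap=True, random_state=0)),
    ("et", ExtraTreesClassifier(n_estimators=50, bootstrap=False, random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
]

# Logistic regression combines the base learners' out-of-fold predictions.
stack = StackingClassifier(
    estimators=base_estimators,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

# AUC needs class-1 probabilities rather than hard 0/1 labels.
proba = stack.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)
print(f"hold-out AUC: {auc:.5f}")
```

Scoring on a held-out split like this gives a local estimate to compare against before submitting a predictions file.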
THANK YOU
Zaid Bin Haris 22868