Improving Heart Failure Prediction
using PCA as a dimension reduction technique.
What my dataset looks like!
As you can see, I can't feed this dataset straight into my machine learning model because it contains a lot of categorical variables.
Data preparation
I explored my dataset to check for missing values and, fortunately, there are none.
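A minimal sketch of such a missing-value check with pandas. The small frame and its column names are hypothetical stand-ins, not the actual heart failure dataset.

```python
import pandas as pd

# Hypothetical toy frame standing in for the real dataset;
# the column names are assumptions for illustration only.
df = pd.DataFrame({
    "Age": [40, 49, 37, 54],
    "Sex": ["M", "F", "M", "M"],
    "Cholesterol": [289, 180, 283, 195],
})

# Count missing values per column; all zeros means nothing to impute.
missing_per_column = df.isnull().sum()
print(missing_per_column)

total_missing = int(missing_per_column.sum())
print("Total missing values:", total_missing)
```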
NB: When using One-Hot Encoding, remember to drop the first category of each variable (for example, drop_first=True in pandas get_dummies), as this avoids multicollinearity between the encoded columns.
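One possible way to do this, sketched with pandas get_dummies. The slides do not say which encoder was used, so the choice of get_dummies and the column names below are assumptions.

```python
import pandas as pd

# Hypothetical toy frame; column names and values are illustrative only.
df = pd.DataFrame({
    "Age": [40, 49, 37, 54],
    "Sex": ["M", "F", "M", "M"],
    "ChestPainType": ["ATA", "NAP", "ATA", "ASY"],
})

categorical_cols = ["Sex", "ChestPainType"]

# drop_first=True drops one dummy column per variable, so the remaining
# dummies are no longer perfectly collinear with each other.
encoded = pd.get_dummies(df, columns=categorical_cols, drop_first=True)
print(encoded.columns.tolist())
```

A variable with k categories ends up as k - 1 columns; the dropped category is implied when all of its dummies are zero.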
Scaling of my dataset
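A hedged sketch of one common way to scale features with scikit-learn. The slides do not name the scaler, so StandardScaler, the train/test split, and the synthetic data here are all assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic numeric features on very different scales (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(loc=[50, 200, 130], scale=[10, 40, 15], size=(100, 3))
y = rng.integers(0, 2, size=100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the scaler on the training split only, then reuse it on the test
# split so no information from the test data leaks into the scaling.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.mean(axis=0).round(3))
print(X_train_scaled.std(axis=0).round(3))
```

Scaling matters for PCA in particular, because components are driven by variance and an unscaled large-range feature would dominate them.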
Firstly, I created a dictionary for all the models I wanted to use and their parameters.
How I used GridSearchCV
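A minimal sketch of a model dictionary fed into GridSearchCV. The models, parameter grids, and synthetic data below are assumptions chosen for illustration; the slides do not list the actual models or grids.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the prepared (encoded and scaled) dataset.
X, y = make_classification(
    n_samples=200, n_features=16, n_informative=8, random_state=0
)

# One entry per model: the estimator plus the grid to search over.
model_params = {
    "logistic_regression": {
        "model": LogisticRegression(max_iter=1000),
        "params": {"C": [0.1, 1, 10]},
    },
    "decision_tree": {
        "model": DecisionTreeClassifier(random_state=0),
        "params": {"max_depth": [3, 5, None]},
    },
}

# Run a cross-validated grid search for each model and keep the best result.
results = []
for name, entry in model_params.items():
    search = GridSearchCV(entry["model"], entry["params"], cv=5, scoring="accuracy")
    search.fit(X, y)
    results.append({
        "model": name,
        "best_score": search.best_score_,
        "best_params": search.best_params_,
    })

for row in results:
    print(row)
```

Keeping models and grids in one dictionary makes it easy to add or remove a candidate without touching the search loop.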
Best scores without PCA
I should say that PCA doesn't guarantee that the accuracy of our model will increase. Usually it decreases, but computation is much lighter, and this is one of the trade-offs we consider in the industry.
PCA reduces the number of variables in a dataset while
maintaining as much information as possible. It transforms the
original variables into a new set of variables, which are called
principal components. These components are ordered so that
the first few retain most of the variation present in all of the
original variables.
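To illustrate the ordering of components, here is a small sketch on synthetic data (an assumption, not the heart failure dataset) that prints how much variance each principal component retains.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic 16-feature data with some correlated (redundant) columns.
X, _ = make_classification(
    n_samples=300, n_features=16, n_informative=6, n_redundant=6, random_state=0
)
X_scaled = StandardScaler().fit_transform(X)

# Keep every component so the full variance breakdown can be inspected.
pca = PCA()
pca.fit(X_scaled)

# Fraction of total variance per component, largest first.
ratios = pca.explained_variance_ratio_
cumulative = np.cumsum(ratios)

for i, (r, c) in enumerate(zip(ratios, cumulative), start=1):
    print(f"PC{i}: {r:.3f} (cumulative {c:.3f})")
```

The cumulative column shows how many components are needed before most of the variation is covered.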
How I implemented PCA
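A hedged sketch of reducing 16 features to 13 components with scikit-learn. The synthetic data, the logistic regression classifier, and the pipeline layout are assumptions; the slides do not state how the number of components was chosen, so 13 is simply set directly here.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with 16 features, like the encoded dataset.
X, y = make_classification(
    n_samples=300, n_features=16, n_informative=10, n_redundant=0, random_state=0
)

# Scale, project onto 13 principal components, then classify.
pipeline = make_pipeline(
    StandardScaler(),
    PCA(n_components=13),
    LogisticRegression(max_iter=1000),
)

# Cross-validate the whole pipeline so PCA is refit on each training fold.
scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"Mean CV accuracy: {mean_accuracy:.4f}")

# Fit only the preprocessing steps to confirm the reduced shape.
X_reduced = pipeline[:-1].fit_transform(X)
print(X_reduced.shape)
```

The accuracy printed here comes from synthetic data and says nothing about the scores reported on the slides.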
Model accuracy improved significantly after PCA, to about 89.67%. PCA reduced my features from the initial 16 to 13 components.
Thank you