Assignment - #4 - Decision Tree and Ensemble - Final
Name:
_______________________________________________________________________________
Email id:
_____________________________________________________________________________
Assignment: Heart Disease Classification using Machine Learning (Decision tree and Ensemble
methods).
Objective:
Your task is to implement, evaluate, and compare various machine learning classifiers for predicting heart
disease. Employ advanced techniques for a thorough analysis of the data and classifiers’ performance.
Dataset Description: The dataset consists of 11 features and a target variable. Of the features, 6 are nominal
and 5 are numeric. The target variable, which we must predict, is binary: 1 means the patient is at risk of
heart disease and 0 means the patient is normal.
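A first structural check of a table shaped like the one described above might look like the sketch below. The assignment does not name the file or its columns, so the DataFrame here is a randomly generated stand-in, and every column name (num_1, nom_1, target, and so on) is hypothetical. With the real data, the construction step would be replaced by a call such as pandas.read_csv on the provided file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 5 numeric features, 6 nominal features, 1 binary target.
# None of these column names come from the assignment; they are placeholders.
rng = np.random.default_rng(0)
n_rows = 100
data = {f"num_{i}": rng.normal(size=n_rows) for i in range(1, 6)}
data.update({f"nom_{i}": rng.integers(0, 3, size=n_rows) for i in range(1, 7)})
data["target"] = rng.integers(0, 2, size=n_rows)
df = pd.DataFrame(data)

# Nominal variables are often stored as integer codes, so dtype alone will not
# reveal them. Marking them as categorical makes the distinction explicit.
nominal_cols = [f"nom_{i}" for i in range(1, 7)]
df = df.astype({col: "category" for col in nominal_cols})

# Structure, data types, and summary statistics.
print(df.shape)
print(df.dtypes)
print(df.describe())                    # numeric columns only
print(df[nominal_cols].describe())      # count, unique, top, freq
print(df["target"].value_counts())      # class balance

# Recover the numeric feature columns (everything numeric except the target).
numeric_cols = list(df.drop(columns="target").select_dtypes(include="number").columns)
```

Checking the class balance early is useful because it affects how the accuracy figures reported later should be read.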
Tasks:
Load and inspect the dataset’s structure, summary statistics, and data types.
Feature Selection: Decide which features are relevant for the classification task.
Data Splitting: Partition the dataset into training and testing sets (80-20 split).
Define parameters for SVM, Logistic Regression, Decision Tree, and Random Forest classifiers.
Initialize and train using a pipeline comprising StandardScaler and the model.
Compute and report accuracy, classification report, and confusion matrix on the testing set.
Visualize and interpret the confusion matrix.
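The splitting, pipeline, and evaluation steps listed above could be sketched as follows. This is a sketch under assumptions, not a reference solution: the data is a synthetic stand-in generated with make_classification because the real file is not specified, the hyperparameter values are placeholders rather than tuned choices, and the names models and results are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the heart disease data: 11 features, binary target.
X, y = make_classification(n_samples=500, n_features=11, n_informative=6,
                           random_state=42)

# 80-20 split; stratify keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Placeholder parameters for each classifier (assumed values, not tuned).
models = {
    "SVM": SVC(kernel="rbf", C=1.0, random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

results = {}
for name, model in models.items():
    # The scaler is fitted on the training data only, inside the pipeline.
    pipe = Pipeline([("scaler", StandardScaler()), ("model", model)])
    pipe.fit(X_train, y_train)
    y_pred = pipe.predict(X_test)

    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }
    print(f"{name}: accuracy = {results[name]['accuracy']:.3f}")
    print(classification_report(y_test, y_pred))
    print(results[name]["confusion_matrix"])
```

In the confusion matrix returned by scikit-learn, rows are true classes and columns are predicted classes, so for labels 0 and 1 the bottom-left cell counts at-risk patients predicted as normal. For a plot, ConfusionMatrixDisplay.from_predictions in sklearn.metrics is one option, assuming matplotlib is installed.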
Ensemble Learning:
Construct a Voting Classifier using the classifiers trained above. Experiment with both ‘hard’ and ‘soft’
voting strategies.
Evaluate and visualize its performance, drawing comparisons with the individual models.
Offer insights into which model(s) performed best and hypothesize why.
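A minimal sketch of the hard and soft voting comparison described above is given here. It assumes synthetic stand-in data from make_classification and placeholder hyperparameters, and the helper make_estimators is a hypothetical name introduced for illustration. Note that scikit-learn's VotingClassifier fits clones of the estimators it is given, so it retrains them rather than reusing already fitted models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the heart disease data (the real file is not specified).
X, y = make_classification(n_samples=500, n_features=11, n_informative=6,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)


def make_estimators():
    """Return fresh (name, pipeline) pairs; hypothetical helper for this sketch."""
    return [
        # probability=True is needed so SVC exposes predict_proba for soft voting.
        ("svm", make_pipeline(StandardScaler(),
                              SVC(probability=True, random_state=42))),
        ("logreg", make_pipeline(StandardScaler(),
                                 LogisticRegression(max_iter=1000))),
        ("tree", make_pipeline(StandardScaler(),
                               DecisionTreeClassifier(max_depth=5, random_state=42))),
        ("forest", make_pipeline(StandardScaler(),
                                 RandomForestClassifier(n_estimators=200,
                                                        random_state=42))),
    ]


scores = {}
for voting in ("hard", "soft"):
    # hard: majority vote on predicted labels; soft: average predicted probabilities.
    ensemble = VotingClassifier(estimators=make_estimators(), voting=voting)
    ensemble.fit(X_train, y_train)
    scores[voting] = accuracy_score(y_test, ensemble.predict(X_test))
    print(f"{voting} voting accuracy: {scores[voting]:.3f}")
```

Soft voting requires every member to provide class probabilities, which is why the SVM is configured differently here than it would need to be for hard voting alone. Comparing these scores with the individual models' test accuracy gives a starting point for the discussion of which model performed best.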
Deliverables:
Code Notebook: Well-commented Jupyter Notebook with sections corresponding to the tasks outlined.
Ensure your code is clean, readable, and well-documented.
Report: Concise report presenting your approach, findings, visualizations, and recommendations. The
report should be structured, coherent, and professionally formatted.
Evaluation Criteria:
Analysis Depth: Extent of EDA, feature selection rationale, and hyperparameter tuning.
Model Evaluation: Appropriateness of metrics used, depth of evaluation, and clarity in visualizations.
Report Quality: Clarity, structure, depth of insight, and quality of writing in the report.
Deadline: