Lab Assessment 2 - Question
Load and explore the dataset provided to understand its structure, determine the type
of supervised problem (classification or regression), and apply appropriate techniques
and algorithms to solve the problem.
The dataset contains 16 input features; the final column serves as the target
variable.
1. Load the dataset as df and split it into training and test sets. Note that the file is
in CSV format, but the delimiter is not the usual comma (“,”); it is a semicolon
(“;”). Use the delimiter parameter of the read_csv method to load the data
correctly. [1]
Note: The missing values in the dataset are represented by the string “unknown”.
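A minimal sketch of the loading step, assuming pandas and scikit-learn are available. The inline sample data and its column names are hypothetical stand-ins for the real file, and passing na_values="unknown" is one possible way to handle the placeholder string, not a required one.

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the real file; replace with the actual CSV path.
sample_csv = io.StringIO(
    "age;job;balance;y\n"
    "30;admin.;1500;no\n"
    "45;unknown;200;yes\n"
    "52;technician;-50;no\n"
    "28;unknown;900;yes\n"
    "39;services;0;no\n"
    "61;retired;3200;yes\n"
)

# delimiter=";" handles the semicolon separator; na_values="unknown"
# converts the placeholder string to NaN while loading.
df = pd.read_csv(sample_csv, delimiter=";", na_values="unknown")

# The last column is treated as the target, the rest as input features.
X = df.iloc[:, :-1]
y = df.iloc[:, -1]

# Stratifying keeps the class proportions similar in both splits
# (this assumes the task turns out to be classification).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, stratify=y, random_state=42
)
```

With the real file, the io.StringIO object would be replaced by the path to the downloaded CSV.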
3.2. Create pipelines for both numeric and categorical features. [1.5]
4. Train ensemble models.
[This question paper contains two printed pages]
4.1. Select three base models and construct a basic stacking ensemble model.
Keep the default algorithm as the final estimator for the stacking
ensemble. Ensure to create a pipeline model for each base model and
train the stacking model accordingly. [3]
4.3. Within the pipeline of your boosting model, add a feature selection
technique. Use any preferred feature selection technique to select the top
10 features. [2]
5.2. Compare the stacking model and the boosting model. Which one performs
better, and why? [1.5]
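The preprocessing, stacking, and boosting steps above could be sketched as follows. This is one possible approach, not a model answer: it assumes the task is classification, uses a synthetic data frame with hypothetical column names in place of the real dataset, and picks logistic regression, a decision tree, and k-nearest neighbours as base models, gradient boosting as the boosting model, and SelectKBest with f_classif as the feature selector purely for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real data (all column names are hypothetical).
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.integers(18, 80, n).astype(float),
    "balance": rng.normal(1000, 500, n),
    "duration": rng.normal(200, 60, n),
    "job": rng.choice(
        ["admin", "tech", "retired", "student", "services", "manager"], n
    ),
    "marital": rng.choice(["single", "married", "divorced"], n),
})
df["target"] = np.where(df["balance"] + rng.normal(0, 300, n) > 1000, "yes", "no")
# Blank out some entries to mimic missing values.
df.loc[rng.random(n) < 0.1, "job"] = np.nan
df.loc[rng.random(n) < 0.1, "age"] = np.nan

X, y = df.drop(columns="target"), df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
numeric_cols = X.select_dtypes(include="number").columns.tolist()
categorical_cols = X.select_dtypes(exclude="number").columns.tolist()


def make_preprocessor():
    """Separate pipelines for numeric and categorical features."""
    numeric_pipe = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical_pipe = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ])
    # sparse_threshold=0 forces a dense output array.
    return ColumnTransformer(
        [("num", numeric_pipe, numeric_cols),
         ("cat", categorical_pipe, categorical_cols)],
        sparse_threshold=0,
    )


def with_prep(model):
    """Wrap a model in its own preprocessing pipeline."""
    return Pipeline([("prep", make_preprocessor()), ("clf", model)])


# Stacking: three base pipelines; final_estimator is left at its default.
stacking = StackingClassifier(estimators=[
    ("lr", with_prep(LogisticRegression(max_iter=1000))),
    ("tree", with_prep(DecisionTreeClassifier(max_depth=5, random_state=0))),
    ("knn", with_prep(KNeighborsClassifier())),
])
stacking.fit(X_train, y_train)

# Boosting: feature selection keeps the 10 highest-scoring encoded features.
boosting = Pipeline([
    ("prep", make_preprocessor()),
    ("select", SelectKBest(score_func=f_classif, k=10)),
    ("clf", GradientBoostingClassifier(random_state=0)),
])
boosting.fit(X_train, y_train)

# Compare both models on the held-out test set.
scores = {
    name: accuracy_score(y_test, model.predict(X_test))
    for name, model in [("stacking", stacking), ("boosting", boosting)]
}
print(scores)
```

Accuracy is used here only as a simple common yardstick. If the real target is imbalanced, a metric such as F1 or ROC AUC may give a fairer comparison.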
Use the link below to access the dataset for this task.
https://drive.google.com/file/d/1PZJH6WSZLwhYkFOKNyO57BxHUrUhja0a/view?usp=sharing
6. Load the dataset and create a new data frame named rdf. Assign the column
names of features as “x” and “y”. [1]
8. Use the elbow method to determine the optimal value of k for the KMeans
algorithm. Provide a reason for the selection of a specific k value. [2]
9. Create the KMeans model using the k value you got from the above step and
train the model. [1]
10. Plot your clustering result showing different clusters in different colors. Also,
plot your cluster centroids. [2]
11. Evaluate your clustering model. Is your model good? Justify your answer. [2]
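The clustering steps above could be sketched as follows. This is a hedged illustration, not a model answer: the points come from make_blobs as a synthetic stand-in for the linked dataset, the range of k values and the choice k = 3 are assumptions tied to that synthetic data, and plot_results is a hypothetical helper name. The silhouette score is used as one common way to judge cluster quality.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the real file; with real data, load it with pandas
# and then set rdf.columns = ["x", "y"].
points, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)
rdf = pd.DataFrame(points, columns=["x", "y"])

# Elbow method: record the within-cluster sum of squares (inertia) per k.
inertias = {}
for k in range(1, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(rdf)
    inertias[k] = km.inertia_

# Assumed choice for this synthetic data; with real data, pick the k where
# the inertia curve stops dropping sharply.
best_k = 3
model = KMeans(n_clusters=best_k, n_init=10, random_state=42)
labels = model.fit_predict(rdf)

# Silhouette score lies in [-1, 1]; values near 1 indicate compact,
# well-separated clusters.
score = silhouette_score(rdf, labels)
print(f"silhouette score for k={best_k}: {score:.3f}")


def plot_results(rdf, model, labels, inertias):
    """Draw the elbow curve and the clustered points with their centroids."""
    import matplotlib.pyplot as plt

    fig, (ax_elbow, ax_clusters) = plt.subplots(1, 2, figsize=(10, 4))
    ax_elbow.plot(list(inertias.keys()), list(inertias.values()), marker="o")
    ax_elbow.set_xlabel("k")
    ax_elbow.set_ylabel("inertia")
    ax_elbow.set_title("Elbow method")

    centers = model.cluster_centers_
    ax_clusters.scatter(rdf["x"], rdf["y"], c=labels, cmap="viridis", s=15)
    ax_clusters.scatter(centers[:, 0], centers[:, 1], c="red", marker="X",
                        s=200, label="centroids")
    ax_clusters.set_xlabel("x")
    ax_clusters.set_ylabel("y")
    ax_clusters.set_title("KMeans clusters")
    ax_clusters.legend()
    return fig
```

Calling plot_results(rdf, model, labels, inertias) returns a matplotlib figure that can be shown or saved. The plotting import is kept inside the helper so the numerical part runs without matplotlib installed.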
ALL THE BEST ☺