CPE EL3 Lab 5 Validation Set
LABORATORY EXERCISES
I. OBJECTIVES
Internet connection
Google account
Google Colaboratory
III. CONCEPTS/THEORY/CONTENT
This exercise examines the technique of splitting a data set into training, validation, and test sets.
The split helps reduce overfitting, a problem that arises when a model is trained so closely to its
training data that it predicts well on that data but performs poorly on new, unseen data. The three
partitions serve different roles:
Training set – the records used for learning the relationship between the features and the label
Validation set – the records used for evaluating the model's prediction performance during development. Results and observations from this set are the feedback used to tweak or adjust the model
Test set – the records used for confirming the model's performance
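The three partitions described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the notebook's own code: the function name split_dataset, the 70/15/15 proportions, and the fixed seed are all assumptions chosen for the example.

```python
import random


def split_dataset(records, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle records and partition them into train, validation, and test lists.

    The fractions and the seed are illustrative assumptions, not values
    prescribed by the exercise. Whatever is left after the training and
    validation slices becomes the test set.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty test set")

    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # shuffle so each partition is representative

    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, validation, test


records = list(range(100))  # stand-in for real data rows
train, validation, test = split_dataset(records)
print(len(train), len(validation), len(test))
```

Shuffling before slicing matters when the source file is sorted by some column. Without it, the three partitions could come from different parts of the distribution.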
The validation set also keeps the model from “getting too familiar” with the test set, so the test set
remains representative of the unseen data to which the model will ultimately be applied. The process
cycle is shown in Figure 1.
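The cycle can be illustrated with a small, self-contained sketch: train on the training set, compare candidate settings on the validation set, and touch the test set only once at the end. Everything here is an assumption made for illustration: the synthetic data, the hand-written gradient descent in train_linear, and the list of candidate learning rates. None of it is the notebook's actual model code.

```python
import random


def train_linear(train, learning_rate, epochs=200):
    """Fit y = w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(train)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in train) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mse(params, records):
    """Mean squared error of the linear model (w, b) on the given records."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in records) / len(records)


# Hypothetical synthetic data: y = 3x + 1 plus noise, with x drawn from [0, 1).
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.random()
    data.append((x, 3 * x + 1 + rng.gauss(0, 0.1)))

# The rows are already in random order, so plain slices give the partitions.
train, validation, test = data[:140], data[140:170], data[170:]

# Tweak step: only the validation set is used to compare candidate settings.
candidates = [0.001, 0.01, 0.1, 0.5]
best_rate, best_params, best_val = None, None, float("inf")
for rate in candidates:
    params = train_linear(train, rate)
    val_error = mse(params, validation)
    if val_error < best_val:
        best_rate, best_params, best_val = rate, params, val_error

# Confirm step: the test set is evaluated once, after all tweaking is finished.
test_error = mse(best_params, test)
print(f"chosen learning rate: {best_rate}")
print(f"validation MSE: {best_val:.4f}  test MSE: {test_error:.4f}")
```

Because the learning rate is selected on the validation set, the final test error is measured on data that played no part in training or tuning.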
1. Open the Google Developers (2020) Colaboratory notebook for this exercise:
https://colab.research.google.com/github/google/eng-edu/blob/main/ml/cc/exercises/validation_and_test_sets.ipynb?utm_source=mlcc&utm_campaign=colab-external&utm_medium=referral&utm_content=validation_tf2-colab&hl=en
2. Save a personal copy of the notebook: File > Save a copy in Drive
3. Complete all the exercises and tasks in the notebook. Define a function for making predictions and use it.
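One possible shape for the prediction function in step 3 is sketched below. It is only a starting point under stated assumptions: the name predict_values, its parameters, and the toy_model stand-in are hypothetical, and in the notebook the callable would wrap whatever trained model the exercise produces.

```python
def predict_values(predict_fn, features, labels, n=5):
    """Return (feature, label, prediction) rows for the first n records.

    predict_fn is any callable that maps one feature value to one prediction.
    It is a hypothetical stand-in for the trained model's prediction call.
    """
    rows = []
    for feature, label in list(zip(features, labels))[:n]:
        rows.append((feature, label, predict_fn(feature)))
    return rows


def toy_model(x):
    """Stand-in model with made-up weights, used only to exercise the helper."""
    return 2.0 * x + 1.0


features = [1.0, 2.0, 3.0]  # made-up feature values
labels = [3.1, 4.9, 7.2]    # made-up true labels

rows = predict_values(toy_model, features, labels)
print(f"{'feature':>8} {'label':>8} {'predicted':>10}")
for feature, label, prediction in rows:
    print(f"{feature:>8.2f} {label:>8.2f} {prediction:>10.2f}")
```

Printing the label next to the prediction makes it easy to judge by eye how far off the model is on individual records.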
https://developers.google.com/machine-learning/crash-course/validation/another-partition