AI Principles and Applications

Kaggle, Open Data Sets, etc.


Process of machine learning
• Data cleaning and formatting
• Exploratory data analysis
• Feature engineering and selection
• Training a model
• Evaluate the best model on the testing set
• Interpret the model results
• Draw conclusions and document work

Farooq Anjum 2
What is Kaggle and how to use it

Active Competitions

Zillow home value prediction

Train-Test split
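
As a rough illustration of what a train-test split does, here is a minimal sketch using only the Python standard library. The function name `train_test_split`, the 20% test fraction, and the seed are assumptions made for this example; the slide does not say how the competition data was actually divided, and a competition may split differently (for example by date rather than at random).

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and return (train, test) lists.

    Hypothetical helper for illustration; libraries such as scikit-learn
    ship their own, more complete version of this idea.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                   # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)   # fixed seed makes the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # stand-in for 10 labelled examples
train, test = train_test_split(rows, test_fraction=0.2, seed=42)
print(len(train), "training rows,", len(test), "test rows")
```

The model only ever sees `train` while it is being fitted; `test` is held back so that the final score reflects data the model has not seen.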

Files

Features

Features

Features

Process of machine learning
• Kaggle did the following:
    • Data cleaning and formatting
    • Exploratory data analysis
    • Feature engineering and selection
• You had to do the following:
    • Training a model
    • Evaluate the best model on the testing set
    • Interpret the model results
    • Draw conclusions and document work
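
The "training a model" and "evaluate on the testing set" steps can be sketched in a few lines. This is a toy example, not the competition's method: it fits a one-feature least-squares line in plain Python, the numbers are made up, and mean absolute error is just one common choice of metric. The names `fit_line` and `mean_absolute_error` are hypothetical helpers.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors, ignoring their sign."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up data: floor area (thousands of sq ft) vs. price (thousands of dollars).
train_x = [1.0, 1.5, 2.0, 2.5, 3.0]
train_y = [150.0, 210.0, 240.0, 310.0, 350.0]
test_x = [1.2, 2.8]
test_y = [170.0, 330.0]

# Training uses only the training set.
slope, intercept = fit_line(train_x, train_y)

# Evaluation uses only the held-out test set.
predictions = [slope * x + intercept for x in test_x]
mae = mean_absolute_error(test_y, predictions)
print(f"slope={slope:.1f}, intercept={intercept:.1f}, test MAE={mae:.1f}")
```

Interpreting the result here is straightforward: the slope is the estimated price change per extra thousand square feet, and the test error says how far off the predictions are on unseen homes.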

Process of machine learning
• In real life you have to do all the tasks below:
    • Data cleaning and formatting
    • Exploratory data analysis
    • Feature engineering and selection
    • Training a model
    • Evaluate the best model on the testing set
    • Interpret the model results
    • Draw conclusions and document work
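
To give a feel for the first steps that fall to you outside a prepared competition, here is a small sketch of data cleaning and a simple engineered feature in plain Python. The records, the field names (`sqft`, `year_built`, `price`), the median fill, and the reference year are all assumptions invented for this example, not taken from any real dataset.

```python
from statistics import median

# Made-up raw records as they might arrive from a file: all strings,
# with stray whitespace and missing values.
raw_rows = [
    {"sqft": "1500", "year_built": "1990", "price": "300000"},
    {"sqft": "", "year_built": "2005", "price": "450000"},
    {"sqft": "2100", "year_built": "1975", "price": ""},
    {"sqft": " 1800 ", "year_built": "2010", "price": "390000"},
]


def to_float(text):
    """Convert a string to float, or None if it is empty."""
    text = text.strip()
    return float(text) if text else None


def clean(rows, reference_year=2020):
    """Parse, drop rows with no target, fill missing sqft, add an age feature.

    Assumes `year_built` is always present, which holds for this toy data.
    """
    parsed = [{key: to_float(value) for key, value in row.items()} for row in rows]
    # A row without the target (price) cannot be used for training.
    parsed = [row for row in parsed if row["price"] is not None]
    # Fill missing floor area with the median of the known values.
    fill_value = median(row["sqft"] for row in parsed if row["sqft"] is not None)
    for row in parsed:
        if row["sqft"] is None:
            row["sqft"] = fill_value
        # Feature engineering: derive the home's age from its build year.
        row["age"] = reference_year - row["year_built"]
    return parsed


cleaned = clean(raw_rows)
for row in cleaned:
    print(row)
```

Whether to drop or fill a missing value is a judgment call that depends on the dataset; the median fill here is only one simple option.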

Open Data sets

Data.gov dataset areas

An example area – Agriculture

Task
• Explore the datasets in the agriculture OR climate area
• Focusing on one dataset, explain what knowledge you can extract from its data and how this would benefit people
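
One possible starting point for the task is a quick summary of a dataset once it has been downloaded as CSV. The sketch below uses a tiny inline table in place of a real file; the column names and values are invented for illustration and do not come from any actual open dataset. The helper `mean_yield_by_crop` is likewise hypothetical.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Invented sample standing in for a downloaded CSV file.
SAMPLE_CSV = """crop,year,yield_tonnes_per_hectare
wheat,2019,3.0
wheat,2020,3.4
maize,2019,5.5
maize,2020,5.9
"""


def mean_yield_by_crop(csv_text):
    """Group rows by crop and return the average yield for each one."""
    yields = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        yields[row["crop"]].append(float(row["yield_tonnes_per_hectare"]))
    return {crop: mean(values) for crop, values in yields.items()}


summary = mean_yield_by_crop(SAMPLE_CSV)
for crop, average in sorted(summary.items()):
    print(f"{crop}: {average:.2f} tonnes per hectare on average")
```

A summary like this is only a first look, but it already suggests the kind of question the task asks for, such as which crops yield more and whether yields change from year to year.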
