
AI for Future

Workforce
Facilitator Guide
Module 11: AI Project Cycle applied to
Common Trade Applications
Total Session Duration: 1440 MINUTES
No. of Facilitators: 2
No. of Students: 40

The Intel® Digital Readiness Programs and Intel® AI for Future Workforce program are developed by Intel Corporation. ©Intel
Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and
brands may be claimed as the property of others. All rights reserved. Program dates and lesson plans are subject to change. Intel
technologies may require enabled hardware, software, or service activation. No product or component can be absolutely secure.
Results have been estimated or simulated. Intel does not control or audit third-party data. You should consult other sources to
evaluate accuracy. Your costs and results may vary.
Dear Facilitators, this guide is meant to serve as a resource in conjunction with the slide deck to
deliver the program in a standard manner. It includes different sections, such as the activities guide,
remote learning, and suggested reading material, which provide tips and techniques to optimally
deliver the program content. Please note that this guide need not be followed as is. Since you are
the one who knows your students best, you are free to modify the content and make any necessary
changes to suit the needs of your students.

Lesson Title: Module 11: AI Project Cycle applied to Common Trade Applications
Approach: Facilitator-led Interactive Session

Summary:
Students will work on the previously introduced industry-relevant applications using Python coding.
Students will deepen their knowledge of the AI Project Cycle across diverse industry-relevant
applications.
Students will be working on the previously introduced AI applications, with specific models:
- Predictive Maintenance
- Recommendation System
- Viral Post Prediction
- Employee Attrition Prediction
- Insurance Fraud Detection
- Quality Assurance System

Learning Outcomes:
Students will be able to use AI models to solve various industry applications using Python.

Pre-requisites:
1. Module 9 - Introduction to Programming using Python
2. Module 10 – Python functions & packages (Numpy, Pandas, Scikit Learn)

Key concepts:
1. How AI solves real-world problems
2. Confidence in selecting an AI model for a particular problem

Material used:
1. [Slides] Module 11.pptx.
2. [Notebook - Facilitator] Module 11 (Predictive Maintenance).ipynb
3. [Notebook - Student] Module 11 (Predictive Maintenance).ipynb
4. [Dataset]_Module11_Train_(Maintenance).csv
5. [Dataset]_Module11_Test_(Maintenance).csv
6. [Notebook - Facilitator] Module 11 (Recommendation System).ipynb
7. [Notebook - Student] Module 11 (Recommendation System).ipynb
8. [Dataset]_Module11_(Recommendation).csv
9. [Notebook - Facilitator] Module 11 (Viral Post Prediction).ipynb
10. [Notebook - Student] Module 11 (Viral Post Prediction).ipynb
11. [Dataset]_Module11_ (Viral).csv
12. [Notebook - Facilitator] Module 11 (Employee Attrition Prediction).ipynb
13. [Notebook - Student] Module 11 (Employee Attrition Prediction).ipynb
14. [Dataset]_Module11_Train_(Employee).csv
15. [Dataset]_Module11_Test_(Employee).csv
16. [Notebook - Facilitator] Module 11 (Insurance Fraud Detection).ipynb
17. [Notebook - Student] Module 11 (Insurance Fraud Detection).ipynb
18. [Dataset]_Module11_(Insurance).csv
19. [Notebook - Facilitator] Module 11 (Quality Assurance System).ipynb
20. [Notebook - Student] Module 11 (Quality Assurance System).ipynb
21. [Dataset]_Module11_(QualityAssuranceSystem)_Crack.json

AI Ethics issues discussed:


Nil

Application to real life scenario [e.g. social issues, industry relevance, etc.]:
1. AI Predictive Maintenance for Machines using Linear Regression
2. Shopping Basket Recommendation System using KNN model
3. Social Media Virality Prediction using K-Means
4. Employee Attrition Rate Prediction using Linear Regression
5. Insurance Fraud Detection using Random Forest
6. Image Outlier Detection using Artificial Neural Network
1. Lesson Overview
Note: The lesson overview gives an overall vision of how the sessions are timed for optimal
delivery of content. It is to be noted that the session timings are not to be strictly followed and
are only an estimate. You can deliver the sessions at your own pace in accordance with the class.
No | Time | Activity | Description | Objective
1 | 10 mins | Recap | Recap of previous module. | Reiterate the highlights of the previous module.
2 | 25 + 15 mins | Setting up Anaconda; Discussion on AI usage | Facilitator gives instructions on how to download Anaconda and set up Jupyter notebook (25 mins), followed by a group discussion on the common trade applications of AI (15 mins). | Help students set up the Python coding environment and involve them in the AI discussion session.
3 | 50 + 40 + 40 mins | Explore Data Visualization; Explore AI Models; Explore evaluation methods | Brief introduction to data visualization (50 mins), to supervised and unsupervised learning (40 mins), and to evaluation methods followed by practice materials (40 mins). | Brush up data visualization and model evaluation concepts.
4 | 210 mins | Trade applications: Linear Regression | AI Predictive Maintenance using Linear Regression. | Learn a trade application of AI.
5 | 210 mins | Trade applications: KNN | Shopping Basket Recommendation System using the KNN model. | Learn a trade application of AI.
6 | 210 mins | Trade applications: K-Means | Social Media Virality Prediction using K-Means. | Learn a trade application of AI.
7 | 210 mins | Trade applications: Linear Regression | Employee Attrition Rate Prediction using Linear Regression. | Learn a trade application of AI.
8 | 210 mins | Trade applications: Random Forest | Insurance Fraud Detection using Random Forest. | Learn a trade application of AI.
9 | 210 mins | Trade applications: ANN | Image Outlier Detection using Artificial Neural Network. | Learn a trade application of AI.
2. Session Preparation
Slides: [Slides] Module 11
Logistics: For a class of 40

Item | Quantity
Laptops | 40
3. Activities Guide
Note: The activities guide lays down sample guidelines on how to interact with the students and
deliver the contents in the slide deck. You are welcome to modify any of the slide notes and
suggestions for better delivery of content.
[Slide 1 to 3] Reiterate the highlights of previous module [10 mins]
The purpose of this section is to reiterate what the student has previously learned.

[Slide 1]
Welcome to the AI for Future Workforce program. My name is XXX and I am the facilitator for
this program.
[Slide 2]
Legal Disclaimer.

[Slide 3]
For today’s lesson, we will focus on these areas:
How do we set up Jupyter notebook in an Anaconda environment?
A basic recap of machine learning concepts
Trade applications of AI
[Slide 4 to 12] How do you set up Jupyter Notebook and how is AI used commonly [40 mins]
The purpose of this section is to help students set up the Python environment and start using
Jupyter notebooks.

[Slide 4]
Now that we’re ready, let’s set up Jupyter. For the initial part of the lesson, we will talk about
“installing Anaconda and setting up Jupyter notebook”.

[Slide 5]
To download Anaconda, first go to
www.anaconda.com/products/individual and click “Download”. Then click on the 64-Bit
Graphical Installer, and you are good to go.

[Slide 6]
After Anaconda is downloaded properly, go ahead and open it. You will see a window like this.

[Slide 7]
Click on Launch in the Jupyter Notebook section.

[Slide 8]
Wait for it. The Jupyter notebook will open in your default browser.

[Slide 9]
Now, click on New, then on Python 3, and you will see a notebook being created.

[Slide 10]
A new Jupyter notebook looks like this.

[Slide 11]
SAY:
Ok everyone, divide yourselves into 4 teams. What are some common trade applications of AI
that you can think of?
(Divide the whole class into 4 groups, as it is a group activity.)
(Urge them to think about AI applications.)
[Slide 12]
Here are some of the possible applications. We can use AI in predictive maintenance of
appliances, in recommendation systems for shopping, and in detecting insurance fraud. For more
applications, you can check out Kaggle!
https://www.kaggle.com/competitions
[Slide 13 to 40] How do you visualize data and evaluate models [130 mins]
The purpose of this section is to help students learn about AI models, data visualization, and
evaluation methods.

[Slide 13]
Let’s dive into data visualization and practice generating the graphs in Python.

[Slide 14]
Data visualization is important for us to be able to identify outliers or errors.

[Slide 15]
Example of what a scatter plot looks like.

[Slide 16]
Example of what a bar chart looks like.

[Slide 17]
Example of what a histogram looks like.

[Slide 18]
Example of what a box plot looks like.
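
For facilitators who want a quick live demo, here is a minimal sketch (not taken from the module
notebooks) that generates all four plot types with matplotlib, using made-up data:

```python
# Minimal sketch: the four plot types from slides 15-18, on made-up data.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2 * x + rng.normal(size=100)

fig, axes = plt.subplots(2, 2, figsize=(8, 6))
axes[0, 0].scatter(x, y)                    # scatter plot: relationship between two variables
axes[0, 0].set_title("Scatter plot")
axes[0, 1].bar(["A", "B", "C"], [5, 3, 7])  # bar chart: comparing categories
axes[0, 1].set_title("Bar chart")
axes[1, 0].hist(x, bins=15)                 # histogram: distribution of one variable
axes[1, 0].set_title("Histogram")
axes[1, 1].boxplot(y)                       # box plot: median, quartiles, outliers
axes[1, 1].set_title("Box plot")
plt.tight_layout()
plt.show()
```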

[Slide 19]
Let’s revise what types of AI models are there.

[Slide 20]
Machine Learning techniques can be split into Supervised, Unsupervised, and Reinforcement
Learning. This diagram shows the types of applications that each algorithm can be used for.

[Slide 21]
Do you remember what supervised learning is?
(Urge them to respond.)
[Slide 22]
The goal of supervised learning is to find specific relationships or structures in the input data that
allow us to effectively produce correct output data. For supervised learning, the data must be
labelled. The algorithms in supervised learning predict the output from the input data. There are
two types of supervised learning algorithms. The first type is classification, which maps input
data to output labels. The second type is regression, which maps input data to a continuous
output.

[Slide 23]
Do you remember what unsupervised learning is?
(Urge them to respond.)

[Slide 24]
Unsupervised learning is very useful in exploratory analysis because it can automatically identify
structure in data. For example, if an analyst were trying to segment consumers, unsupervised
clustering methods would be a great starting point for their analysis. In situations where it is
either impossible or impractical for a human to propose trends in the data, unsupervised learning
can provide initial insights that can then be used to test individual hypotheses.
In unsupervised learning, the data is unlabelled. We can group the algorithms in unsupervised
learning into clustering and dimensionality reduction. Clustering algorithms learn relationships
between individual data points, and dimensionality reduction algorithms are used to represent
data using fewer columns or features.

[Slide 25]
You will often be using machine learning models for your own purposes. In order to understand
how your models are performing, you need evaluation methods. Evaluation methods tell you
how good your models are and how accurately they predict.
How do we measure the performance of our model?

[Slide 26]
Firstly, we will go through several terms that are very important to the evaluation process.
In this example, we are trying to predict forest fire based on weather data.
There are 2 possible results that the prediction algorithm might give us. It may predict that
there’s a fire, or it may predict that there’s no fire.
[Slide 27]
True Positives (TP): Predicted yes (detector says that there is a fire), and in fact is yes (there is a
fire).
This is a good result. Forest firefighters will be called and they will fight the fire.

[Slide 28]
True Negatives (TN): Predicted no (detector says that there is no fire), and in fact is no (there is
no fire).
This is good, no fire is happening and everyone can relax.

[Slide 29]
False Negatives (FN): Predicted no (detector says that there is no fire), but in fact is yes (there is
a fire).
This result is very dangerous! Imagine noticing the fire only when it is too late!

[Slide 30]
False Positives (FP): Predicted yes (detector says that there is a fire), but in fact is no (there is no
fire).
This too is an inaccurate result. Firefighters will be dispatched only to find that there is no fire.
However, this is not as dangerous as the false negative case, as shown previously.

[Slide 31]
The result of the comparison between the prediction and reality can be recorded in what we call
the confusion matrix.
A confusion matrix allows us to understand the prediction results. It is not an evaluation metric.
True Positives (TP): Predicted yes (detector says that there is a fire), and in fact is yes (there is a
fire).
True Negatives (TN): Predicted no (detector says that there is no fire), and in fact is no (there is
no fire).
False Positives (FP): Predicted yes (detector says that there is a fire), but in fact is no (there is no
fire).
False Negatives (FN): Predicted no (detector says that there is no fire), but in fact is yes (there is
a fire).
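
Laid out as a 2x2 table for the fire example, the matrix looks like this (a generic layout for
reference; the slide's own matrix may arrange the axes differently):

```
                   Predicted: fire   Predicted: no fire
Actual: fire       TP                FN
Actual: no fire    FP                TN
```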
[Slide 32]
Accuracy is the percentage of correct predictions out of all the observations:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
This means, given a set of predictions and observations, what is the percentage of the predictions
that are TRUE?
Look at the formula carefully. The numerator is made up of all the predictions that are true. The
denominator is made up of the sum of all predictions.
Do you think high accuracy is equivalent to good performance of a model?
What do you think is a reasonable percentage of a model with good performance?
Urge them to respond.

[Slide 33]
A high accuracy score does not mean that the model is a good one.
Imagine that there’s a 2 percent probability that there is a forest fire in a year.
We can create a model that always predicts that there is NO fire.
It will be 98% correct, but will it be usable? Probably not!
Maybe it is more useful to calculate the percentage of cases that are predicted as positive that are
actually positive.

[Slide 34]
Precision is the percentage of cases predicted as positive that are actually positive: Precision = TP / (TP + FP).
Imagine: Every time our model predicts that there will be fire (positive), the firefighter will
check for the fire.
It might turn out that there’s fire OR there is no fire.
What is a common term for FP? When there’s a fire alarm while there’s no fire?
[Student’s response]
Yes, this is called false alarm.
Precision is important.
Imagine if there’s low precision. What will happen? The firefighter may get complacent. There
are too many false alarms!
Remember the ‘boy who cried wolf?’
[Slide 35]
Recall measures the fraction of positive cases that are correctly identified: Recall = TP / (TP + FN).
In other words, how many times does the machine predict there’s a fire when there’s indeed a
fire?
We want to get this as high as possible. Remember the False negative?
This is the case when the machine predicts that there is no fire, when there’s fire!
FN and FP, which one will you favour?

[Slide 36]
Let’s do some practice questions.

[Slide 37]
Normal example. Calculate the accuracy.

[Slide 38]
Calculate accuracy, precision, and recall. Example where accuracy may not be the best metric.

[Slide 39]
Calculate TP, TN, FN, FP. Example where precision may not be the best metric.

[Slide 40]
Calculate the accuracy, TP, TN, FP, FN, precision, and recall. Example where recall may not be
the best metric.
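
If you want to verify the answers live, here is a minimal sketch (with made-up labels, not the
numbers from the slides) of how scikit-learn computes the same quantities:

```python
# Made-up ground truth and predictions: 1 = fire, 0 = no fire.
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)
print("Accuracy: ", accuracy_score(y_true, y_pred))   # (TP + TN) / all predictions
print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("Recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN)
```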
[Slide 41 to 54] How do you apply AI to real world trade applications [210 mins]
The purpose of this section is to introduce students to trade applications of AI and teach them
how to use linear regression to do predictive maintenance.

[Slide 41]
Artificial intelligence is commonly used in various spheres to automate processes, gather
insights, and speed things up. You will use Python to study interesting use cases of artificial
intelligence. We will now look at some of the common trade applications of AI.

[Slide 42]
The first use case that you will look at is AI Predictive Maintenance using linear
regression.

[Slide 43]
Let’s do an activity to understand linear regression.

[Slide 44]
Now we will go to http://www.shodor.org/interactivate/activities/Regression/
Plot any three points you like, for instance (1,2), (2,3), and (3,4).
Click on display line of best fit.
You can see a red line through the points.
Discussion:
Points can be added as pairs like (1,2), (2,3).
What is the line of best fit?
The line of best fit is the straight line that comes as close as possible to all the points you entered.
Great! We have seen how linear regression works.
[Direct the students to visit:
http://www.shodor.org/interactivate/activities/Regression/]
[Slide 45]
We will be working with aircraft sensor data, obtained from GitHub. The text files, provided by
Microsoft, contain simulated aircraft engine run-to-failure events, operational settings, and
measurements from 21 sensors. It is assumed that the engine’s progressive degradation pattern is
reflected in its sensor measurements.

[Slide 46]
To do this exercise, let’s open our Jupyter notebook. The title of the notebook is [Notebook -
Student] Module 11 (Predictive Maintenance).ipynb.
Let us import the libraries first.

[Slide 47]
Now, let’s load the dataset. The read_csv function from pandas is used to load csv files that
contain data. Here we load the file
[Dataset]_Module11_Train_(Maintenance).csv

[Slide 48]
We can obtain summary statistics such as the mean, standard deviation, and quartiles (the 50%
quartile is the median) of variables in the dataset using the describe function. These values help
us understand how the dataset is distributed and how to deal with the data (especially if the data
is imbalanced).
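
As a reference, a minimal sketch of these two steps might look like this (the file name follows
the materials list; the actual columns depend on the dataset):

```python
import pandas as pd

# Load the training data from the CSV file listed in the materials section.
train_df = pd.read_csv("[Dataset]_Module11_Train_(Maintenance).csv")
print(train_df.head())      # first five rows, to check the data loaded correctly
print(train_df.describe())  # count, mean, std, min, quartiles (50% = median), max
```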

[Slide 49]
Let’s plot the different features using a bar chart to view the distribution. The bars show the
deviation or spread of various features. If the length of the bar is high, the spread of the variable
is high. For instance, if a variable a is between 20 and 60 and variable b is between 10 and 100,
variable b has a larger spread.

[Slide 50]
Now, we would view the correlation matrix which shows relation between features. Pick any
square you like. If the color of the square is light, the corresponding features are highly related. If
the color of the square is dark, the corresponding features are less related.

[Slide 51]
A regression model will be built using some features of the dataset that we saw earlier. We will
choose a subset of the dataset and store the features in features_orig. X_train holds the features
from the dataset, y_train the corresponding labels, and X_test and y_test the test data.
[Slide 52]
linear_model.LinearRegression() creates the model; we train it by passing the training data and
training labels, X_train and y_train respectively, to its fit method. linreg.predict is used to predict
with the trained model. You can also see the root mean squared error and mean squared error,
which are used to measure how good the model is.
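
A minimal, self-contained sketch of this train/predict/evaluate flow, using synthetic stand-in
data rather than the maintenance dataset:

```python
import numpy as np
from sklearn import linear_model
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; in the notebook, X and y come from the maintenance dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([3.0, -1.0, 0.5, 2.0]) + rng.normal(scale=0.1, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linreg = linear_model.LinearRegression()
linreg.fit(X_train, y_train)     # train with the training data and labels
y_pred = linreg.predict(X_test)  # predict with the trained model

mse = mean_squared_error(y_test, y_pred)
print("MSE: ", mse)
print("RMSE:", np.sqrt(mse))     # root mean squared error
```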

[Slide 53]
The parameters of the model (setting2, setting1, s1, s2, etc.) are plotted here. You can see that
the red lines for setting2 and s6 are the longest, so they are the most important.
[Slide 54]
Now discuss what you have learnt about linear regression.
[Slide 55 to 68] How do you apply AI to real world trade applications [210 mins]
The purpose of this section is to introduce students to trade applications of AI and teach them
how to use KNN to do shopping basket recommendation.

[Slide 55]
Now, let us dive into an interesting use case. Most of you will have bought things from
e-commerce websites like Amazon. You would have noticed that they often recommend items
that you might buy, and sometimes you do buy them. Today we will build a shopping
recommendation system with the help of AI.

[Slide 56]
Now we will go to http://vision.stanford.edu/teaching/cs231n-demos/knn/
You can see how KNN works.
Discussion:
- White area: undefined. Need more data
- Why go through multiple k?
Great! We have seen how the K-Nearest Neighbors algorithm works.
[Direct students to the website: http://vision.stanford.edu/teaching/cs231n-demos/knn/]

[Slide 57]
K-Nearest Neighbor, or commonly called KNN, can be used in classification or regression
problems. It relies on the surrounding points or neighbors to determine its class or group. It
utilises the properties of the nearest points to decide how to classify unknown points. It is based
on the concept that similar data points should be close to each other.

[Slide 58]
For example, I want to predict the sweetness of fruit X with respect to the plant height. I have
some data, where green represents a fruit that is sweet and blue represents a fruit that is not
sweet. With k-nearest neighbor, if k=1, it will form a circle that fits one data point closest to X as
shown in the graph. Thus, it would predict that the fruit is not sweet. If k=2, you can see that
there is 1 vote for sweet and 1 for not sweet. If k=3, you will see that there are 2 green and 1 blue
data point, thus predicting that the fruit X is sweet.
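
A minimal sketch of this voting idea with scikit-learn's KNeighborsClassifier; the heights and
labels below are made up, so the predictions will not match the slide's picture exactly:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

heights = np.array([[1.0], [1.5], [2.0], [2.5], [3.0]])  # plant heights (made up)
sweet   = np.array([0, 0, 1, 1, 1])                      # 0 = not sweet, 1 = sweet

for k in (1, 3):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(heights, sweet)
    # The vote of the k nearest data points decides the class of fruit X.
    print(f"k={k}: prediction for height 1.8 ->", knn.predict([[1.8]])[0])
```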

[Slide 59]
Artificial intelligence is commonly used in e-commerce applications. You will use Python to
develop a KNN model to recommend products to users based on their buying habits.
[Slide 60]
To do this exercise, let’s open our Jupyter notebook. The title of the notebook is [Notebook -
Student] Module 11 (Recommendation System).ipynb.
We would import the libraries here. Numpy for numerical calculations, scikit learn for AI and
matplotlib for plotting.

[Slide 61]
Now, let’s load the dataset. The read_csv function from pandas is used to load csv files that
contain data. Here we load the file [Dataset]_Module11_ (Recommendation).csv

[Slide 62]
We can obtain summary statistics such as the mean, standard deviation, and quartiles (the 50%
quartile is the median) of variables in the dataset using the describe function. These values help
us understand how the dataset is distributed and how to deal with the data (especially if the data
is imbalanced).

[Slide 63]
Sometimes we do not need the whole dataset for calculation as it may be bulky. In those cases,
we may select a part of the dataset and work with that. Here we are dropping the timestamp
column from the dataset and working with the resulting dataset.

[Slide 64]
We will now visualize the data using quantiles. Quantiles divide a distribution into equal parts.
For example, percentiles are quantiles that divide a distribution into 100 equal parts.

[Slide 65]
The final dataset is obtained based on popularity ratings. We are selecting only the products for
which the number of ratings is more than 50. The resulting dataset is new_df. train_test_split is
used to divide the data into train and test sets. Here the dataset is divided such that the training
data is 70% and the test data is 30%.

[Slide 66]
Now comes the prediction part. We are choosing the top 10000 entries in the dataset for
prediction.

[Slide 67]
We fit the data to our model here.
[Slide 68]
Finally, we can print recommendations here. For instance, the customer who bought the item
“B00000K135” is recommended items 7214047977, 9575871979, etc.
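
For reference, here is a hedged sketch of the item-based KNN recommendation idea; the
notebook's exact column names and steps may differ:

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors

# Made-up ratings; the real dataset has userId / productId / rating columns of its own.
ratings = pd.DataFrame({
    "userId":    ["u1", "u1", "u2", "u2", "u3", "u3", "u3"],
    "productId": ["A",  "B",  "A",  "C",  "B",  "C",  "D"],
    "rating":    [5,    4,    4,    5,    5,    3,    4],
})

# Items become rows, users become columns; missing ratings are filled with 0.
item_matrix = ratings.pivot_table(index="productId", columns="userId",
                                  values="rating", fill_value=0)

knn = NearestNeighbors(n_neighbors=3, metric="cosine")
knn.fit(item_matrix.values)

# Items similar to product "A" are its nearest neighbours by rating pattern;
# the first neighbour is "A" itself, so we skip it.
distances, indices = knn.kneighbors(item_matrix.loc[["A"]].values)
print("Similar to A:", [item_matrix.index[i] for i in indices[0][1:]])
```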
[Slide 69 to 75] How do you apply AI to real world trade applications [210 mins]
The purpose of this section is to introduce students to trade applications of AI and teach them
how to use K-Means to do social media virality prediction.

[Slide 69]
Our third use case is predicting which social media posts will go viral, with the use of AI.

[Slide 70]
K-Means is a simple algorithm that divides a dataset into groups. Objects in the same group
(called a cluster) are more similar (in some sense) to each other than to those in other groups
(clusters). The data on the left has not been grouped into clusters. After applying K-Means, the
data is divided into 3 clusters colored red, green, and blue. We will see how social media virality
can be predicted using K-Means algorithm.

[Slide 71]
To do this exercise, let’s open our Jupyter notebook. The title of the notebook is [Notebook -
Student] Module 11 (Viral Post Prediction).ipynb.
We would import the libraries here. Numpy for numerical calculations, scikit learn for AI and
matplotlib for plotting.

[Slide 72]
Now, let’s load the dataset. The read_csv function from pandas is used to load csv files that
contain data. We load the file [Dataset]_Module11_ (Viral).csv
Here, the url column contains the names of the websites from which the articles were collected,
and we won’t need them for grouping the articles. We will need all the columns that contain
numbers. So, we drop the url column from the dataset.

[Slide 73]
The idea is to divide the set of articles into groups. Articles within a single group will have a
similar likelihood of going viral.
Now, let’s apply the K-Means algorithm. The k here means the number of groups we are
dividing the dataset into. If k is 3, we will divide the set of Mashable articles into 3 groups. Here
in this slide, we tried to see how the results look if we take 4 clusters. The results are clearly not
good, as the yellow and the purple clusters are overlapping. Clustering is good if the clusters can
be separated from each other.
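
A minimal sketch of this clustering-and-plotting step, using synthetic stand-in data so that it
runs on its own:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Four synthetic blobs stand in for two numeric article features.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(loc=c, scale=0.6, size=(75, 2))
                    for c in ((0, 0), (4, 0), (0, 4), (4, 4))])

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)  # cluster index (0-3) for every article

plt.scatter(X[:, 0], X[:, 1], c=labels)  # colour each point by its cluster
plt.scatter(*kmeans.cluster_centers_.T, marker="x", color="red")  # centroids
plt.show()
```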
[Slide 74]
Since we did not get good results earlier, we are trying a smaller dataset here. We took only the
first 2 columns of the dataset.
We are applying K-Means here with 3 clusters. The clustering results are quite good here, as we
can see the clusters clearly: the violet cluster, the yellow, and the blue. All the articles in the
violet cluster have a similar probability of going viral, and we can say this for the other clusters
as well.

[Slide 75]
Let us see another example here. We take the first 10 columns of the dataset and apply K-Means
with 6 clusters.

The clustering results are quite good here too, as we can see the clusters separately.
[Slide 76 to 82] How do you apply AI to real world trade applications [210 mins]
The purpose of this section is to introduce students to trade applications of AI and teach them
how to predict employee attrition rate using linear regression.

[Slide 76]
This use case is predicting the employee attrition rate, that is, how often employees leave a
company, using linear regression.

[Slide 77]
To do this exercise, let’s open our Jupyter notebook. The title of the notebook is [Notebook -
Student] Module 11 (Employee Attrition Prediction).ipynb.
We would import the libraries here. Numpy for numerical calculations, scikit learn for AI and
matplotlib for plotting.

[Slide 78]
Now, let’s load the dataset. The read_csv function from pandas is used to load csv files that
contain data. We load the files [Dataset]_Module11_Train_(Employee).csv and
[Dataset]_Module11_Test_(Employee).csv

[Slide 79]
Let us see how the data is distributed. We can visualize the mean, max, and min values of each
column alongside other characteristics. We can obtain summary statistics such as the mean,
standard deviation, and quartiles (the 50% quartile is the median) of variables in the dataset using
the describe function. These values help us understand how the dataset is distributed and how to
deal with the data (especially if the data is imbalanced).

[Slide 80]
We would visualize the data using a correlation matrix now. It says how related different features
are. If the value of a square is close to 0, the 2 corresponding features are not related at all.

[Slide 81]
Here you will see how training is done. The label is Attrition_rate, which will be predicted by our
model. The features array contains all the columns that will be used for training. X has the
training data and Y has the labels. The train and test data are split here in the ratio 55:45.

[Slide 82]
Since the training is done, we will do the prediction now using the trained model. df.predict() is
used to predict. We also print the first 5 employee IDs and their attrition rates.
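
A minimal sketch of the split/train/predict flow described above, with synthetic stand-in data;
the real feature columns and variable names come from the module notebook:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Stand-in employee features and attrition rates.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = rng.uniform(0, 1, size=100)

# 55% train / 45% test, as on the slide.
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.45,
                                                    random_state=0)

model = LinearRegression().fit(X_train, Y_train)
predictions = model.predict(X_test)
print(predictions[:5])  # e.g. the first five predicted attrition rates
```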
[Slide 83 to 90] How do you apply AI to real world trade applications [210 mins]
The purpose of this section is to introduce students to trade applications of AI and teach them
how to detect insurance fraud using random forests.

[Slide 83]
Our fifth use case is detecting fraud in insurance cases with the use of AI.

[Slide 84]
Random forest is a classification algorithm that combines the decisions from many decision trees
to decide the class of a data point.

[Slide 85]
To do this exercise, let’s open our Jupyter notebook. The title of the notebook is [Notebook -
Student] Module 11 (Insurance Fraud Detection).ipynb.
We would import the libraries here. Numpy for numerical calculations, scikit learn for AI and
matplotlib for plotting.

[Slide 86]
Now, let’s load the dataset. The read_csv function from pandas is used to load csv files that
contain data. We load the file [Dataset]_Module11_ (Insurance).csv

[Slide 87]
We can obtain summary statistics such as the mean, standard deviation, and quartiles (the 50%
quartile is the median) of variables in the dataset using the describe function. These values help
us understand how the dataset is distributed and how to deal with the data (especially if the data
is imbalanced).

[Slide 88]
A correlation matrix is a table showing correlation coefficients between variables. Each cell in
the table shows the correlation between two variables. A correlation matrix is used to summarize
data, as an input into a more advanced analysis, and as a diagnostic for advanced analyses.

[Slide 89]
The fraud_reported column is the target column, or the output label. So, we create a new dataset
named features without the fraud_reported column. The data is then split into train and test sets,
and the training data is fit to the classifier.
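
A minimal, self-contained sketch of this flow; the column names below are placeholders, not
the notebook's real ones:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Placeholder data; the real dataset has many more columns and rows.
df = pd.DataFrame({
    "claim_amount":   [1000, 250, 4000, 300, 7000, 120, 5000, 90],
    "num_claims":     [1, 2, 5, 1, 6, 1, 4, 1],
    "fraud_reported": [0, 0, 1, 0, 1, 0, 1, 0],
})

features = df.drop(columns=["fraud_reported"])  # everything except the target
labels = df["fraud_reported"]

X_train, X_test, y_train, y_test = train_test_split(features, labels,
                                                    random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)  # fit the classifier on the training data
print("Test accuracy:", clf.score(X_test, y_test))
```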
[Slide 90]
Here we can see how the model performed: the accuracy, precision, and recall scores (as we
studied earlier).
[Slide 91 to 105] How do you apply AI to real world trade applications [210 mins]
The purpose of this section is to introduce students to trade applications of AI and teach them
how to detect image outliers using an Artificial Neural Network.

[Slide 91]
Our final use case is image outlier detection. We will have lots of images of walls, some of
which have cracks. We need to use AI to find which walls have cracks, and we will visualize
them.

[Slide 92]
An ANN (Artificial Neural Network) is a component of artificial intelligence that is meant to
simulate the functioning of the human brain, so that computers can replicate human-like
thinking.

[Slide 93]
Let’s do an activity to understand artificial neural networks. Please go to:
https://playground.tensorflow.org/ and play around with the network for 5 minutes
[Direct the students to go to https://playground.tensorflow.org/ and do the activity.]

[Slide 94]
Think about how neural networks behave and try to appreciate how they think. Neural networks
can replicate human behaviors by using a few blocks of code. Isn’t that fascinating?

[Slide 95]
To do this exercise, let’s open our Jupyter notebook. The title of the notebook is [Notebook -
Student] Module 11 (Quality Assurance System).ipynb.
We would import the libraries here. Numpy for numerical calculations, scikit learn for AI and
matplotlib for plotting.

[Slide 96]
Now, let’s load the dataset. The train and test sets are loaded from the folders
[Dataset]Module11TrainReducedQualityAssuranceSystem and
[Dataset]Module11TestQualityAssuranceSystem.
[Slide 97]
This snippet of code is used to download the images. The images are in the respective training
and test folders and are used from those locations.

[Slide 98]
The dataset is split into training and test sets. Here, a VGG model is used to train the network.
VGG is a convolutional neural network model used for classifying images.
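
For facilitators who want to see the shape of such a model, here is a hedged sketch of using a
pre-trained VGG16 as a feature extractor for a binary crack / no-crack classifier; the module
notebook's exact architecture, image size, and training setup may differ:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Pre-trained VGG16 convolutional base, without its original classification head.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # keep the pre-trained features fixed

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # probability that the wall is cracked
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```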

[Slide 99]
Training is done here using the VGG model. It will take some time to train. Just run it and wait.

[Slide 100]
Here we are measuring the performance of the model. We can see how well it performs.

[Slide 101]
In this module, we learnt how AI can be used in the insurance industry, in shopping and
e-commerce recommendations, and more. Think about the usage of AI and find some other areas
where AI can be used to solve problems.
[Slide 102] Reflection
Let us recap what we have learnt so far.
Today we have learnt how AI can help us solve common trade problems by seeing AI
applications in the aircraft industry, the insurance industry, etc. Think about how you can use AI
to impact lives around you.

[Slide 103]
Thank you for your contribution today. I look forward to the next session with all of you.

[Slide 104] (Optional Slides)


These optional slides cover an alternative setup: instead of installing Anaconda locally, you can
run the module notebooks in Google Colab.

[Slide 105]
Go to https://colab.research.google.com/

[Slide 106]
Click on File and then on New notebook.
It will ask you to sign in with your Google account.

[Slide 107]
A new notebook in the colab environment looks like this.

[Slide 108]
To run a code block, click on any of the cells. For example, here the code block says: “import
pandas as pd”. Press Shift+Enter to run this block, or click the play icon.
[Slide 109]
To upload any file, click on the folder icon inside the highlighted black box here. This will allow
you to upload the required CSV files. After you click this icon, follow the next slide.

[Slide 110]
Click on the icon shown here inside the black box. A dialog will then appear for file upload. The
notebooks provided can be run as they are in the Colab environment.
4. Troubleshooting Tips

Common Mistakes/Issues

Mistakes/Issues | Possible Reasons | Resolution
1. Installing Python libraries | Different Python environments can be set up differently | Try installing using both options: the pip installer and the conda installer
5. Remote Learning

Activity | Key Differences | Remote Facilitation Tips
Group discussion and creating demo | Instead of doing it in groups, the students will have to do the task individually. It is completely possible to do the tasks individually. | More time is required for the students to practice and ask questions.
Coding notebooks | Code must be shared with the students, and there must be a method to facilitate the coding experience remotely. | The notebooks can be run on Google Colab for a remote experience without any trouble. Only the input CSV files need to be uploaded to the Colab environment. The instructions for setting up the Colab environment can be found in the slide deck (Slides 105-111).
6. Blended Learning for Students

Activity | Description | Recommended Time
Readings | Students to read more on using machine learning algorithms: https://towardsdatascience.com/do-you-know-how-to-choose-the-right-machine-learning-algorithm-among-7-different-types-295d0b0c7f60 | 15 minutes
Readings | Students to read more about algorithms: https://ml-cheatsheet.readthedocs.io/en/latest/ | 45 minutes
7. Suggested Reading for Facilitators

Material | Suggested Links | Recommended Usage
Difference between Supervised Learning and Unsupervised Learning | https://www.youtube.com/watch?v=kE5QZ8G_78c | General viewing on types of AI models
But what is a neural network? | https://www.youtube.com/watch?v=aircAruvnKk | Overview of Neural Networks
8. Bibliography
1. [Regression]. (2020). Interactivate. https://shodor.org/interactivate/activities/Regression
2. [K-Nearest Neighbors Demo]. (2020). K-Nearest Neighbors Demo. https://vision.stanford.edu/teaching/cs231n-demos/knn/
3. [Tensorflow Playground]. (2020). Tinker With a Neural Network Right Here in Your Browser. https://playground.tensorflow.org/
