ML Bu

Overview
Examples
• Handwriting recognition learning problem
  • Task T: Recognizing and classifying handwritten words within images
  • Performance P: Percent of words correctly classified
  • Training experience E: A dataset of handwritten words with given classifications
• A robot driving learning problem
  • Task T: Driving on highways using vision sensors
  • Performance P: Average distance travelled before an error
  • Training experience E: A sequence of images and steering commands recorded while observing a human driver
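To make the performance measure P concrete, here is a minimal sketch of "percent of words correctly classified". The function name `percent_correct` and the five labels are made up for illustration; they are not part of the original examples.

```python
def percent_correct(predicted, actual):
    """Performance P: percentage of predictions that match the given labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one example")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * hits / len(actual)

# Hypothetical labels for five handwritten words (assumed data).
actual = ["cat", "dog", "sun", "map", "pen"]
predicted = ["cat", "dig", "sun", "map", "pan"]

score = percent_correct(predicted, actual)
print(score)  # 3 of 5 words match, so this prints 60.0
```

The same pattern applies to the driving problem, where P would be an average distance instead of a percentage.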
 8,9, 23, 24, 26, 41

Classification of Machine Learning
Supervised learning
Unsupervised learning
Semi-supervised learning
Reinforcement learning
Classification
Regression
Clustering
Challenges of Machine Learning
• Training data.
• Poor quality of data.
• Irrelevant features.
• Imperfections in the algorithm when data grows.
• Overfitting and underfitting.

Training Data
For a child to learn what an apple is, all it takes is for you to point to an apple and say "apple" repeatedly. After that, the child can recognize all sorts of apples.
Machine learning is not at that level yet; most algorithms need a lot of data to function properly. Even a simple task typically needs thousands of examples, and advanced tasks like image or speech recognition may need hundreds of thousands (lakhs) or even millions of examples.

Poor Quality of Data
Data plays a significant role in the machine learning process. One of the significant issues that machine learning professionals face is the absence of good-quality data. Unclean and noisy data can make the whole process extremely exhausting.
If the training data has lots of errors, outliers, and noise, the model will struggle to detect the underlying pattern and will not perform well.
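As a rough illustration of cleaning noisy data, the sketch below drops values that fall outside 1.5 times the interquartile range, which is one common rule of thumb and not the only option. The helper name `drop_outliers` and the readings are assumptions made up for this example.

```python
import statistics

def drop_outliers(values, factor=1.5):
    """Keep only values within factor * IQR of the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - factor * spread, q3 + factor * spread
    return [v for v in values if low <= v <= high]

# Made-up readings (assumed data); 95 is an obvious recording error.
readings = [10, 12, 11, 13, 12, 11, 95, 10, 12]

cleaned = drop_outliers(readings)
print(cleaned)  # the value 95 is removed, the other eight readings remain
```

Whether a point is an error or a genuine rare case is a judgment call, so filters like this are usually checked by hand before the data is used for training.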

Irrelevant Features
Training data should contain mostly relevant features and few or no irrelevant ones.
Much of the credit for a successful machine learning project goes to coming up with a good set of features to train on. This process is often referred to as feature engineering, and it includes feature selection, feature extraction, and creating new features, each of which is a topic in its own right.
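One simple way to approach feature selection is to rank features by how strongly they correlate with the target. The sketch below uses the absolute Pearson correlation as the score; this is only one possible filter, and the feature names, values, and the helper `rank_features` are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (neither may be constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def rank_features(features, target):
    """Sort feature names by absolute correlation with the target, strongest first."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Made-up data (assumed): one relevant feature and one irrelevant feature.
features = {
    "shoe_size": [3, 1, 4, 1, 5],
    "hours_studied": [2, 4, 6, 8, 10],
}
exam_score = [1, 2, 3, 4, 5]

ranked = rank_features(features, exam_score)
print(ranked)  # "hours_studied" is listed first
```

Correlation only captures linear relationships, so a low score does not always mean a feature is useless.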

Imperfections in the Algorithm When Data Grows
A model that is the best choice today may become inaccurate as new data arrives and may need to be readjusted or retrained. Regular monitoring and maintenance are therefore needed to keep the algorithm working.
This is one of the most exhausting issues faced by machine learning professionals.

Overfitting
Say one day you are walking down a street to buy something and a dog comes out of nowhere. You offer him something to eat, but instead of eating he starts barking and chasing you, though somehow you get away safely. After this incident, you might think that no dog is worth treating nicely.
This kind of overgeneralization is something we humans do much of the time, and unfortunately a machine learning model does the same if we are not careful.
In machine learning, we call this overfitting, i.e. the model performs well on the training data but fails to generalize to new data. Overfitting happens when the model is too complex relative to the amount and noisiness of the training data.
Things we can do to overcome this problem:
• Simplify the model by selecting one with fewer parameters.
• Reduce the number of attributes in the training data.
• Constrain the model.
• Gather more training data.
• Reduce the noise in the training data.
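The effect can be sketched with a tiny k-nearest-neighbour classifier. With k = 1 the model memorizes every training point, including two mislabelled ones, so it scores perfectly on the training data but poorly on new points; with k = 3 it is more constrained and generalizes better. The data, the true rule (label 1 when x is at least 5), and the helper names are all assumptions made up for this illustration.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Majority label among the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, data, k):
    """Fraction of points in data whose predicted label matches the given label."""
    hits = sum(1 for x, label in data if knn_predict(train, x, k) == label)
    return hits / len(data)

# Assumed training set: the points at x=3 and x=8 carry wrong (noisy) labels.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (6.5, 1), (7, 1), (8, 0), (9, 1)]
# Assumed clean test set that follows the true rule.
test = [(2.9, 0), (3.2, 0), (1.5, 0), (7.9, 1), (8.3, 1), (6.6, 1)]

for k in (1, 3):
    print(k, accuracy(train, train, k), accuracy(train, test, k))
# k=1: perfect on the training set, wrong on most test points (overfitting)
# k=3: misses the two noisy training labels, but gets every test point right
```

Raising k here plays the role of "constraining the model" from the list above: the classifier can no longer follow every individual noisy point.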

Underfitting
Underfitting occurs when the model is unable to establish an accurate relationship between the input and output variables. It happens when the model is too simple to learn the underlying structure of the data. For example, a linear model fitted to data with a strongly non-linear relationship will underfit, and its predictions will be inaccurate even on the training set.
Things we can do to overcome this problem:
• Select a more advanced model, one with more parameters.
• Train on better and more relevant features.
• Reduce the constraints on the model.
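A minimal sketch of underfitting, assuming made-up data where y = x²: a straight line fitted to x cannot follow the curve and has a large error even on the training set, while the same straight-line model trained on the better feature x² fits exactly. The helper names `fit_line` and `mse` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mse(xs, ys, slope, intercept):
    """Mean squared error of the fitted line on the given points."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Assumed data: a purely quadratic relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

# Too simple: a straight line in x.
slope, intercept = fit_line(xs, ys)
mse_linear = mse(xs, ys, slope, intercept)

# Better feature: the same model, trained on x squared instead of x.
squares = [x * x for x in xs]
slope_sq, intercept_sq = fit_line(squares, ys)
mse_squared = mse(squares, ys, slope_sq, intercept_sq)

print(mse_linear, mse_squared)  # large training error vs. essentially zero
```

The high error on the training data itself is what separates underfitting from overfitting, where the training error is low and only the error on new data is high.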

Tip . . .
Machine learning is set to bring a major transformation in technology. It is one of the most rapidly growing technologies, used in medical diagnosis, speech recognition, robotic training, product recommendations, video surveillance, and more. This continuously evolving domain offers immense job satisfaction, excellent opportunities, global exposure, and high salaries. It is a high-risk, high-return technology.
Session 3
Popular Machine Learning Algorithms
•Linear regression
•Logistic regression
•Decision tree
•SVM algorithm
•Naive Bayes algorithm
•KNN algorithm
•K-means
•Random forest algorithm
•Artificial neural networks (ANNs)
