
Ch. 5: Basics of Machine Learning

Introduction, Types of Learning (supervised learning, unsupervised learning, and reinforcement learning), Hypothesis Space, Inductive Bias, Evaluation and Cross Validation, Linear Regression, Decision Trees, Learning Decision Trees, K-nearest Neighbor, Collaborative Filtering, Overfitting, Methods to remove the overfitting problem

Hypothesis in Machine Learning (ML):

A hypothesis is one of the commonly used statistical concepts in Machine Learning. It is used specifically in supervised machine learning, where an ML model learns a function that best maps inputs to their corresponding outputs with the help of an available dataset.

Hypothesis space (H):

The hypothesis space is defined as the set of all possible legal hypotheses; hence it is also known as the hypothesis set. Supervised machine learning algorithms search this space for the best hypothesis to describe the target function, i.e., the one that best maps inputs to outputs.

The hypothesis space is often constrained by the framing of the problem, the choice of model, and the choice of model configuration.
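
A minimal sketch of this idea follows, assuming scikit-learn and NumPy (libraries not named in these notes): a degree-1 polynomial pipeline has a hypothesis space containing only straight lines, while a degree-3 pipeline admits cubic curves, so the richer space fits a curved target better.

```python
# A minimal sketch, assuming scikit-learn and NumPy, of how model configuration
# constrains the hypothesis space: degree 1 admits only straight lines,
# degree 3 admits cubic curves.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

X = np.linspace(0, 1, 20).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel()  # target lies outside the linear hypothesis space

for degree in (1, 3):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X, y)
    # Higher degree => richer hypothesis space => closer fit to the sine curve
    print(f"degree {degree}: R^2 on training data = {model.score(X, y):.3f}")
```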

Inductive Bias (Note: same as the assumptions covered in MLE):

 The inductive bias (also known as learning bias) of a learning algorithm is the set of assumptions that the learner uses to predict outputs for inputs it has not encountered.
 In machine learning, one aims to construct algorithms that are able to learn to predict a certain target output.
 To achieve this, the learning algorithm is presented with training examples that demonstrate the intended relation between input and output values.
 The learner is then supposed to approximate the correct output, even for examples that were not shown during training.
 Without additional assumptions, this problem cannot be solved, since unseen situations might have arbitrary output values.
 The necessary assumptions about the nature of the target function are subsumed in the phrase inductive bias.
 For example:
 Linear Regression assumes linearity, normality, and homoscedasticity (equal variance) of the errors.
 Naive Bayes assumes conditional independence between the independent features given the class (Gaussian Naive Bayes additionally assumes the data is normally distributed).
 K-NN assumes that data points close together are similar; hence a new/unknown data point is assigned to the class of the majority of its neighbors.
 Support Vector Machines assume that the margin between classes should be as large as possible.
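
To make these biases concrete, here is a minimal sketch (assuming scikit-learn and NumPy, which these notes do not otherwise reference) where Linear Regression and K-NN, trained on identical data, give different predictions purely because of their different assumptions.

```python
# A minimal sketch, assuming scikit-learn and NumPy, of two inductive biases on
# the same data: Linear Regression assumes a globally linear trend, while K-NN
# assumes that nearby points have similar outputs.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([1.0, 4.0, 9.0, 16.0])  # quadratic relation, unknown to both learners

linear = LinearRegression().fit(X, y)               # bias: output is linear in the input
knn = KNeighborsRegressor(n_neighbors=2).fit(X, y)  # bias: neighbors have similar outputs

x_new = np.array([[2.5]])
print("Linear Regression:", linear.predict(x_new))  # 7.5, from the fitted global line
print("K-NN (k=2):       ", knn.predict(x_new))     # 6.5, the average of (4 + 9) / 2
```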

Evaluation and Cross Validation

 Common strategies are Leave-One-Out Cross Validation (LOOCV) and k-Fold Cross Validation; both are sketched below.
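
A minimal sketch of both strategies on a toy regression task, assuming scikit-learn's model_selection utilities (an implementation choice, not part of the notes):

```python
# A minimal sketch, assuming scikit-learn and NumPy, of k-Fold and
# Leave-One-Out Cross Validation on a toy regression task.
import numpy as np
from sklearn.model_selection import cross_val_score, KFold, LeaveOneOut
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 3 * X.ravel() + rng.normal(0, 0.5, 10)  # noisy linear data

model = LinearRegression()

# k-Fold: split the data into k folds; train on k-1, evaluate on the held-out fold
kfold_scores = cross_val_score(model, X, y, cv=KFold(n_splits=5))

# LOOCV: each fold holds out exactly one sample (R^2 is undefined on a single
# point, so mean squared error is used for scoring instead)
loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut(),
                             scoring="neg_mean_squared_error")

print("5-fold R^2 scores:", kfold_scores)
print("LOOCV mean MSE:   ", -loo_scores.mean())
```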

Methods to remove the overfitting problem

There are several techniques to avoid overfitting in Machine Learning; the main ones are listed below.

1. Training With More Data
2. Removing Features
3. Early Stopping
4. Regularization (see the sketch after this list)
5. Ensembling
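
As one concrete case, the sketch below (assuming scikit-learn and NumPy) illustrates item 4, regularization: a Ridge penalty shrinks the coefficients of a high-degree polynomial model, restraining the wild oscillations typical of overfitting.

```python
# A minimal sketch, assuming scikit-learn and NumPy, of regularization:
# a Ridge penalty keeps the coefficients of a flexible model small,
# which yields a smoother, less overfit curve.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = np.linspace(0, 1, 15).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 15)  # noisy sine data

# Degree-10 polynomial: flexible enough to chase the noise
overfit = make_pipeline(PolynomialFeatures(10), LinearRegression()).fit(X, y)
ridge = make_pipeline(PolynomialFeatures(10), Ridge(alpha=1.0)).fit(X, y)

# Regularization keeps the learned coefficients small, hence the curve smooth
print("Largest |coef|, unregularized:", np.abs(overfit[-1].coef_).max())
print("Largest |coef|, Ridge:        ", np.abs(ridge[-1].coef_).max())
```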

Collaborative Filtering

 Collaborative filtering is used by most recommendation systems to find similar patterns in users' information; the technique can filter out items a user may like on the basis of the ratings or reactions of similar users.
 An example of collaborative filtering is predicting a particular user's rating of a movie based on that user's ratings for other movies and other users' ratings for all movies. This concept is widely used in recommending movies, news, applications, and many other items.
 Let's take one example to understand collaborative filtering better.
 Assume user U1 likes movies m1, m2, m4; user U2 likes movies m1, m3, m4; and user U3 likes movie m1.
 Our job is to recommend which new movie user U3 should watch next.
 Users U1, U2, and U3 all watch/like movie m1, so all three have similar taste. Users U1 and U2 both like/watch movie m4, so user U3 could also like movie m4, and we recommend movie m4. This is the flow of the logic, implemented in the sketch below.
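
A minimal sketch of this user-based logic in plain Python: the data mirrors the toy example above, and scoring unseen movies by the number of shared likes is an illustrative similarity choice, not a prescribed one.

```python
# A minimal sketch of user-based collaborative filtering on the toy data above:
# recommend to U3 the movies liked by users whose tastes overlap with U3's.
likes = {
    "U1": {"m1", "m2", "m4"},
    "U2": {"m1", "m3", "m4"},
    "U3": {"m1"},
}

def recommend(target, likes):
    """Rank movies the target has not seen by how many similar users liked them."""
    scores = {}
    for user, movies in likes.items():
        if user == target:
            continue
        overlap = len(likes[target] & movies)  # similarity = number of shared likes
        for movie in movies - likes[target]:   # candidate movies unseen by the target
            scores[movie] = scores.get(movie, 0) + overlap
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("U3", likes))  # m4 ranks first: both similar users liked it
```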

Ref: https://www.analyticsvidhya.com/blog/2022/02/introduction-to-collaborative-filtering/
