ML Lab Notes


NumPy – supports large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays.

Precision and Recall – model evaluation metrics.

Precision refers to the percentage of returned results that are relevant:

precision = true positives / (true positives + false positives)

Recall refers to the percentage of total relevant results correctly classified by the algorithm:

recall = true positives / (true positives + false negatives)
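The two formulas above can be computed directly from confusion-matrix counts; the counts here are made up for illustration.

```python
# Hypothetical confusion counts from a binary classifier
tp, fp, fn = 8, 2, 4

precision = tp / (tp + fp)  # fraction of predicted positives that are correct
recall = tp / (tp + fn)     # fraction of actual positives that were found

print(precision)  # 0.8
print(recall)     # ~0.667
```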

Sensitivity
measures the proportion of actual positive cases that are predicted as positive (equivalent to recall)

MLP (Multilayer Perceptron)
a class of feedforward artificial neural network
it uses a supervised learning technique called backpropagation for training

Activation function
decides whether a neuron should be activated or not (i.e. whether the neuron's input is important to the network's prediction), using simple mathematical operations.
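Two common activation functions, sketched as plain functions (the choice of sigmoid and ReLU here is illustrative):

```python
import math

def sigmoid(x):
    # squashes any real input into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    # passes positive inputs through unchanged, zeroes out negatives
    return max(0.0, x)

print(sigmoid(0.0))  # 0.5
print(relu(-3.0))    # 0.0
```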

Linear and Logistic Regression

linear regression is used for solving regression problems; least-squares estimation is used to fit the model parameters
logistic regression is used for solving classification problems (typically binary classification); maximum-likelihood estimation is used to fit the model parameters.
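A minimal least-squares fit of a line y = w·x + b, using the closed-form slope formula on made-up points:

```python
# Made-up points lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# least-squares slope: covariance(x, y) / variance(x)
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - w * mean_x
print(w, b)  # 2.0 1.0
```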

Backpropagation
used to update the weights of a neural network based on the error obtained in the previous epoch.

Forward Propagation
in neural networks we forward propagate to get the output, then compare it with the true value to obtain the error.

Perceptron
a single-layer neural network
Weights
show the strength of a particular node

Bias
allows the activation function curve to be shifted up or down

Unified Learning Algorithm / Perceptron Learning Rule

an algorithm that learns the optimal weight coefficients.
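The perceptron learning rule can be sketched on the AND function (which is linearly separable); the learning rate and epoch count here are arbitrary choices:

```python
# Perceptron learning rule on AND, with a step activation
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = [0.0, 0.0]
b = 0.0
lr = 0.1

for _ in range(20):  # epochs
    for (x1, x2), target in data:
        out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
        err = target - out
        # update rule: w <- w + lr * error * input
        w[0] += lr * err * x1
        w[1] += lr * err * x2
        b += lr * err

preds = [1 if w[0] * x1 + w[1] * x2 + b > 0 else 0 for (x1, x2), _ in data]
print(preds)  # [0, 0, 0, 1]
```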

PCA (Principal Component Analysis)

used in unsupervised learning
a dimensionality reduction method

geometrically, the principal components represent the directions of the data that explain a maximal amount of variance.

Steps –
Standardisation
transform all variables to one scale, which reduces bias among variables:
Z = (value − mean) / standard deviation

Covariance matrix computation
used to see the relationships between variables

compute the eigenvectors and eigenvalues of the covariance matrix to identify the principal components

Feature vector
choose whether to keep all components or discard the ones of lesser significance

recast the data along the principal component axes
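The steps above can be sketched on a tiny 2-D dataset; since the covariance matrix is 2×2, its eigenvalues come from the closed-form formula for symmetric matrices (the data points are illustrative, and only mean-centring is done for the standardisation step):

```python
import math

data = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
        (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]
n = len(data)

# 1. Centre each variable
mx = sum(x for x, _ in data) / n
my = sum(y for _, y in data) / n
centred = [(x - mx, y - my) for x, y in data]

# 2. Covariance matrix [[cxx, cxy], [cxy, cyy]]
cxx = sum(x * x for x, _ in centred) / (n - 1)
cyy = sum(y * y for _, y in centred) / (n - 1)
cxy = sum(x * y for x, y in centred) / (n - 1)

# 3. Eigenvalues of a symmetric 2x2 matrix (closed form)
m = (cxx + cyy) / 2
d = math.sqrt((cxx - cyy) ** 2 / 4 + cxy ** 2)
lam1, lam2 = m + d, m - d  # lam1 = variance explained by the first PC

print(lam1 > lam2)  # True: the first PC explains the most variance
```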

LDA (Linear Discriminant Analysis)

used in supervised classification problems
a dimensionality reduction technique

2 criteria used by LDA to create the new axis –

maximise the distance between the means of the 2 classes
minimise the variation within each class.

Statistical Testing
determines whether a random variable follows the null hypothesis or the alternate hypothesis
Null hypothesis – there is no significant difference between the sample and the population, or among different populations.
Hypothesis testing
evaluates the evidence the data provides against a hypothesis

T-test
used to compare the means of two given samples

F-test
used to compare the variances of two samples.
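A sketch of the equal-variance two-sample t statistic, using made-up samples and the standard library only:

```python
import math
import statistics as st

# Hypothetical samples; the t statistic compares the difference in
# means against the pooled spread of both samples.
a = [5.1, 4.9, 5.4, 5.0, 5.2]
b = [4.6, 4.8, 4.5, 4.9, 4.7]

na, nb = len(a), len(b)
# pooled variance combines both samples' spread
sp2 = ((na - 1) * st.variance(a) + (nb - 1) * st.variance(b)) / (na + nb - 2)
t = (st.mean(a) - st.mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
print(round(t, 2))  # 3.77 — large |t| suggests the means differ
```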

Pruning
a data compression technique that reduces the size of a decision tree by removing sections of the tree that are non-critical or redundant for classifying instances.

Reduced-error pruning

partition the training data into "grow" and "validation" sets and build a complete tree from the grow set

for each non-leaf node in the tree, temporarily prune the tree below it and then test the accuracy of the hypothesis on the validation set.
If the accuracy increases, permanently prune the node.

Post-pruning
grow the full tree and then remove nodes

Pre-pruning
stop growing when a data split is not statistically significant.

ID3 (Iterative Dichotomiser)

a classification algorithm
follows a greedy approach to building a decision tree, selecting the best attribute, i.e. the one that yields –
maximum information gain
minimum entropy
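Entropy and information gain, the two quantities ID3 greedily optimises, can be sketched as:

```python
import math
from collections import Counter

def entropy(labels):
    # H = -sum(p * log2(p)) over the class proportions
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, splits):
    # entropy of the parent minus the weighted entropy of the child splits
    n = len(parent)
    return entropy(parent) - sum(len(s) / n * entropy(s) for s in splits)

labels = ["yes", "yes", "no", "no"]
print(entropy(labels))  # 1.0 (maximal for two balanced classes)
print(information_gain(labels, [["yes", "yes"], ["no", "no"]]))  # 1.0 (pure split)
```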

Batch Gradient Descent

uses all training samples for one forward pass and then adjusts the weights
good for small training sets

Stochastic Gradient Descent

uses one randomly picked sample per forward pass and then adjusts the weights.
Good for large training sets.
Takes less time per update than batch gradient descent.
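The two update styles can be contrasted on a one-parameter toy problem (fitting w in y = w·x by minimising squared error; the data and learning rate are made up):

```python
import random

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # true w = 2

def batch_step(w, lr=0.05):
    # average the gradient over ALL samples, then do one update
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

def sgd_step(w, lr=0.05):
    # gradient from ONE randomly picked sample
    x, y = random.choice(data)
    return w - lr * 2 * (w * x - y) * x

w_batch = 0.0
for _ in range(100):
    w_batch = batch_step(w_batch)

w_sgd = 0.0
for _ in range(300):
    w_sgd = sgd_step(w_sgd)

print(round(w_batch, 3), round(w_sgd, 3))  # both converge near 2.0
```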

Entropy
measures uncertainty, impurity, and information content
Why is a decision tree supervised? Because it is trained on labelled examples.

Naive Bayes assumptions

each feature makes an independent and equal contribution to the outcome

SVM
a supervised model used for classification and regression problems
works well when there is a clear margin of separation between classes
effective in high-dimensional spaces

not well suited to large datasets.

Kernel trick
a method in which non-linear data is projected onto a higher-dimensional space so as to make it easier to classify, i.e. so that it can be linearly divided by a plane.
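The idea can be sketched in one dimension: points that no single threshold can separate become separable after the lift x → (x, x²), which is the mapping a polynomial kernel computes implicitly (data and threshold here are made up):

```python
# 1-D points: the positive class sits on the OUTSIDE, so no single
# threshold on x separates the classes.
xs = [-2.0, -1.0, 1.0, 2.0]
labels = [1, 0, 0, 1]

# Lift into 2-D via x -> (x, x^2)
lifted = [(x, x * x) for x in xs]

# In the lifted space, the horizontal line x2 = 2.5 separates the classes
preds = [1 if x2 > 2.5 else 0 for _, x2 in lifted]
print(preds == labels)  # True
```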

Kernel Function
a method that takes data as input and transforms it into the required form for processing.

Reinforcement Learning
a training method based on rewarding desired behaviours and/or punishing undesired ones.

the learning agent interprets the environment, takes actions, and learns through trial and error.

Poisson Distribution
measures the probability of a given number of events happening in a specified time period
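The Poisson probability mass function P(X = k) = λᵏ·e⁻ᵏ/k! (with λ the expected number of events per interval) is short enough to write out directly; the λ and k values below are illustrative:

```python
import math

def poisson_pmf(k, lam):
    # P(X = k) = lam^k * e^(-lam) / k!
    return lam ** k * math.exp(-lam) / math.factorial(k)

# e.g. probability of exactly 3 events when 2 are expected per interval
p = poisson_pmf(3, 2.0)
print(round(p, 4))  # 0.1804
```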

Random Forest
supervised learning
used for both regression and classification
it builds multiple decision trees and merges them together to get a more accurate and stable prediction

Bagging
used to reduce variance within a noisy dataset
a homogeneous weak-learners model in which the learners are trained independently of each other, in parallel, and combined by averaging.
Boosting
also a homogeneous weak-learners model, but it works differently from bagging: the learners are trained sequentially and adaptively to improve the model's predictions.
