BCSE352E EDA CAT 2 Mod 1,2,5 PDF

BCSE352E- Essentials of
Data Analytics
1
Topics in Module-1
• Linear regression: simple linear regression
• Regression Modelling
• Correlation
• ANOVA
• Time Series Forecasting
• Autocorrelation
Regression predicts a numerical value, instead of a “class”.
[Figure: four task types: Classification, Clustering, Regression, Associative]
Module-1 Topic-1: Linear regression
What is Regression?
Regression analysis is a way of mathematically sorting out which variables do indeed have an impact.
• Regression analysis is a statistical method to model the relationship between a dependent (target) variable and one or more independent (predictor) variables.
• Regression analysis helps us to understand how the value of the dependent variable changes with one independent variable when the other independent variables are held fixed.
Classification Vs. Regression Module-1 Topic-1: Linear regression
• Regression algorithms predict a continuous numerical value. In some cases, the predicted value can be used to identify the linear relationship between the attributes.
• Classification algorithms predict the target class (Yes/No). If the trained model predicts one of two target classes, it is known as binary classification.
Different types of Regression
5
Assumptions of Regression analysis
• Linear regression with standard estimation techniques makes several assumptions about the independent variables and the dependent variable.
• Following is the list of major assumptions made by the linear regression model:
• Linearity: the linear regression model assumes that the dependent variable is a linear combination of the regression coefficients and the independent variables.
• Lack of perfect multicollinearity in the independent variables: multicollinearity makes it hard for linear regression models to estimate the relationship between the dependent variable and each independent variable, as correlated independent variables change simultaneously.
• Multicollinearity reduces the power of linear regression models to identify significantly important independent variables.
• Constant variance (homoscedasticity): different values of the dependent variable have the same variance in their errors, regardless of the values of the independent variables.
Regression analysis
The regression equation is given by
7
Solved Example-Finding the best fit line

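• The data consist of four samples:

X (works assigned)   Y (total hours)   XY   X²
1                    3                 3    1
2                    4                 8    4
3                    5                 15   9
4                    7                 28   16
Sum = 10             19                54   30

With n = 4, the slope a and the intercept b of the line y = ax + b are

$$a = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{4(54) - (10)(19)}{4(30) - (10)^2} = \frac{26}{20} = 1.3$$

$$b = \frac{\sum y \sum x^2 - \sum x \sum xy}{n\sum x^2 - (\sum x)^2} = \frac{(19)(30) - (10)(54)}{4(30) - (10)^2} = \frac{30}{20} = 1.5$$

Line of best fit: y = 1.3X + 1.5

[Figure: the four data points and the line of best fit, X axis from 1 to 4, Y axis from 2 to 8]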
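As a rough check of the arithmetic in this solved example, the sketch below recomputes the sums and applies the two least-squares formulas. The function name `fit_line` is illustrative and not part of the course material. For this data the slope formula (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) evaluates to 1.3 and the intercept formula (ΣyΣx² − ΣxΣxy) / (nΣx² − (Σx)²) to 1.5.

```python
# Least-squares fit of a straight line from the summary sums.
xs = [1, 2, 3, 4]   # works assigned (X)
ys = [3, 4, 5, 7]   # total hours (Y)

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2          # shared denominator
    slope = (n * sum_xy - sum_x * sum_y) / denom
    intercept = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    return slope, intercept

slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.1f}x + {intercept:.1f}")
```

A quick sanity check is that the residuals of a least-squares line with an intercept sum to zero, which holds for slope 1.3 and intercept 1.5 on this data.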
8
Advantages and Disadvantages of Linear Regression
Pros:
•Linear Regression is simple to implement.
•Less complexity compared to other algorithms.
•Linear Regression may lead to over-fitting but it can be avoided using
some dimensionality reduction techniques, regularization techniques, and
cross-validation.
Cons:
•Outliers affect this algorithm badly.
•It over-simplifies real-world problems by assuming a linear relationship among the variables, so it is a poor fit when the true relationship is non-linear.
9
Module-1 Topic-2: Regression Modelling
Best fit Regression analysis
• Aim of linear regression is to fit a straight line, ŷ = ax + b, to data that gives best prediction of y for any value of x.
• This will be the line that minimises distance between data and fitted line, i.e. the residuals.
• In ŷ = ax + b, a is the slope and b is the intercept.
• The objective is to find values of a and b that minimise Σ (y – ŷ)².
• ŷ = predicted value, y_i = true value, ε = residual error
• Residual (ε) = y – ŷ
• Sum of squares of residuals = Σ (y – ŷ)²
10
Mathematics behind Regression analysis
r = correlation coefficient of x and y
s_y = standard deviation of y
s_x = standard deviation of x
x̄, ȳ = means of x and y

Step-1: $a = r \frac{s_y}{s_x}$
Step-2: y = ax + b, so $b = \bar{y} - a\bar{x}$
Step-3: $b = \bar{y} - r \frac{s_y}{s_x} \bar{x}$
Step-4: $\hat{y} = ax + b = r \frac{s_y}{s_x} x + \bar{y} - r \frac{s_y}{s_x} \bar{x}$
Step-5: $\hat{y} = r \frac{s_y}{s_x} (x - \bar{x}) + \bar{y}$

• A low correlation coefficient gives a flatter slope (small value of a).
• Large spread of y, i.e. high standard deviation, results in a steeper slope (high value of a).
• Large spread of x, i.e. high standard deviation, results in a flatter slope (small value of a).
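The five steps above can be sketched in Python as follows. This is a minimal illustration, assuming sample standard deviations (n − 1 in the denominator); the ratio r·s_y/s_x comes out the same with population standard deviations. The function name is hypothetical, and the data is the works-assigned/total-hours sample from the earlier solved example.

```python
import math

def line_from_correlation(xs, ys):
    """Return (r, a, b) with slope a = r * s_y / s_x and intercept b = mean_y - a * mean_x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    r = cov / (s_x * s_y)            # correlation coefficient of x and y
    a = r * s_y / s_x                # Step-1: slope
    b = mean_y - a * mean_x          # Step-2 and Step-3: intercept
    return r, a, b

def predict(x, a, mean_x, mean_y):
    """Step-5 form of the fitted line: y_hat = a * (x - mean_x) + mean_y."""
    return a * (x - mean_x) + mean_y

r, a, b = line_from_correlation([1, 2, 3, 4], [3, 4, 5, 7])
print(r, a, b)
```

Step-4 and Step-5 describe the same line, so `predict(x, a, 2.5, 4.75)` and `a * x + b` agree for any x on this data.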
11
Regression
95% confidence band
If data points fall outside this band, outliers are present, i.e. some data points are predicted better or worse than others.
Outliers contain either very low or very high values in comparison to other observed values, which may hamper the result.
[Figure: line of best fit with its 95% confidence band; the error is the gap between the actual value and the predicted value]
Model Estimation and Evaluation
Module-1 Topic-3: Correlation
Correlation
• Both covariance and correlation measure linear relationships between
variables. Examples: relationship between height and weight of children and
relationship between speed and weight of cars, etc.
• Since covariance is affected by a change in scale, it can take values between
−∞ and ∞. However, the correlation coefficient always lies between -1 and 1,
and it can be used to make statements and compare correlations.
• When the correlation coefficient is positive, an increase in one variable results
in an increase in the other.
• When the correlation coefficient is negative, an increase in one variable results
in a decrease in the other (i.e. the change happens in the opposite direction).
• A zero correlation coefficient indicates there is no linear relationship between the two variables.
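To illustrate why covariance depends on scale while the correlation coefficient does not, here is a small sketch. The height and weight values are made up for the illustration, and the helper names are not from the slides.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

heights_cm = [110, 120, 130, 140, 150]      # made-up values
weights_kg = [19, 23, 26, 33, 38]           # made-up values
heights_mm = [h * 10 for h in heights_cm]   # same heights on a different scale

cov_cm = covariance(heights_cm, weights_kg)
cov_mm = covariance(heights_mm, weights_kg)
corr_cm = correlation(heights_cm, weights_kg)
corr_mm = correlation(heights_mm, weights_kg)
print(cov_cm, cov_mm)     # covariance grows tenfold with the change of unit
print(corr_cm, corr_mm)   # correlation is unchanged and stays within [-1, 1]
```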
20
Basics of Scattergrams
[Figure: three scattergrams of Y against X showing positive correlation, negative correlation, and no correlation]

Module-1 Topic-4: ANOVA
• ANOVA tests for differences among the means of populations by examining the amount of variation within each sample, relative to the amount of variation between the samples.
• Analysis of variance tests the hypothesis that the means of two or more populations are equal.
When to use Analysis of Variances (ANOVA)
How to use Analysis of Variances (ANOVA)
Analysis of Variances (ANOVA)
• Assumptions made:
(i) Samples are independent and randomly drawn from respective populations,
(ii) Populations are normally distributed, and
(iii) Variances of the population are equal
• Test statistic: F0 = MSB / MSW (mean square between groups over mean square within groups)
• Sum-of-Squares-TReatments (SSTR)
• Sum-of-Squares-Error (SSE)
• Sum-of-Squares-Total (SST)
• SST gives the overall variation in the data, SSTR gives the part of the variation due to differences among the groups, and SSE gives the part of the variation due to error.
• Note that SST = SSTR + SSE
ANOVA-Solved Example
Three different techniques, namely medication, exercises and special diet, are randomly assigned to individuals diagnosed with high blood pressure to lower their blood pressure. After four weeks the reduction in each person’s blood pressure is recorded. Test at the 5% level (level of significance α = 0.05) whether there is a significant difference in mean reduction of blood pressure among the three techniques.
Step 1 : Hypotheses
Null Hypothesis: H0: µ1 = µ2 = µ3
That is, there is no significant difference among the three groups in the average reduction in blood pressure.
Alternative Hypothesis: H1: µi ≠ µj for at least one pair (i, j); i, j = 1, 2, 3; i ≠ j.
That is, there is a significant difference in the average reduction in blood pressure in at least one pair of treatments.
26
Test statistic F0 = MST / MSE
27
28
Step 6 : Critical value
f(2, 12),0.05 = 3.89.
Step 7 : Decision
As F0 = 9.17 > f(2, 12),0.05 = 3.89, the null hypothesis is rejected.
Hence, we conclude that there exists a significant difference in the average reduction of blood pressure in at least one pair of techniques.
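The computation behind the test statistic can be sketched as below. The readings are illustrative assumptions (the data table from the slides is not reproduced in this text); they were chosen as three groups of five, which gives (2, 12) degrees of freedom, and for these values F0 comes out at about 9.17. The function name `one_way_anova` is hypothetical.

```python
def one_way_anova(groups):
    """Return (SSTR, SSE, SST, F0) for a list of samples, one list per group."""
    all_values = [v for g in groups for v in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    # Between-group variation: each group mean against the grand mean.
    sstr = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: each value against its own group mean.
    sse = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    # Total variation: each value against the grand mean.
    sst = sum((v - grand_mean) ** 2 for v in all_values)
    mst = sstr / (k - 1)   # mean square for treatments, k - 1 degrees of freedom
    mse = sse / (n - k)    # mean square for error, n - k degrees of freedom
    return sstr, sse, sst, mst / mse

# Assumed blood-pressure reductions, not taken from the slides.
medication = [10, 12, 9, 15, 13]
exercise = [6, 8, 3, 0, 2]
diet = [5, 9, 12, 8, 4]

sstr, sse, sst, f0 = one_way_anova([medication, exercise, diet])
critical_value = 3.89   # f(2, 12) at the 5% level, as quoted in Step 6
print(round(f0, 2), "reject H0" if f0 > critical_value else "do not reject H0")
```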
29
Time Series Forecasting Module-1 Topic-5: Time Series Forecasting
• Time series forecasting means to forecast or to predict the future value over a period of time.
• It entails developing models based on previous data and applying them to make observations and guide future
strategic decisions.
• Key factors to be considered while using time series forecasting are
• Volume of data available — more data is often more helpful, offering greater opportunity for exploratory
data analysis, model testing and tuning, and model fidelity.
• Required time horizon of predictions — shorter time horizons are often easier to predict — with higher
confidence — than longer ones.
• Forecast update frequency — Forecasts might need to be updated frequently over time or might need to be
made once and remain static (updating forecasts as new information becomes available often results in
more accurate predictions).
• Forecast temporal frequency — Often forecasts can be made at lower or higher frequencies, which allows
harnessing downsampling and up-sampling of data (this in turn can offer benefits while modeling).
30
Time Series Forecasting Module-1 Topic-5: Time Series Forecasting
1) Seasonality: a repeating pattern in a time series, where the output value peaks in certain periods (for example, particular months in a given domain) compared to the others.
2) Trend: a sustained increase or decrease in the time series, for example the value of an organization or its sales rising or falling over a period of time.
3) Unexpected Events: dynamic changes in an organization or in the market that cannot be captured by the model.
31
Module-1 Topic-5: Time Series Forecasting
Time Series Forecasting-ARIMA Model
•Autoregressive integrated moving average (ARIMA) models predict future values based on past values.
•ARIMA makes use of lagged moving averages to smooth time series data.
•They are widely used in technical analysis to forecast future security prices.
•Autoregressive models implicitly assume that the future will resemble the past.
•For ARIMA models, a standard notation would be ARIMA with p, d, and q, where integer values substitute for the parameters to indicate the type of ARIMA model used. The parameters can be defined as:
•p: the number of lag observations in the model, also known as the lag order.
•d: the number of times the raw observations are differenced; also known as the degree of differencing.
•q: the size of the moving average window, also known as the order of the moving average.
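Of the three parameters, d is the easiest to show by hand. The sketch below only illustrates differencing (the "integrated" part); it is not a full ARIMA fit, and the function name and the sample series are made up for the example.

```python
def difference(series, d=1):
    """Apply first-order differencing d times: each value minus the one before it."""
    values = list(series)
    for _ in range(d):
        values = [later - earlier for earlier, later in zip(values, values[1:])]
    return values

sales = [1, 4, 9, 16, 25]          # made-up series with a growing trend
once = difference(sales, d=1)      # [3, 5, 7, 9]: still trending upward
twice = difference(sales, d=2)     # [2, 2, 2]: the trend is gone
print(once, twice)
```

Each round of differencing shortens the series by one observation, which is why d is usually kept small.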
32
Autocorrelation Module-1 Topic-6: Autocorrelation
• Autocorrelation, also known as serial correlation, refers to the degree of correlation of the same variable between two successive time intervals.
• It measures how the lagged version of the value of a variable is related to the original version of it in a time series.
• The value of autocorrelation ranges from -1 to 1.
• A value between -1 and 0 represents negative autocorrelation. A value between 0 and 1 represents
positive autocorrelation.
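A minimal sketch of the lag-k autocorrelation follows, assuming the common estimator that divides the lagged covariance by the total variance of the series. The function name and both sample series are illustrative.

```python
def autocorrelation(series, lag=1):
    """Correlation between a series and a copy of itself shifted by `lag` steps."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((v - mean) ** 2 for v in series)
    lagged_cov = sum((series[t] - mean) * (series[t + lag] - mean)
                     for t in range(n - lag))
    return lagged_cov / variance

trending = list(range(1, 11))      # steadily rising values
alternating = [1, -1] * 5          # values that flip sign every step

r_trending = autocorrelation(trending, lag=1)
r_alternating = autocorrelation(alternating, lag=1)
print(r_trending, r_alternating)   # positive for the trend, negative for the flip-flop
```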
[Figure: three plots showing negative autocorrelation, positive autocorrelation, and no autocorrelation]
34
Module-1 Summary
Summary
• Regression: Regression analysis is a set of statistical processes for estimating the relationships between a dependent variable and one or more independent variables.
• Correlation: Measures the linear relationship between 2 variables
• ANOVA: Compares the means of 2 or more populations (uses the F-statistic)
• Time Series Forecasting: Analysis and prediction of time-based data
• Autocorrelation: Measures the linear relationship between lagged values of the same variable
35
Topics in Module-2 Classification
Classification
• Logistic Regression
• Decision Trees
• Naïve Bayes-conditional probability
• Random Forest
• SVM Classifier
36
Module-2 Introduction to Classification
What is Classification?
Classification segregates vast quantities of data into discrete values, i.e. distinct values like 0/1, True/False, or a pre-defined output label class.
• The Classification algorithm is a Supervised Learning technique that is used to identify the
category of new observations on the basis of training data.
• In Classification, a program learns from the given dataset or observations and then classifies new
observation into a number of classes or groups.
37
Types of Classification?
• Types of Classifiers: The algorithm which implements the classification on a dataset is known as a classifier. There are two types of classifiers:
• Binary Classifier: If the classification problem has only two possible outcomes, then it is called a Binary Classifier. Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc.
• Multi-class Classifier: If a classification problem has more than two outcomes, then it is called a Multi-class Classifier. Examples: classification of types of crops, classification of types of music.
• Types of learners: In the classification problems, there are two types of learners:
• Lazy Learners: A lazy learner first stores the training dataset and waits until it receives the test dataset. Classification is then done on the basis of the most related data stored in the training dataset. It takes less time in training but more time for predictions.
• Example: K-NN algorithm, Case-based reasoning
• Eager Learners: Eager learners develop a classification model based on a training dataset before receiving a test dataset. Unlike lazy learners, an eager learner takes more time in learning and less time in prediction.
• Example: Decision Trees, Naïve Bayes, ANN.
38
Types of Classification?
• Types of classification:
• Supervised: The set of possible classes is known in advance.
• Unsupervised: Set of possible classes is not known. After classification we can try to assign a
name to that class. Unsupervised classification is called clustering.
• Types of Classification algorithms: Classification algorithms can be further divided into two main categories:
• Linear Models
• Logistic Regression
• Support Vector Machines
• Non-linear Models
• K-Nearest Neighbours
• Kernel SVM
• Naïve Bayes
• Decision Tree Classification
• Random Forest Classification
39
Evaluation of Classification model?
• Log loss or cross entropy loss:
• It is used for evaluating the performance of a classifier whose output is a probability value between 0 and 1.
• For a good binary classification model, the value of log loss should be near 0.
• The value of log loss increases as the predicted probability deviates from the actual label.
• A lower log loss indicates a better model.
• Confusion Matrix:
• The confusion matrix provides us a matrix/table as output and describes the performance of the
model.
• It is also known as the error matrix.
• The matrix consists of predictions result in a summarized form, which has a total number of correct
predictions and incorrect predictions.
• AUC-ROC curve:
• ROC stands for Receiver Operating Characteristic and AUC stands for Area Under the Curve.
• It is a graph that shows the performance of the classification model at different thresholds.
• To visualize the performance of a multi-class classification model, we use the AUC-ROC curve, drawn for one class against the rest.
• The ROC curve is plotted with TPR (True Positive Rate) on the Y-axis and FPR (False Positive Rate) on the X-axis.
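The three evaluation tools above can be sketched in a few lines. This is an illustration under simple assumptions: binary labels coded as 0/1, a fixed 0.5 threshold, and made-up labels and probabilities. The helper names are not from the slides.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Average cross-entropy; probabilities are clipped so log(0) never occurs."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for labels coded as 0 and 1."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    return tp, fp, tn, fn

def tpr_fpr(tp, fp, tn, fn):
    """One point of the ROC curve: (true positive rate, false positive rate)."""
    return tp / (tp + fn), fp / (fp + tn)

y_true = [1, 0, 1, 1, 0]              # made-up labels
y_prob = [0.9, 0.2, 0.6, 0.4, 0.7]    # made-up predicted probabilities
y_pred = [int(p >= 0.5) for p in y_prob]

counts = confusion_counts(y_true, y_pred)
print(log_loss(y_true, y_prob), counts, tpr_fpr(*counts))
```

Repeating the last three lines for several thresholds instead of only 0.5 gives the (FPR, TPR) points that make up the ROC curve.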
Module-2 Topic-1: Logistic Regression
Logistic Regression
• Logistic regression is an extension of simple linear regression for cases where the dependent variable is dichotomous (binary) in nature, so simple linear regression cannot be used.
• Logistic regression is the statistical technique used to model the relationship between one or more predictors (independent variables) and a predicted variable (the dependent variable) where the dependent variable is binary.
• Logistic regression estimates the probability of an event occurring, based on a given dataset of independent variables.
• Since the outcome is a probability, the dependent variable is bounded between 0 and 1.
[Figure: Logistic Regression, an illustration]
Logistic Regression
• Logistic regression estimates the probability of a certain event occurring by modelling the logarithm of the odds (the log-odds) of that event.
• The coefficients are estimated by Maximum likelihood estimation (MLE); the resulting model is nonlinear in the probability.
• The odds are the probability of occurrence of a particular event over the probability of non-occurrence, i.e. the probability of success divided by the probability of failure. The odds ratio compares the odds of two groups and provides an estimate of the magnitude of the relationship between binary variables.
Logistic Regression
43
Example #1
Example #2
44
What Logistic Regression predicts?
• Probability of Y occurring given known values for X(s).
• In Logistic Regression, the Dependent Variable is transformed into the natural log of the odds.
This is called logit (short for logistic probability unit).
• The probabilities, which range between 0.0 and 1.0, are transformed into odds that range between 0 and infinity. Going the other way, the predicted probability is a sigmoid function applied to a linear combination of the input features, giving a value in the range 0 to 1.
• If the probability for group membership in the modeled category is above some cut point (the default is 0.50), the subject is predicted to be a member of the modeled group. Example: defaults on their payment.
• If the probability is below the cut point, the subject is predicted to be a member of the other group. Example: does not default on their payment.
• For any given case, logistic regression computes the probability that a case
with a particular set of values for the independent variable is a member of
the modeled category.
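The logit transform, its inverse (the sigmoid) and the cut-point rule can be sketched as follows. The weights, bias and feature values are arbitrary placeholders, not fitted coefficients, and the function names are illustrative.

```python
import math

def sigmoid(z):
    """Map any real number (a log-odds value) to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Natural log of the odds p / (1 - p); the inverse of the sigmoid."""
    return math.log(p / (1.0 - p))

def predict_class(features, weights, bias, cut_point=0.5):
    """Return (probability, label) for one case using a linear combination of features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    probability = sigmoid(z)
    return probability, int(probability >= cut_point)

# Placeholder coefficients, chosen only to show both outcomes of the cut-point rule.
prob_a, label_a = predict_class([2.0], weights=[1.0], bias=-1.0)   # log-odds = 1
prob_b, label_b = predict_class([0.0], weights=[1.0], bias=-1.0)   # log-odds = -1
print(prob_a, label_a, prob_b, label_b)
```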
45
Logistic Regression
46
Assumptions with its explanation for Logistic Regression
• No outliers in the data. An outlier can be identified by analyzing the
independent variables
• No correlation (multi-collinearity) between the independent variables.
Measure how well the algorithm performs using
the weights on functions=
• Here G is the logistic (sigmoid) function. On the sigmoid curve, the values on the y-axis lie between 0 and 1, and the curve crosses the y-axis at 0.5.
• The classes can be divided into positive or negative. The output, which lies between 0 and 1, is interpreted as the probability of the positive class.
• The output of the hypothesis function is interpreted as positive if it is ≥ 0.5, otherwise negative.
• Loss Function:
47
Advantages and Disadvantages of Logistic Regression
48
Applications of Logistic Regression
49
Logistic Regression-Solved Example#1
A dataset consists of women and men Instagram users with a sample size of 1069. Let the probabilities of men and women using Instagram be P_men and P_women respectively. The sample proportion of women who are Instagram users is given as 61.08%, and the sample proportion for men is 43.98%. The difference is 0.170951, and the 95% confidence interval is (0.111429, 0.2292). Establish a logistic regression model that specifies the relationship between p and x.

$$\text{Odds} = \frac{P_0}{1 - P_0} = \frac{\text{success}}{\text{failure}}$$

Solution
Logistic regression equation for women: $\log\left(\frac{P_{women}}{1 - P_{women}}\right) = \beta_0 + \beta_1$
Logistic regression equation for men: $\log\left(\frac{P_{men}}{1 - P_{men}}\right) = \beta_0$
50
Logistic Regression-Solved Example#1 (Contd.)
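Odds for women $= \frac{P_0}{1 - P_0} = \frac{0.6108}{1 - 0.6108} = 1.5694$
Odds for men $= \frac{P_0}{1 - P_0} = \frac{0.4398}{1 - 0.4398} = 0.7851$
Log of odds for women $= \log\left(\frac{P_{women}}{1 - P_{women}}\right) = \log(1.5694) = 0.4507 = \beta_0 + \beta_1$
Log of odds for men $= \log\left(\frac{P_{men}}{1 - P_{men}}\right) = \log(0.7851) = -0.2419 = \beta_0$
Intercept $b_0 = -0.2419$
Slope $b_1$ = Log(odds for women) − Log(odds for men) = 0.4507 − (−0.2419) = 0.6926
Best fit regression equation: $y = b_0 + b_1 x = -0.2419 + 0.6926x$, where y is the log of the odds, x = 1 for women and x = 0 for men.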
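The numbers in this solved example can be reproduced with a few lines of Python. This is only a check of the arithmetic, assuming natural logarithms and the coding x = 1 for women, x = 0 for men implied by the two equations; the variable names are illustrative.

```python
import math

p_women = 0.6108   # sample proportion of women who use Instagram
p_men = 0.4398     # sample proportion of men who use Instagram

odds_women = p_women / (1 - p_women)
odds_men = p_men / (1 - p_men)

b0 = math.log(odds_men)            # intercept: log-odds for men (x = 0)
b1 = math.log(odds_women) - b0     # slope: difference of the two log-odds

def probability(x):
    """Invert the log-odds line b0 + b1 * x back to a probability."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

print(round(odds_women, 4), round(odds_men, 4))
print(round(b0, 4), round(b1, 4))
print(round(probability(1), 4), round(probability(0), 4))
```

Feeding x = 1 and x = 0 back through the fitted line returns the two sample proportions, which is a useful consistency check.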
51
Note: To decide the signs of b0 and b1 in the logistic regression line, use a scattergram. With positive correlation, b0 can have a negative coefficient and b1 a positive coefficient.
[Figure: three scattergrams of Y against X showing positive correlation, negative correlation, and no correlation]

Best fit regression line:
y=−𝒃𝟎 +𝒃𝟏 𝒙 + 𝒃𝟐 𝒙 + 𝒃𝟑 𝒙 + ⋯ y = 𝒃𝟎 +𝒃𝟏 𝒙 − 𝒃𝟐 𝒙𝒃𝟑 𝒙 − ⋯ y = −𝒃𝟎 +𝒃𝟏 𝒙 − 𝒃𝟐 𝒙𝒃𝟑 𝒙 + ⋯
52
Module-2 Topic-3: Naïve Bayes-conditional probability
An Example of Bayes Theorem

• Given:
• A doctor knows that meningitis causes stiff neck 50% of the time
• Prior probability of any patient having meningitis is 1/50,000
• Prior probability of any patient having stiff neck is 1/20
• If a patient has stiff neck, what’s the probability he/she has meningitis?
$$P(M \mid S) = \frac{P(S \mid M)\,P(M)}{P(S)} = \frac{0.5 \times 1/50000}{1/20} = 0.0002$$
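The same calculation as a short Python sketch; the function name is illustrative.

```python
def bayes_posterior(likelihood, prior, evidence):
    """Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E)."""
    return likelihood * prior / evidence

p_stiff_given_meningitis = 0.5     # P(S | M)
p_meningitis = 1 / 50000           # P(M)
p_stiff_neck = 1 / 20              # P(S)

p_meningitis_given_stiff = bayes_posterior(
    p_stiff_given_meningitis, p_meningitis, p_stiff_neck
)
print(p_meningitis_given_stiff)
```

Even though half of meningitis patients have a stiff neck, the posterior stays tiny because the prior 1/50,000 is so much smaller than the 1/20 chance of a stiff neck.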
53
Naïve Bayes Classification model
•Naïve Bayes Classifier is one of the simple and most effective Classification algorithms which helps
in building the fast machine learning models that can make quick predictions.
•It is a probabilistic classifier, which means it predicts on the basis of the probability of an object.
•Naïve: It is called Naïve because it assumes that the occurrence of a certain feature is independent of the occurrence of other features. For example, if a fruit is identified on the basis of color, shape, and taste, then a red, spherical, and sweet fruit is recognized as an apple. Each feature individually contributes to identifying it as an apple, without depending on the others.
•Bayes: It is called Bayes because it depends on the principle of Bayes' Theorem
•Bayes' theorem is also known as Bayes' Rule or Bayes' law, which is used to determine the probability
of a hypothesis with prior knowledge. It depends on the conditional probability.
•The formula for Bayes' theorem is given as:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
•Where,
•P(A) is Prior Probability: Probability of the hypothesis before observing the evidence.
•P(B) is Marginal Probability: Probability of the evidence.
•P(A|B) is Posterior probability: Probability of hypothesis A given the observed event B.
•P(B|A) is Likelihood probability: Probability of the evidence given that the hypothesis is true.
54
Naïve Bayes Classification model Solved Example#1
If the weather is sunny, then the Player should play or not?
55
Step-1 Frequency table for the Weather Conditions:
Step-2 Likelihood table weather condition:
56
Step-3 Applying Bayes Theorem
57
Naïve Bayes Classification model
58
Naïve Bayes Classification Model
Advantages of Naïve Bayes Classifier:
• Naïve Bayes is one of the fast and easy ML algorithms to predict a class of
datasets.
• It can be used for Binary as well as Multi-class Classifications.
• It performs well in Multi-class predictions as compared to the other Algorithms.
• It is the most popular choice for text classification problems.
Disadvantages of Naïve Bayes Classifier:
• Naive Bayes assumes that all features are independent or unrelated, so it cannot
learn the relationship between features.
Applications of Naïve Bayes Classifier:
• It is used for Credit Scoring.
• It is used in medical data classification.
• It can be used in real-time predictions because Naïve Bayes Classifier is an
eager learner.
• It is used in Text classification such as Spam filtering and Sentiment analysis
59
Summary of Naïve Bayes Classification Model
• Naïve Bayes algorithm is a supervised learning algorithm, which is based
on Bayes theorem and used for solving classification problems.
• Bayes’ rule can be turned into a classifier
• Maximum A Posteriori (MAP) hypothesis estimation incorporates prior
knowledge; Max Likelihood (ML) doesn’t
• Naive Bayes Classifier is a simple but effective Bayesian classifier for
vector data (i.e. data with several attributes) that assumes that attributes
are independent given the class.
• Bayesian classification is a generative approach to classification
• It is a probabilistic classifier, which means it predicts on the basis of the
probability of an object.
60
Module-2 Topic-5: SVM Classifier
Support Vector Machine (SVM)
• The goal of the SVM algorithm is to create the best line or decision boundary that
can segregate n-dimensional space into classes so that we can easily put the new
data point in the correct category in the future. This best decision boundary is
called a hyperplane.
61
Terminologies in Support Vector Machine (SVM)
• Hyperplane: There can be multiple
lines/decision boundaries to segregate the
classes in n-dimensional space, but we need
to find out the best decision boundary that
helps to classify the data points. This best
boundary is known as the hyperplane of
SVM.
• Dimensions of the hyperplane depend on the number of features present in the dataset. With 2 features, the hyperplane will be a straight line.
• Support Vectors: The data points or vectors that are the closest to the hyperplane and which affect the position of the hyperplane are termed Support Vectors.
62
Terminologies in Support Vector Machine (SVM)

• The distance between the vectors and the hyperplane is called the margin.
• Goal of SVM is to maximize this margin.
• The hyperplane with maximum margin is called the optimal hyperplane.
[Figure: positive hyperplane and negative hyperplane on either side of the line that represents the decision boundary: ax + by − c = 0]
63
Types of Support Vector Machine (SVM)
• Linear SVM: Linear SVM is used for linearly separable data, which means if a dataset
can be classified into two classes by using a single straight line, then such data is
termed as linearly separable data.
• Non-linear SVM: Non-linear SVM is used for non-linearly separable data, which means a dataset that cannot be classified by using a straight line.
64
Types of Support Vector Machine (SVM)
65
An example for SVM
Data: <xi, yi>, i = 1, .., l
xi ∈ R^d
yi ∈ {−1, +1}
[Figure: scatter plot of Temperature against Humidity; one marker type means play tennis and the other means do not play tennis, with f(x) = +1 on one side of the boundary and f(x) = −1 on the other]
All hyperplanes in R^d are parameterized by a vector (w) and a constant b, and can be expressed as w•x + b = 0 (remember the equation for a hyperplane from algebra!).
Our aim is to find such a hyperplane, f(x) = sign(w•x + b), that correctly classifies our data.
Formulation of Margin
Define the hyperplane H such that:
xi•w + b ≥ +1 when yi = +1
xi•w + b ≤ −1 when yi = −1
H1 and H2 are the planes:
H1: xi•w + b = +1
H2: xi•w + b = −1
The points on the planes H1 and H2 are the Support Vectors.
d+ = the shortest distance to the closest positive point
d− = the shortest distance to the closest negative point
The margin of a separating hyperplane is d+ + d−.
[Figure: hyperplane H between the planes H1 and H2, with distances d+ and d−]
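The decision rule f(x) = sign(w•x + b) and the distances d+ and d− can be sketched as below. The weight vector and bias are arbitrary placeholders rather than the result of training, and the function names are illustrative. The sketch also uses the standard geometric fact that a point x lies at distance |w•x + b| / ||w|| from the hyperplane, so points on H1 and H2 are each 1 / ||w|| away and the margin is 2 / ||w||.

```python
import math

def decision_value(w, b, x):
    """Signed score w . x + b; zero exactly on the hyperplane H."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """f(x) = sign(w . x + b), returned as +1 or -1."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def distance_to_hyperplane(w, b, x):
    """Perpendicular distance from x to the hyperplane w . x + b = 0."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return abs(decision_value(w, b, x)) / norm_w

w, b = (1.0, 1.0), -3.0        # placeholder hyperplane: x1 + x2 - 3 = 0
positive_sv = (2.0, 2.0)       # lies on H1, since w . x + b = +1
negative_sv = (1.0, 1.0)       # lies on H2, since w . x + b = -1

d_plus = distance_to_hyperplane(w, b, positive_sv)
d_minus = distance_to_hyperplane(w, b, negative_sv)
margin = d_plus + d_minus
print(classify(w, b, positive_sv), classify(w, b, negative_sv), margin)
```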
67
Decision on margin for SVM
68
Maximizing the margin
69
Maximizing the margin
70
Support Vector Machine (SVM)-Illustration
71
Support Vector Machine (SVM)-Solved Example#1
Suppose, we have positively labeled data points
And we have negatively labeled data points
1 By inspection, it should be obvious that there are three

support vectors
72
2 The hyperplane driving SVM is given as
73
4
74
75
Non Linear Support Vector Machine (SVM)-Solved
Example#2
Suppose, we have positively labeled data points
And we have negatively labeled data points
1 Nonlinear mapping from input space into some feature

space
Sub the labelled points in above feature space
76
Example#2
2 There are two support vectors
3 The hyperplane driving SVM is given as
77
Example#2
4 The above equation reduces to
78
Example#2 Module-2 Topic-5: SVM Classifier
79
Support Vector Machine (SVM)-Pros and Cons
Advantages:
•Effective in high dimensional spaces.
•Still effective in cases where number of dimensions is
greater than the number of samples.
•Uses a subset of training points in the decision function
(called support vectors), so it is also memory efficient.
•Versatile: different Kernel functions can be specified for
the decision function. Common kernels are provided, but it
is also possible to specify custom kernels.
Disadvantages:
•If the number of features is much greater than the number of samples, avoiding over-fitting through the choice of kernel function and regularization term is crucial.
•SVMs do not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation.
80
Module-2 Topic-2: Decision Tree
Decision tree
• A Decision tree is a flowchart-like tree structure, where each internal node denotes a
test on an attribute, each branch represents an outcome of the test, and each leaf node
(terminal node) holds a class label.
81
Decision tree
• A tree can be “learned” by splitting the
source set into subsets based on an
attribute value test.
• This process is repeated on each derived
subset in a recursive manner called
recursive partitioning.
• The recursion is completed when the
subset at a node all has the same value of
the target variable, or when splitting no
longer adds value to the predictions.
• The construction of a decision tree
classifier does not require any domain
knowledge or parameter setting, and
therefore is appropriate for exploratory
knowledge discovery.
82
Decision tree
• Decision trees can handle high-dimensional data. In general, a decision tree classifier has good accuracy.
• Decision tree induction is a typical
inductive approach to learn knowledge on
classification.
• Decision trees classify instances by
sorting them down the tree from the root
to some leaf node, which provides the
classification of the instance.
• An instance is classified by starting at the
root node of the tree, testing the attribute
specified by this node, then moving down
the tree branch corresponding to the value
of the attribute.
83
Decision tree
Strength:
• Decision trees are able to generate understandable rules.
• Decision trees perform classification without requiring much computation.
• Decision trees are able to handle both continuous and categorical variables.
• Decision trees provide a clear indication of which fields are most important for prediction
or classification.
Disadvantage:
• Decision trees are less appropriate for estimation tasks where the goal is to predict the
value of a continuous attribute.
• Decision trees are prone to errors in classification problems with many classes and a
relatively small number of training examples.
• Decision tree can be computationally expensive to train. The process of growing a decision
tree is computationally expensive. At each node, each candidate splitting field must be
sorted before its best split can be found. In some algorithms, combinations of fields are
used and a search must be made for optimal combining weights. Pruning algorithms can
also be expensive since many candidate sub-trees must be formed and compared.
84
Decision tree Solved Example
Consider a dataset based on which we will determine whether to play football or not.
There are 4 independent variables (Outlook, Temperature, Humidity, and Wind) that determine the dependent variable: whether to play football or not.
85
1 Calculation of Entropy (determines how a decision tree chooses to split data) and Information gain (difference between parent entropy and average weighted entropy)
Gain(S, Humidity) = 0.94 − (7/14)(0.9852) − (7/14)(0.5916) = 0.1516
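A minimal sketch of the entropy and information-gain calculation follows. The class counts are an assumption: 9 "yes" and 5 "no" overall, split by Humidity into (3 yes, 4 no) and (6 yes, 1 no). These counts are not shown in this text but are consistent with the entropies 0.94, 0.9852 and 0.5916 and the 7/14 weights used above. Function names are illustrative.

```python
import math

def entropy(counts):
    """Shannon entropy (base 2) of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, children_counts):
    """Parent entropy minus the size-weighted average entropy of the child nodes."""
    total = sum(parent_counts)
    weighted = sum(sum(child) / total * entropy(child) for child in children_counts)
    return entropy(parent_counts) - weighted

parent = [9, 5]            # assumed: 9 play, 5 do not play
humidity_high = [3, 4]     # assumed counts for Humidity = high
humidity_normal = [6, 1]   # assumed counts for Humidity = normal

gain_humidity = information_gain(parent, [humidity_high, humidity_normal])
print(round(entropy(parent), 4), round(gain_humidity, 4))
```

The result differs from 0.1516 only in the fourth decimal place, because the hand calculation rounds the intermediate entropies.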

4 Initial Decision tree diagram
89
decision tree whether Temperature, Humidity or Wind has higher information gain.
5
90
5
91
5
92
5
93
5
94
decision tree whether Temperature or Humidity has higher information gain.
6
95
decision tree whether humidity is normal or high based on higher information gain.
6
96
decision tree whether wind is strong or not based on higher information gain.
6
97
6
98
6
99
6 Final Decision tree
100
Module-2 Topic-4: Random Forest
Random Forest
• Random forests (RF) are a
combination of tree predictors
such that each tree depends on
the values of a random vector
sampled independently and with
the same distribution for all
trees in the forest.
• The generalization error of a
forest of tree classifiers depends
on the strength of the
individual trees in the forest and
the correlation between them.
• Improvements in classification accuracy have resulted from
growing an ensemble of trees and letting them vote for
the most popular class.
• To grow these ensembles, often random vectors are
generated that govern the growth of each tree in the
ensemble.
Random Forest
• A random forest is a collection of decision trees. Each tree produces a
classification, and this is called a “vote”. We collect the vote from every
tree and choose the most voted classification (majority voting).
• Random forests follow the same bagging process as bagged decision trees, but
each time a split is to be performed, the search for the split variable is
limited to a random subset of m of the p attributes (variables or features),
also known as split-attribute randomization:
• classification trees: m = √p
• regression trees: m = p/3
• Random forests produce many unique trees.
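The two ideas on this slide, majority voting and split-attribute randomization, can be sketched in a few lines of Python. This is an illustrative sketch using only the standard library; the function names, the example value p = 16 and the fixed random seed are assumptions made for the example, not part of the slides.

```python
import math
import random
from collections import Counter

def majority_vote(votes):
    """Return the class predicted by the largest number of trees."""
    return Counter(votes).most_common(1)[0][0]

def split_candidates(p, task, rng):
    """Pick the random subset of m feature indices searched at one split."""
    if task == "classification":
        m = max(1, round(math.sqrt(p)))  # m = sqrt(p)
    else:
        m = max(1, p // 3)               # m = p/3
    return sorted(rng.sample(range(p), m))

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
clf_features = split_candidates(16, "classification", rng)
reg_features = split_candidates(16, "regression", rng)
print(len(clf_features), len(reg_features))
print(majority_vote(["yes", "no", "yes"]))
```

With p = 16 features, a classification split searches 4 features and a regression split searches 5. A new subset is drawn at every split, which is what makes the trees differ from one another.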
Why Random Forest?

• The random forest algorithm is suitable for both classification and regression tasks.
• It often gives higher accuracy than a single decision tree, which can be checked through cross-validation.
• Standard decision trees often have high variance and low bias:
there is a high chance of overfitting (with ‘deep trees’, many nodes).
• With a random forest, the bias remains low and the variance is reduced,
thus we decrease the chances of overfitting.
• A random forest classifier can handle missing values and maintain accuracy
for a large proportion of the data.
• Adding more trees does not make the model overfit.
• It is able to work on large data sets with high dimensionality.
Module-2 Topic-4: Random Forest
Bagging: Bootstrap Aggregating (wisdom of the crowd)

1. Sample records with replacement ("bootstrap" the training data)
Sampling is the process of selecting a subset of items from a vast collection of items.
Bootstrap = sampling with replacement. It means a data point that has already been drawn
can be drawn again, in the same sample or in later samples.
2. Fit an overgrown tree to each resampled data set
3. Average predictions
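Step 1 of bagging can be illustrated with a short Python sketch. It only shows bootstrap sampling; fitting the trees (step 2) is left out. The record values, the seed and the function name are hypothetical and chosen only for illustration.

```python
import random

def bootstrap_sample(records, rng):
    """Draw len(records) items with replacement; repeated items are expected."""
    return [rng.choice(records) for _ in records]

rng = random.Random(42)   # fixed seed only so the sketch is repeatable
records = list(range(10)) # stand-in for 10 training records

sample = bootstrap_sample(records, rng)
# Records never drawn for this tree are often called "out-of-bag" records.
out_of_bag = sorted(set(records) - set(sample))
print(sample)
print(out_of_bag)
```

Each tree in the ensemble would be fitted to its own bootstrap sample, so every tree sees a slightly different version of the training data.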
How is a Random Forest created?

• A random forest consists of decision trees.
A decision tree consists of
• decision nodes
the top decision node is called the root node
• terminal nodes or leaf nodes
• A selection of data and features is used for each tree

For every decision tree
• a sample of the training data is used
• a sample of the features (from √n_features up to 30–40% of the features) is used
https://victorzhou.com/blog/intro-to-random-forests/
For an individual Decision Tree

• Find the best split in the data.
This is the root node (decision node)
• Within the first branch, find the best split again for that part of the data.
This is the first sub-node (decision node)
• Continue creating decision nodes until splitting no longer improves the result.
The average value of the target variable is then assigned to the leaf (terminal node)
• Continue until there are only leaf nodes left or until a minimum value is
reached
• Now the decision tree can be used to make a prediction based on the input
features
Determine the best split

• Determine the mean of all datapoints
• Determine the mean square error of all datapoints
• Apply a split. Try at every datapoint
• Calculate the mean of all datapoints on each side of the split
• Calculate the MSE on each side and take a weighted average of MSEs on all sides
• Lowest (weighted averaged) MSE is best split
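The split search described on this slide can be sketched for a single numeric feature as follows. This is a minimal illustration, not a full tree learner: placing each candidate threshold midway between neighbouring values is an assumption (the slide only says to try a split at every datapoint), and the data values are made up.

```python
def mse(values):
    """Mean squared error of the values around their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_split(xs, ys):
    """Return (threshold, weighted MSE) of the split with the lowest weighted MSE."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        # Weighted average of the MSE on each side of the split.
        weighted = (len(left) * mse(left) + len(right) * mse(right)) / n
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or weighted < best[1]:
            best = (threshold, weighted)
    return best

xs = [1, 2, 3, 10, 11, 12]                 # hypothetical feature values
ys = [5.0, 6.0, 7.0, 20.0, 21.0, 22.0]     # hypothetical target values
threshold, score = best_split(xs, ys)
print(threshold, round(score, 4))
```

For this toy data the lowest weighted MSE is obtained by separating the three small x values from the three large ones.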
Feature Importance
• Feature importance is calculated as
• the reduction in sum of squared errors whenever a variable is chosen to split
• weighted by the probability of reaching that node.
• The node probability can be calculated by the number of samples that reach the node,
divided by the total number of samples.
• The higher the value, the more important the feature.
• However, the variable importance measures are not reliable in situations where potential
predictor variables vary in their scale of measurement or their number of categories.
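The calculation described on this slide can be sketched as below. This is an illustrative sketch under stated assumptions: each node's error is given as a mean squared error (so multiplying by the node probability is equivalent to dividing the reduction in sum of squared errors by the total sample count), the final normalisation to a sum of 1 is a common convention rather than something stated on the slide, and the feature names and numbers are hypothetical.

```python
def node_importance(n_node, n_total, mse_node, n_left, mse_left, n_right, mse_right):
    """Error reduction of one split, weighted by the probability of reaching the node."""
    p_node = n_node / n_total
    reduction = mse_node - (n_left / n_node) * mse_left - (n_right / n_node) * mse_right
    return p_node * reduction

def feature_importances(splits, n_total):
    """Sum the node importances per feature and normalise them to add up to 1."""
    totals = {}
    for feature, n_node, m, n_l, m_l, n_r, m_r in splits:
        value = node_importance(n_node, n_total, m, n_l, m_l, n_r, m_r)
        totals[feature] = totals.get(feature, 0.0) + value
    grand_total = sum(totals.values())
    return {feature: value / grand_total for feature, value in totals.items()}

# Hypothetical tree with two splits:
# (feature, samples at node, node MSE, left samples, left MSE, right samples, right MSE)
splits = [
    ("x1", 100, 80.0, 60, 40.0, 40, 30.0),
    ("x2", 60, 40.0, 30, 20.0, 30, 10.0),
]
importances = feature_importances(splits, n_total=100)
print(importances)
```

In this made-up tree the root split on x1 accounts for most of the error reduction, so x1 receives the higher importance.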
Processing the ensemble of trees called the Random Forest
• Take a set of variables
• Run them through every decision tree
• Determine a predicted target variable for each of the trees
• Average the result of all trees
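The averaging step can be sketched as follows. The three "trees" here are hand-written stand-in rules, not trained models, and the feature name `rooms` and all numbers are hypothetical; the sketch only shows how per-tree predictions are combined for a regression forest.

```python
def forest_predict(trees, features):
    """Run one input through every tree and average the per-tree predictions."""
    predictions = [tree(features) for tree in trees]
    return sum(predictions) / len(predictions)

# Hypothetical stand-in trees: each maps a feature dictionary to a number.
trees = [
    lambda x: 30.0 if x["rooms"] > 6 else 20.0,
    lambda x: 28.0 if x["rooms"] > 5 else 18.0,
    lambda x: 32.0 if x["rooms"] > 7 else 22.0,
]

prediction = forest_predict(trees, {"rooms": 6.5})
print(round(prediction, 2))
```

For a classification forest the average would be replaced by a majority vote over the per-tree class predictions.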
How to evaluate the model?

• Split the data into a train set and a test set
• Train the model using the train set.
• Test predictions using the test set.
• Vary the train and test set contents to confirm results
• Compare the test set and the train set

• Determine the Coefficient of Determination (R²) for both sets
The proportion of the variance in the dependent variable that is predictable from the
independent variable(s)
How well are observed outcomes replicated by the model
• If R² differs markedly between the two sets, the test or train set is probably biased,
or the model has overfit the train set
• Determine the accuracy of the predictions by comparing predicted results of the test set
with the actual results of the test set
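The two evaluation measures mentioned in this deck, the coefficient of determination and the mean absolute error, can be computed as in the sketch below. The actual and predicted values are made up for illustration, and the function names are not taken from any particular library.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical test-set values and model predictions.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.5, 9.5]

print(r_squared(actual, predicted))
print(mean_absolute_error(actual, predicted))
```

Computing both measures once on the train set and once on the test set gives the comparison described above: similar values suggest the model generalizes, while a much better train score suggests overfitting.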
Module-2 Topic-4: Random Forest
Random Forest: Tuning
• Bagging introduces randomness into the rows of the data.
• Random forest introduces randomness into the rows and columns of the data.
• Combined, this provides a more diverse set of trees that almost always lowers our prediction error.
Random Forest - Strengths & Weaknesses
Applications
• Banking Industry
• Credit Card Fraud Detection
• Customer Segmentation
• Predicting Loan Defaults on LendingClub.com
• Healthcare and Medicine
• Cardiovascular Disease Prediction
• Diabetes Prediction
• Breast Cancer Prediction
• Stock Market
• Stock Market Prediction
• Stock Market Sentiment Analysis
• Bitcoin Price Detection
• E-Commerce
• Product Recommendation
• Price Optimization
• Search Ranking
Case Study
• Let us consider the example of the Boston Housing dataset. This is a well-
known dataset of information about different houses in Boston. For each
house, 13 values are known, such as the crime rate in that area,
industrialization value, average age of residents, and so on. Our task is to
train a model to predict the value of a house given these values.
Case Study
• Training the model
• We can note that of the 13 original features, this decision tree has used only LSTAT
(the percentage of the population in low income groups) and RM (average number of
rooms per dwelling) to generate a prediction.
• The four leaf nodes show us that this single decision tree can produce only four
possible outputs: $30k, $44k, $22k and $14k, even though we are solving a regression
problem and the true value could be one of many continuous values.
• This simple decision tree has a mean absolute error of $3.6k on the training set, and
$3.8k on the test set. This means that although it is not a powerful model, it performs
similarly on seen and unseen data, and so it has generalized well and has not overfit
the training data.
Case Study
• Making a random forest ensemble model
Case Study
• Performance of the model
Case Study
• Feature Importance in RF
Random forest Vs Decision tree
Module-2 Summary
Summary
• Logistic regression: Modeling the probability that the response Y belongs to a
particular category, using a logistic function, on the basis of single or multiple
variables.
• Bayes’ theorem for classification: Bayes’ classifier using conditional independence
• Decision trees and random forests: A non-parametric, ‘information-based learning’
approach which is easy to interpret.
• Hyperplane for classification: maximal margin classifier and SVC.
• Support Vector Machines (SVMs): Extension of SVC to handle ‘non-linear boundaries’
between classes. Uses kernels for computational efficiency. RBF kernel exhibits ‘local
behavior’.
• Random forests are an effective tool in prediction. Forests give results competitive
with boosting and adaptive bagging, yet do not progressively change the training set.
Random inputs and random features produce good results in classification, less so in
regression. For larger data sets, we can gain accuracy by combining random features
with boosting.
Topics in Module-5
• Comply with organization’s
current health, safety and security
policies and procedures
• Report any identified breaches in
health, safety, and security
policies and procedures to the
designated person
• Identify and correct any hazards
that they can deal with safely,
competently and within the limits
of their authority
• Report any hazards that they are
not competent to deal with to the
relevant person in line with
organizational procedures and
warn other people who may be
affected.
Module-5 Topic-1: Comply with organization’s current health, safety and security policies and procedures
Performance Criteria (PC)

• PC1-Comply with your organization’s current health, safety and security
policies and procedures
• PC2-Report any identified breaches in health, safety, and security policies and
procedures to the designated person
• PC3-Identify and correct any hazards that you can deal with safely, competently
and within the limits of your authority
• PC4-Report any hazards that you are not competent to deal with to the relevant
person and warn other people who may be affected
• PC5-Follow your organization’s emergency procedures promptly, calmly, and
efficiently
• PC6-Identify and recommend opportunities for improving health, safety, and
security to the designated person
• PC7-Complete any health and safety records legibly and accurately.
Basic Workplace Safety Guidelines

• Fire safety: Employees should be aware of all emergency exits (including fire
escape routes) of the office building and also the locations of fire extinguishers and
alarms.
• Falls and slips: All things must be arranged properly. There should be proper
lighting in all areas (including stairways). Any spilt liquid, food or other items must
be promptly cleaned to avoid any accidents.
• First Aid: First-aid kits should be kept in places that can be reached quickly and
these locations should be known to all employees.
• These kits should contain all the important items for first aid, for example, items
to deal with cuts, burns, muscle cramps, etc.
• Electrical Safety: Electrical engineers and staff should carry out routine inspections
of all wiring to make sure there are no damaged or broken wires.
• Employees must be given instructions about electrical safety, such as keeping
water and food items away from electrical equipment.
Types of Accidents in Workplace

• The following are some commonly occurring accidents in organizations:
• Trip and fall
• Slip and fall
• Injuries caused due to escalators or elevators
• Accidents due to falling of goods
• Accidents due to moving objects
• Try to avoid accidents by finding out all potential hazards and eliminating them. One person’s
careless action can harm the safety of many others in the organization.
Types of Emergencies in Workplace

• Categories of emergencies include (but are not limited to) the following:
• Medical emergencies, such as heart attack or an expectant mother in labor
• Substance emergencies, such as fire, chemical spills, and explosions
• Structural emergencies, such as loss of power or collapsing of walls
• Security emergencies, such as armed robberies, intruders, and mob attacks or civil
disorder
• Natural disaster emergencies, such as floods and earthquakes
• Keep a list of numbers to call during emergencies. Regularly check that all emergency
handling equipment is in working condition. Ensure that emergency exits are not
obstructed.
Hazards
• A hazard can be defined as any source of potential harm or danger to someone, or any adverse
health effect produced under certain conditions.
• Hazards to an organization include loss of property or equipment, while hazards to an individual
involve harm to health or body.
• Examples of potential hazards:
• (i) Materials such as knives or sharp-edged nails can cause cuts;
• (ii) Substances such as benzene can cause fume suffocation. Inflammable substances like
petrol can cause fire;
• (iii) Naked wires or electrodes can result in electric shocks;
• (iv) Conditions such as a wet floor can cause slippage;
• (v) Objects falling on workers; and
• (vi) Clothes entangled in rotating objects.
Health, safety and security policies

• Safety and security are not only practical but also legally required at state and
federal levels, depending on business type and regional location.
• A health and safety policy is a written statement by an employer stating the company’s
commitment to the protection of the health and safety of employees and the public.
Key Components of a Health and Safety Plan

• A reporting system: A simple, clear, well-communicated procedure to report
accidents (including near misses), injuries and illness, as well as potential hazards
in the workplace.
• Training programs: Some aspects may be legal requirements, such as dangerous
goods handling, while other components may deal with the facility, and specific
aspects of the health and safety plan.
• Inspections: Employee and management teams regularly inspect the workplace to
identify changing conditions or activities that may compromise safety.
• Emergency planning: Action plans are developed for foreseeable emergencies such as
fires and flooding, and are communicated to all staff through meetings
and workplace postings.
• Continuous improvement: Management seeks staff input before implementing
changes to the workplace, and regular meetings address not only current health and
safety issues, but also improvements to the health and safety plan.
Reasons for Health and Safety Programs or Policies in the Workplace

• There are several reasons why workplaces need a health and safety policy or
program, including:
• to clearly demonstrate management’s full commitment to their employees’
health and safety;
• to show employees that safety performance and business performance are
compatible;
• to clearly state the company’s safety beliefs, principles, objectives, strategies
and processes to build buy-in through all levels of the company;
• to clearly outline employer and employee accountability and responsibility
for workplace health and safety;
• to comply with the Occupational Health and Safety Act; and
• to set out safe work practices and procedures to be followed to prevent
workplace injuries and illnesses.
Workplace Security Procedures
• Hazard Identification and Risk Assessment Procedures- Hazard identification and risk
assessment procedures are vital for workplace health and safety, preventing potential
risks from being overlooked. They familiarize personnel with hazardous environment
duties and risk assessment steps.
• Hazardous material storage and handling-

• Proper storage and handling of hazardous materials, such as liquid hydrocarbons,
lead, chlorine, and asbestos, is crucial to prevent workplace accidents.
• It prevents serious injuries and ensures that these materials are released into the
environment only in a safe manner.
• Safely working with electricity and electrical equipment-
• To ensure safety when working with electrical equipment in the workplace,
organizations use procedures such as requiring qualified individuals with verified
licenses, using Personal Protective Equipment, testing equipment before work,
restoring electricity supply, terminating exposed conductors, identifying and
labeling equipment, inspecting and testing portable appliances, removing failed
equipment without a valid tag, and keeping records.
• Fire safety procedures- Fire safety procedures in the workplace are crucial for
protecting employees and visitors.
• Building codes set by local governments dictate minimum fire protection levels.
Organizations should establish fire safety procedures based on building size, use,
location, occupancy, and construction.
• These procedures include a detailed fire risk assessment, alerting methods,
evacuation routes, evacuation plans, firefighting conditions, training, and
firefighting equipment provision.
• Housekeeping Procedures- Housekeeping involves maintaining a clean environment in
residential or public settings, including orderly work spaces, clear hallways, proper
waste disposal, and secure storage.
• Workplace First Aid and Medical Care- Workplace injuries are a constant source of
worry and expense, and employers are legally and morally responsible for providing
proper first aid and medical care.
• Strict federal and state laws govern how employers must treat victims, and failure
to do so could result in legal liability.
• A trained first aid officer is responsible for managing various medical care, clearly
marking the first aid station, and providing emergency contact information.
• Workers Orientation and Training-Employers face significant pressure from workers
getting injured on the job.
• Training can help reduce work-related injuries by providing clarity on equipment
usage, preventing accidents, and reducing frustration.
• Standard procedures for training include training needs assessments, induction,
equipment introduction training, and regular workshops to build capacity and stay
competitive.
• Emergency Evacuation Procedures- These reduce the risk of workplace injury by
minimizing risks to people and property during evacuations and fire
outbreaks.
• The plan includes a detailed evacuation plan, identification of emergency exits and
stations, a command chain, communication system, notification system for
emergency services, training, drills, and regular reviews.
• Access and Egress -Workplace Access and Egress procedures ensure safety for
workers and first responders during emergencies, identifying parking areas, routes, and
access permits for employees, students, patients, and customers.
• Incident Investigation and Reporting- Accidents can be prevented through proper
investigation and reporting procedures.
• These processes involve reporting incidents to authorities, securing the scene,
gathering information from witnesses, analyzing facts, preparing a written report,
developing corrective actions, and reviewing the process.
• They help resolve workplace safety and liability issues, ensuring the correct
implementation of safety measures.
Comply with organization’s Health, safety and security policies
Module-5 Topic-2: Report any identified breaches in health, safety, and security policies and procedures
to the designated person
Report any identified breaches in health, safety, and security policies
and procedures to the designated person
• Ensuring laboratory safety and security requires effective enforcement by
organizational leaders and compliance by managers and workers through incentives.
• Organizations must identify and address cultural barriers to chemical laboratory
safety and security to ensure the safety and security of their employees.
Report any identified breaches in health, safety, and security policies and
procedures to the designated person
Initiation and maintenance of an effective compliance system are important to
• give organization leaders useful information about the effectiveness of safety and security
systems and about needs for improvements.
• give designated safety and security personnel authority to collect incident reports and
report incidents to higher authorities for action.
• discern patterns of unsafe behavior and facilities (based on statistics from reports and
inspections), find methods to improve safety and security, and initiate new rules and
regulations to protect workers.
• increase awareness of safety issues in the organization so that a culture of improved safety
and security is encouraged.
• give current information to safety officers so that training of all laboratory workers can be
improved and specific guidance can be given to individual workers.
• give information to laboratory leaders so that they can learn how to use, test, and procure
appropriate personal protective equipment (PPE) and other types of equipment to improve
safety.
APPROACHES TO FOSTERING COMPLIANCE

• Setting Organizational Safety Rules, Policies, and Implementation Strategy- Compliance
requires clear rules, policies, and processes agreed upon by organizational leaders, safety
officers, and laboratory managers.
• Dealing with Limited Financial Resources.
• Addressing Climate Control.
• Providing Training and Education.
• Enforcing Consequences of Risky Behavior-Publicizing rules for safe behavior and
penalties for violations is crucial to encourage compliance. Leaders should also reward
those who consistently take safe actions and behave responsibly.
• Taking Special Safety Precautions for women.
• Accommodating Social, Ethnic, and Religious Differences.
• Relieving Time Pressures and Avoiding Shortcuts.
• Looking Out for Coworkers-Survival and well-being in emergencies require specific
guidance and training for workers and students. Proper education on wearing PPE and
proper use is crucial for compliance with laboratory safety rules and preventing accidents.
Module-5 Topic-3: Identify and correct any hazards that they can deal with safely, competently and within the limits of their authority
What is an “incident” and what is a “hazard”?
5 signs the report system is failing
Module-5 Topic-4: WORKPLACE HAZARD REPORTING
WORKPLACE HAZARD REPORTING
• Employee training in hazard recognition and avoidance is crucial for preventing
accidents. Hazard reporting is essential for employees to know what to do when
encountering uncorrected hazards.
• Training can be in-person, on-the-job, or a safety meeting, with annual online or
email reminders for low-hazard jobs.
WORKPLACE HAZARD REPORTING
• What is an unsafe condition that should be reported? -Workplace hazards include rusted
tools, inadequate PPE, unlabeled containers, insufficient lighting, broken machine guards, and a
leaking refrigerator, which can lead to potential incidents causing harm to people, equipment, or
property.
• What is an unsafe act that should be reported?-Unsafe acts, such as careless use of
equipment or inadequate use of personal protective equipment, can lead to incidents causing
harm to people, equipment, or property.
• What should be done if an unsafe condition or act is witnessed in the workplace? -The
hazard reporting procedure in your workplace should be specific and clearly communicate the
steps employees should take, such as filling out a form or communicating verbally with a
supervisor.
• When should a hazard be reported? Any unsafe condition or act should be reported
immediately, or at the next available safe opportunity that the employee has to do so.
• Where can employees find a copy of the Hazard Reporting Procedure? Are hard copies of
procedures kept at headquarters, or is the Safety Manual found online on the company’s
intranet? It’s important that employees know how they can access all company policies and
procedures on their own.
WORKPLACE HAZARD REPORTING-Examples
Example-1
Example-2
Example-3
Example-4
Example-5
Module-5
Summary
• Performance criteria (7 in total)
• Basic workplace safety guidelines: fire safety, first-aid kit,
electrical safety, etc.
• Types of accidents: trips, slips, injuries/accidents due to
falling/moving items, etc.
• Types of emergencies: medical, structural, natural disaster, etc.
• Hazards: sources of potential harm (notified using signage boards)