INF30036 Lecture5
DESCRIPTIVE ANALYTICS
THE ROLE OF THE MEAN, MEDIAN, AND MODE
The third measure of central tendency is the mode. The mode is the
value that occurs most often.
The median is used when the data is skewed, when there are a small
number of observations, or when working with ordinal data.
The mode is rarely used. The only situation in which the mode would
be preferred is when describing categorical or class variables.
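As a small illustration of the three measures, the sketch below uses Python's standard `statistics` module. The `incomes` and `colors` samples are made-up values chosen only to show a skewed numeric variable and a categorical variable; they are not from the lecture.

```python
import statistics

# Hypothetical right-skewed sample: one large value pulls the mean upward.
incomes = [30, 32, 35, 35, 38, 40, 250]

mean_income = statistics.mean(incomes)      # sensitive to the extreme value
median_income = statistics.median(incomes)  # middle value, robust to skew
mode_income = statistics.mode(incomes)      # most frequent value

# Hypothetical categorical variable: only the mode is meaningful here.
colors = ["red", "blue", "red", "green", "red", "blue"]
mode_color = statistics.mode(colors)

print(f"mean={mean_income:.2f} median={median_income} mode={mode_income}")
print(f"most common color: {mode_color}")
```

In this sample the mean sits well above the median, which is the usual reason to report the median for skewed data.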
VARIANCE AND DISTRIBUTION
The range depends only on the minimum and maximum values, so it does
not provide any information about how the data is distributed between
them. The range is also sensitive to outliers.
The more the data is spread out, the greater the range, variance,
and standard deviation.
The more the data is concentrated, the smaller the range, variance,
and standard deviation.
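The relationship between spread and these three measures can be checked with a short sketch. The helper `spread` and the three samples are hypothetical, and the sample (n - 1) variance from the `statistics` module is assumed.

```python
import statistics

def spread(values):
    """Return the range, sample variance, and sample standard deviation."""
    return {
        "range": max(values) - min(values),
        "variance": statistics.variance(values),
        "std_dev": statistics.stdev(values),
    }

# Hypothetical samples sharing the same center (50).
concentrated = [48, 49, 50, 51, 52]
spread_out = [10, 30, 50, 70, 90]
# Same as `concentrated` except for a single outlier.
with_outlier = [48, 49, 50, 51, 150]

for name, data in [("concentrated", concentrated),
                   ("spread_out", spread_out),
                   ("with_outlier", with_outlier)]:
    print(name, spread(data))
```

The third sample differs from the first by one value only, yet its range grows sharply, which illustrates the sensitivity to outliers.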
THE SHAPE OF THE DISTRIBUTION – Skewness
Skewness measures the extent to which the distribution of a variable
is not symmetrical. It measures the relative size of the two tails. A
distribution can be left skewed, symmetric, or right skewed. In a
left-skewed distribution, the mean is less than the median, and the
skewness value is negative.
For a normal, symmetrical distribution, the mean and the median are
equal and the skewness value is zero. For a right-skewed
distribution, the mean is greater than the median, the peak is on the
left, and the skewness value is positive.
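The sign convention above can be illustrated with a minimal sketch. It assumes the moment-based (Fisher-Pearson) coefficient, the third central moment divided by the second central moment raised to the power 1.5; statistical packages may apply a bias correction and report slightly different values. The function name and the samples are hypothetical.

```python
import statistics

def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical samples.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]   # long right tail
left_skewed = [-x for x in right_skewed]    # mirror image, long left tail
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right", right_skewed),
                   ("left", left_skewed),
                   ("symmetric", symmetric)]:
    print(name,
          "mean:", statistics.fmean(data),
          "median:", statistics.median(data),
          "skewness:", round(skewness(data), 3))
```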
THE SHAPE OF THE DISTRIBUTION – Kurtosis
Kurtosis measures how peaked the curve of the distribution is, in
other words, how sharply the curve rises approaching the center of
the distribution. It also measures the amount of probability in the
tails.
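A rough way to see the effect of tail weight is to compute excess kurtosis directly. The sketch below assumes the moment-based definition (fourth central moment over the squared second central moment, minus 3 so that a normal distribution scores zero); the function name and both samples are hypothetical.

```python
import statistics

def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (no bias correction)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3

# Hypothetical samples with the same mean (5).
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]            # evenly spread, light tails
heavy_tailed = [5, 5, 5, 5, 5, 5, 5, -20, 30]  # sharp peak, extreme tail values

print("flat:", round(excess_kurtosis(flat), 3))
print("heavy_tailed:", round(excess_kurtosis(heavy_tailed), 3))
```

The evenly spread sample gives a negative value and the sample with extreme values gives a positive one.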
COVARIANCE AND CORRELATION
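The covariance is computed from the products of the two variables'
deviations from their means, so it is expressed in the product of
their units and is not a standardized unit of measurement. Its
magnitude alone therefore cannot tell you the degree to which the
variables move together; the correlation coefficient, which
standardizes the covariance, is used for that.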
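The dependence of covariance on units can be seen in a short sketch. It assumes the sample covariance (dividing by n - 1) and the Pearson correlation; the height and weight values are invented for illustration.

```python
import math

def covariance(xs, ys):
    """Sample covariance: average product of deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data: the same heights expressed in metres and centimetres.
height_m = [1.50, 1.60, 1.70, 1.80, 1.90]
height_cm = [h * 100 for h in height_m]
weight_kg = [50, 58, 66, 71, 80]

print("cov (m, kg): ", covariance(height_m, weight_kg))
print("cov (cm, kg):", covariance(height_cm, weight_kg))
print("corr (m, kg): ", correlation(height_m, weight_kg))
print("corr (cm, kg):", correlation(height_cm, weight_kg))
```

Changing the unit of height multiplies the covariance by 100 but leaves the correlation unchanged, which is why correlation is the measure to compare across variable pairs.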
VARIABLE CLUSTERING
PRINCIPAL COMPONENT ANALYSIS
HYPOTHESIS TESTING
In hypothesis testing, two types of errors can occur: a type I error
and a type II error.
A type I error occurs when you reject a true null hypothesis. This is
like finding an innocent person guilty. It is a false alarm. The
probability of a type I error is referred to as alpha, and it is
called the level of significance of the test.
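The meaning of alpha can be made concrete with a small simulation. This is a sketch under stated assumptions: data are drawn from a standard normal population, so the null hypothesis (mean = 0) is true by construction; a two-sided z-test with known standard deviation 1 is used; and the critical value 1.96 corresponds to alpha = 0.05. The function name, trial count, and seed are arbitrary choices.

```python
import math
import random

def type_one_error_rate(n_trials=2000, sample_size=30, z_crit=1.96, seed=0):
    """Fraction of trials in which a true null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        # The null hypothesis (population mean = 0) is true for every sample.
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        sample_mean = sum(sample) / sample_size
        # z statistic with known population standard deviation of 1.
        z = sample_mean / (1.0 / math.sqrt(sample_size))
        if abs(z) > z_crit:
            rejections += 1  # false alarm: a type I error
    return rejections / n_trials

rate = type_one_error_rate()
print(f"Observed type I error rate: {rate:.3f} (alpha was set to 0.05)")
```

Across many repetitions the share of false alarms should land close to the chosen significance level.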
ANALYSIS OF VARIANCE (ANOVA)
CHI SQUARE
Table:
https://www.statology.org/how-to-read-chi-square-distribution-table/
FIT STATISTICS
The ROC curve is a plot of the true positive rate against the false
positive rate at various classification thresholds. The true positive
rate, or sensitivity, is the proportion of actual positives that were
correctly identified.
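The points that make up an ROC curve can be computed by hand, as in the sketch below. The labels, scores, and thresholds are invented, and a case is assumed to be predicted positive when its score is at or above the threshold.

```python
def roc_points(labels, scores, thresholds):
    """Return (false positive rate, true positive rate) for each threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        # Predict positive when the score is at or above the threshold.
        predicted = [s >= t for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical actual classes (1 = positive) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores, thresholds=[1.0, 0.75, 0.5, 0.0])
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold moves along the curve from (0, 0), where nothing is flagged, to (1, 1), where everything is flagged.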
STOCHASTIC MODELS
A stochastic model uses the same set of parameter values and initial
conditions on every run but adds some random variation to the model,
so repeated runs produce a set of different outputs.
The set of outputs is usually generated by running many simulations
with random variations in the inputs.
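The contrast with a deterministic model can be sketched as follows. The compound-growth model, its parameter values, and the normally distributed noise term are all assumptions made for illustration, not a model taken from the lecture.

```python
import random
import statistics

def deterministic_growth(initial, rate, periods):
    """Same inputs always give the same output."""
    value = initial
    for _ in range(periods):
        value *= 1 + rate
    return value

def stochastic_growth(initial, rate, periods, volatility, rng):
    """Same parameters, plus random variation in each period's growth rate."""
    value = initial
    for _ in range(periods):
        value *= 1 + rate + rng.gauss(0.0, volatility)
    return value

rng = random.Random(42)  # fixed seed so the experiment is repeatable
baseline = deterministic_growth(100.0, 0.05, 10)
runs = [stochastic_growth(100.0, 0.05, 10, 0.02, rng) for _ in range(1000)]

print(f"deterministic output: {baseline:.2f}")
print(f"stochastic mean:      {statistics.fmean(runs):.2f}")
print(f"stochastic std dev:   {statistics.stdev(runs):.2f}")
print(f"stochastic min/max:   {min(runs):.2f} / {max(runs):.2f}")
```

The deterministic function returns a single number, while the 1000 simulated runs form a distribution of outcomes that can be summarized with the descriptive statistics covered earlier.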
Thank You for your attention.
Q&A