
DATA SCIENCE-06

(19/01/2024)
By- Ms. Pallavi Mishra
(Faculty Associate)
Topics to be covered…
• Measures of dispersion: Range, Variance, Standard Deviation
Overview
Statistics is the science of collecting, organizing, summarizing, and analysing
information to draw conclusions or answer questions.
Types of Statistics
• Descriptive Statistics: describing and summarizing a population or sample.
• Inferential Statistics: making inferences about a population based on a
sample of data.
Summary Analytics (Descriptive Statistics)
• Central Tendency: Mean, Median, Mode
• Measures of Dispersion: Range, Variance, Standard Deviation
• Five number summary: Minimum, 𝑄1, 𝑄2 (Median), 𝑄3, Maximum
• Distribution: Normal, Uniform, Skewness, Kurtosis
• Charts: Histogram, Box plot, Correlation Plot, Scatter Plot, Line chart,
Bar Chart, Pie Chart, Bubble Chart
• Cross Tabulations
• Decision Tree
Types of descriptive statistics
There are 3 main types of descriptive statistics:
• The distribution concerns the frequency of each value.
• The central tendency concerns the averages of the values.
• The variability or dispersion concerns how spread out the values are.

Distribution shapes shown on the slide: Normal, Uniform, Skewness, Kurtosis.
Descriptive Statistics

• The variation is the amount of dispersion, or scattering, of values away
from a central value.
Range

• The range is the difference between the highest and lowest values within
the set of numbers.
• Because the range uses only the two extreme values, a single outlier can
change it considerably.
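As a minimal sketch of the definition above, the range can be computed in Python with the built-in max and min. The data values and variable names here are made up for illustration; the second list appends one extreme value to show how a single outlier affects the result.

```python
# Illustrative values only (not taken from the slides).
scores = [12, 15, 11, 14, 13]

# Range = highest value minus lowest value.
data_range = max(scores) - min(scores)

# The same data with one extreme value appended.
scores_with_outlier = scores + [40]
range_with_outlier = max(scores_with_outlier) - min(scores_with_outlier)

print(data_range)          # 4
print(range_with_outlier)  # 29
```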
Variance
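• A simple measure of variation around the mean might take the
difference between each value and the mean and then sum these
differences. However, the positive and negative differences always
cancel, so this sum is zero for every dataset.
• The variance therefore uses the squared differences from the mean.
• The population variance is denoted by σ² and the sample variance is
denoted by s².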
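The following sketch, using made-up values, shows why raw differences from the mean are not enough on their own (they sum to zero) and how squaring them leads to the variance. Dividing by n for the population variance and by n − 1 for the sample variance follows the usual textbook convention; the variable names are illustrative.

```python
# Illustrative values only (not taken from the slides).
values = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(values)
mean = sum(values) / n

# Differences from the mean cancel out when summed.
deviations = [x - mean for x in values]
sum_of_deviations = sum(deviations)

# Squaring removes the sign, so the differences no longer cancel.
squared_deviations = [d ** 2 for d in deviations]

population_variance = sum(squared_deviations) / n       # sigma squared
sample_variance = sum(squared_deviations) / (n - 1)     # s squared

print(sum_of_deviations)    # 0.0
print(population_variance)  # 4.0
print(sample_variance)
```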
Standard Deviation
• The standard deviation is a measure of the spread of a set of data
around its mean. It indicates how much the individual values in a set
of data deviate from the mean.

• The sample standard deviation is the square root of the sum of the
squared differences around the mean divided by the sample size minus
one.
$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}}$$
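The sample standard deviation formula can be sketched step by step in Python as follows. The data are made up for illustration, and the helper name sample_std is hypothetical. The result is compared with statistics.stdev from the Python standard library, which also divides by n − 1.

```python
import math
import statistics


def sample_std(data):
    """Square root of the sum of squared differences from the mean,
    divided by the sample size minus one."""
    n = len(data)
    mean = sum(data) / n
    sum_sq = sum((x - mean) ** 2 for x in data)
    return math.sqrt(sum_sq / (n - 1))


# Illustrative values only (not taken from the slides).
sample = [4, 8, 6, 5, 3, 7]

s = sample_std(sample)
print(round(s, 4))                                 # 1.8708
print(math.isclose(s, statistics.stdev(sample)))   # True
```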
Variance: Dispersion (Spreading of Data)
Standard Deviation
Significance of Standard Deviation and Variance
Conclusion: Variance and Standard Deviation
• If the variance and standard deviation values are close to zero, it
means that the data points in the dataset are clustered close together
and have a small spread.

• This can indicate a lack of variability in the data, which could be due
to a limited range of values or to the data being homogeneous in
nature.
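To illustrate this conclusion, the sketch below compares three made-up datasets that share the same mean but differ in spread. The values and names are assumptions chosen for illustration; statistics.stdev and statistics.mean come from the Python standard library.

```python
import statistics

# Illustrative datasets, all with mean 50 (not taken from the slides).
constant = [50, 50, 50, 50, 50]   # identical values
tight = [49, 50, 50, 51, 50]      # clustered close to the mean
wide = [10, 30, 50, 70, 90]       # spread far from the mean

sd_constant = statistics.stdev(constant)
sd_tight = statistics.stdev(tight)
sd_wide = statistics.stdev(wide)

for name, sd in [("constant", sd_constant), ("tight", sd_tight), ("wide", sd_wide)]:
    print(name, round(sd, 2))
# constant 0.0
# tight 0.71
# wide 31.62
```

The closer the standard deviation is to zero, the more tightly the values sit around the mean; identical values give exactly zero.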
Example for practice
• Consider the following data: 31, 29, 52, 44, 40, 35, 39, 43, 39, 44.
Find the variance and standard deviation.
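One way to check a hand calculation for this exercise is with Python's standard statistics module. The exercise does not say whether the ten values are a sample or a whole population, so this sketch prints both versions: variance and stdev divide by n − 1 (sample), while pvariance and pstdev divide by n (population).

```python
import statistics

data = [31, 29, 52, 44, 40, 35, 39, 43, 39, 44]

mean = statistics.mean(data)

# Treating the data as a sample (divide by n - 1).
sample_var = statistics.variance(data)
sample_sd = statistics.stdev(data)

# Treating the data as a whole population (divide by n).
pop_var = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

print(mean)                  # 39.6
print(round(sample_var, 2))  # 45.82
print(round(sample_sd, 2))   # 6.77
print(round(pop_var, 2))     # 41.24
print(round(pop_sd, 2))      # 6.42
```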
Problems on Variance and Standard Deviation
Questions:
1. What is the smallest possible value for variance and standard
deviation?

2. What are units of variance and standard deviation?


3.
4. Imagine you have two sets of exam scores: one with a mean of 70 and
a standard deviation of 5, and the other with a mean of 70 and a
standard deviation of 15. How might you interpret the differences in
variability between these two sets of scores?
5. A low standard deviation indicates more uniformity and predictability,
while a high standard deviation suggests that the data points are more
spread out from the mean. (True/False)
6. If the standard deviation of a dataset is zero, what does it indicate
about the values in the dataset?
