
Basic Statistics

What is STATISTICS?
A collection of methods for:
 Planning Experiments
 Obtaining…
Organizing…
Summarizing…
Presenting…
Analyzing…
Interpreting…
and Drawing Conclusions from…

DATA
Fundamentals of Statistics
• Two types of statistical parameters and
procedures
- Descriptive statistics
*used to describe, organize, summarize, or
visually display data
*examples: mean, range, standard deviation, graphs
- Inferential statistics
*used to make predictions and decisions
*based on probability
*examples: tests for outliers, confidence
intervals, analysis of variance (ANOVA)
So what are we looking for?
• Where, in a group of measurements,
is the point that best represents the set of
measurements?
• Do the measurements cluster about their
central point, or do they spread out around
it?
Central Tendency
Measure of Central Tendency:
 A single summary score that best describes the
central location of an entire distribution of scores.
 The typical score.
 The center of the distribution.
 One distribution can have multiple locations where
scores cluster.
 Must decide which measure is best for a given situation.
Measures of Central Tendency:
 Mean
 The sum of all scores divided by the number of
scores.
 Median
 The value that divides the distribution in half
when observations are ordered.
 Mode
 The most frequent score.
Mean
Is the balance point of a distribution.
The sum of negative deviations from the
mean exactly equals the sum of positive
deviations from the mean.
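Population mean:
$\mu = \frac{\sum X}{N}$
where μ (“mu”) is the population mean, ΣX (“sigma”, the sum of X) means add up
all scores, and N is the total number of scores in the population.

Sample mean:
$\bar{X} = \frac{\sum X}{n}$
where X̄ (“X bar”) is the sample mean, ΣX (“sigma”, the sum of X) means add up
all scores, and n is the total number of scores in the sample.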
Central Tendency- Mean
Example:
Restaurant rates per plate in a city:
52, 76, 100, 136, 186, 196, 205, 150, 257, 264, 264, 280, 282, 283,
303, 313, 317, 317, 325, 373, 384, 384, 400, 402, 417, 422, 472, 480,
643, 693, 732, 749, 750, 791, 891

Mean restaurant rate:

$\bar{X} = \frac{\sum X}{n} = \frac{13289}{35} \approx 379.69$

• Mean restaurant rate: Rs. 379.69
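As a cross-check, the three measures of central tendency can be computed directly from the 35 rates exactly as they are listed in the example. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the original slides.

```python
from statistics import median, multimode

# Restaurant rates per plate (Rs.), copied as listed in the example.
rates = [
    52, 76, 100, 136, 186, 196, 205, 150, 257, 264, 264, 280, 282, 283,
    303, 313, 317, 317, 325, 373, 384, 384, 400, 402, 417, 422, 472, 480,
    643, 693, 732, 749, 750, 791, 891,
]

n = len(rates)            # number of scores in the sample
total = sum(rates)        # "sigma", the sum of X
x_bar = total / n         # sample mean, "X bar"

rate_median = median(rates)     # middle value of the ordered observations
rate_modes = multimode(rates)   # every value tied for most frequent

# Balance-point property: deviations from the mean sum to zero
# (up to floating-point rounding).
deviation_sum = sum(x - x_bar for x in rates)

print(f"n = {n}, sum = {total}, mean = {x_bar:.2f}")
print(f"median = {rate_median}, modes = {rate_modes}")
print(f"sum of deviations from the mean = {deviation_sum:.6f}")
```

Using `multimode` rather than `mode` is a deliberate choice here: several rates in the list are tied for most frequent, which is a small instance of one distribution having more than one location where scores cluster.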
Which average?
Each measure contains a different kind of
information.

• For example, all three measures are useful for
summarizing the distribution.

• Reporting only one measure of central tendency
might be misleading and perhaps reflect a bias.
Measures of Dispersion
A single summary figure that describes
the spread of observations within a
distribution.
Standard Deviation
 Measure of the average amount by which
observations deviate from the mean.
Range
 Difference between the smallest and
largest observations.
Interquartile Range
 Difference between the third quartile (Q3) and
the first quartile (Q1).
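The range and interquartile range can be sketched on the restaurant rates from the earlier example. Quartile conventions differ between textbooks and software; this sketch assumes the default ("exclusive") method of Python's `statistics.quantiles`, so other tools may report slightly different quartiles for the same data.

```python
from statistics import quantiles

# Restaurant rates per plate (Rs.), as listed in the earlier example.
rates = [
    52, 76, 100, 136, 186, 196, 205, 150, 257, 264, 264, 280, 282, 283,
    303, 313, 317, 317, 325, 373, 384, 384, 400, 402, 417, 422, 472, 480,
    643, 693, 732, 749, 750, 791, 891,
]

# Range: difference between the largest and smallest observations.
data_range = max(rates) - min(rates)

# Quartiles: the three cut points that split the ordered data into four parts.
# Assumption: default "exclusive" method of statistics.quantiles.
q1, q2, q3 = quantiles(rates, n=4)

# Interquartile range: spread of the middle half of the data.
iqr = q3 - q1

print(f"range = {data_range}")
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```

Because the interquartile range ignores the lowest and highest quarters of the data, it is far less affected by a few very large or very small rates than the range is.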
• Standard Deviation
Square root of the quotient obtained by
dividing the sum of the squares of deviations
of the observations from their mean by one
less than the number of observations.

Mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

Standard deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$
Mean and Standard Deviation
Using the mean and standard deviation
together:

• Is an efficient way to describe a distribution with
just two numbers.

• Allows a direct comparison between distributions
that are on different scales.
To Calculate Standard Deviation
1. Get the average of the readings
2. Find the deviations from the average
3. Square those deviations
4. Sum the squares
5. Divide by (n-1)
6. Take the square root

Reading (X)      X - average   (X - average)^2
                 (Step 2)      (Step 3)
1,000,000,043       -7             49
1,000,000,055        5             25
1,000,000,055        5             25
1,000,000,051        1              1
1,000,000,058        8             64
1,000,000,043       -7             49
1,000,000,045       -5             25
1,000,000,045       -5             25
1,000,000,057        7             49
1,000,000,048       -2              4

Step 1: =AVERAGE() gives 1,000,000,050 (nominal: 1.00E+09)
        =COUNT() gives n = 10
Step 4: =SUM() of the squares gives 316
Step 5: n-1 = 9, so Sum / (n-1) = 35.11
Step 6: Stdev = sqrt[Sum / (n-1)] = 5.93
Excel STDEV function: 5.93
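The six steps can be followed one by one in code. The sketch below uses the ten readings from the worked table and compares the result with `statistics.stdev`, which plays the role the Excel function plays on the slide; the variable names are illustrative.

```python
import math
from statistics import stdev

# The ten readings from the worked table.
readings = [
    1_000_000_043, 1_000_000_055, 1_000_000_055, 1_000_000_051,
    1_000_000_058, 1_000_000_043, 1_000_000_045, 1_000_000_045,
    1_000_000_057, 1_000_000_048,
]

n = len(readings)

# Step 1: get the average.
average = sum(readings) / n

# Step 2: deviations from the average.
deviations = [x - average for x in readings]

# Step 3: square those deviations.
squares = [d ** 2 for d in deviations]

# Step 4: sum the squares.
sum_of_squares = sum(squares)

# Step 5: divide by (n - 1).
variance = sum_of_squares / (n - 1)

# Step 6: take the square root.
s = math.sqrt(variance)

print(f"average = {average:.0f}, n = {n}")
print(f"sum of squares = {sum_of_squares:.0f}")
print(f"sum / (n - 1) = {variance:.2f}")
print(f"standard deviation = {s:.2f}")
print(f"library check (statistics.stdev) = {stdev(readings):.2f}")
```

Working with the deviations from the average, rather than with the raw squares of the readings, keeps the arithmetic on small numbers even though each reading is about one billion.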
Central Limit Theorem
Given certain conditions, the arithmetic mean
of a sufficiently large number of independent
observations or measurements, each with a well-defined
expected value (µ) and well-defined standard
deviation (σ), will be approximately normally
distributed. The normal distribution is commonly
known as a "bell curve".
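A small simulation can make the theorem concrete. The sketch below is illustrative only: it assumes a uniform(0, 1) population, which is clearly not bell-shaped, and the sample size, number of repetitions, and seed are arbitrary choices rather than values from the slides.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable

SAMPLE_SIZE = 30     # observations averaged into each mean (assumed value)
NUM_SAMPLES = 5000   # number of sample means simulated (assumed value)

def sample_mean(n):
    """Arithmetic mean of n independent draws from uniform(0, 1)."""
    return sum(random.random() for _ in range(n)) / n

means = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

# Centre and spread of the simulated sample means.
mean_of_means = sum(means) / NUM_SAMPLES
sd_of_means = math.sqrt(
    sum((m - mean_of_means) ** 2 for m in means) / (NUM_SAMPLES - 1)
)

# Population values for uniform(0, 1): mu = 1/2 and sigma = sqrt(1/12).
mu = 0.5
sigma = math.sqrt(1 / 12)
expected_sd = sigma / math.sqrt(SAMPLE_SIZE)  # spread predicted for the mean

# Share of sample means within one standard deviation of their centre;
# for a normal distribution this share is about 68%.
within_one_sd = sum(
    abs(m - mean_of_means) <= sd_of_means for m in means
) / NUM_SAMPLES

print(f"mean of sample means = {mean_of_means:.4f} (population mu = {mu})")
print(f"sd of sample means   = {sd_of_means:.4f} (sigma/sqrt(n) = {expected_sd:.4f})")
print(f"share within one sd  = {within_one_sd:.3f}")
```

The sample means should centre on µ with a spread close to σ/√n, and roughly 68% of them should fall within one standard deviation of their centre, even though the individual observations are uniformly distributed.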
Normal distribution or bell curve
The Normal Distribution has:

i) mean = median = mode

ii) symmetry about the center

iii) 50% of values less than the mean and 50% greater than the
mean
Normal Distribution
In the “normal” distribution:
• the range mean ± one standard deviation will
encompass 68.27% of all the readings taken
• the range mean ± two standard deviations will
encompass 95.45% of all the readings taken
• the range mean ± three standard deviations will
encompass 99.73% of all the readings taken.

The probability of a reading lying more than three
standard deviations from the mean when a process is in
control is small, i.e., 0.27%.
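These coverage figures follow from the normal distribution itself: the probability that a reading lies within k standard deviations of the mean is erf(k/√2). A minimal sketch using only the standard library (the function name `coverage` is illustrative):

```python
import math

def coverage(k):
    """Probability that a normal reading lies within mean ± k standard deviations."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"mean ± {k} standard deviation(s): {100 * coverage(k):.2f}%")

# Probability of a reading falling outside mean ± 3 standard deviations.
tail_beyond_3 = 1 - coverage(3)
print(f"beyond three standard deviations: {100 * tail_beyond_3:.2f}%")
```

Rounded to two decimals this gives 68.27%, 95.45%, and 99.73%, with 0.27% outside three standard deviations; values taken from printed tables can differ in the last digit because of rounding.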
Rectangular Probability Distribution

If the limits can be determined but there is no
knowledge of the behavior within the limits, and the
value of the measurand is equally likely to lie
anywhere within the limits, then the distribution
of uncertainty is assumed to be a Rectangular
Distribution.
Triangular Probability Distribution

When it is known that the values are
more likely to be near the centre of the
distribution than at the extreme limits,
the Triangular Distribution is used.
U-Shaped probability Distribution

When the values at the extreme limits are most
likely to occur and the values near the mean are least
likely, the U-shaped distribution is used.
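The extracted slides do not show the standard-uncertainty formula for each of these three shapes. The commonly quoted results for symmetric limits ±a are a/√3 (rectangular), a/√6 (triangular), and a/√2 (U-shaped); they are stated here as standard results, not as text recovered from the slides. The sketch below checks them against simulated data. The half-width, sample count, seed, and function names are all illustrative assumptions.

```python
import math
import random

def u_rectangular(a):
    """Standard uncertainty for a rectangular distribution with limits ±a."""
    return a / math.sqrt(3)

def u_triangular(a):
    """Standard uncertainty for a symmetric triangular distribution with limits ±a."""
    return a / math.sqrt(6)

def u_shaped(a):
    """Standard uncertainty for a U-shaped (arcsine) distribution with limits ±a."""
    return a / math.sqrt(2)

def spread(values):
    """Standard deviation of simulated values about their own mean."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

random.seed(1)   # fixed seed so the sketch is repeatable
a = 1.0          # half-width of the limits (assumed value)
N = 100_000      # number of simulated values per distribution (assumed value)

rect = [random.uniform(-a, a) for _ in range(N)]
tri = [random.triangular(-a, a) for _ in range(N)]  # peak at the midpoint
# A sine of a uniformly distributed phase piles up near ±a: a U-shaped density.
ushape = [a * math.sin(random.uniform(0, 2 * math.pi)) for _ in range(N)]

print(f"rectangular: formula {u_rectangular(a):.4f}, simulated {spread(rect):.4f}")
print(f"triangular:  formula {u_triangular(a):.4f}, simulated {spread(tri):.4f}")
print(f"U-shaped:    formula {u_shaped(a):.4f}, simulated {spread(ushape):.4f}")
```

For the same limits, the more the probability is concentrated near the centre, the smaller the standard uncertainty: triangular is smallest, rectangular is in between, and U-shaped is largest.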
Trapezoidal Distribution Function
[Figure: trapezoidal distribution, horizontal axis marked μ–a, μ, μ+a]

In some cases, values near the midpoint are more
likely than values near the bounds. The distribution
then has equal sloping sides, a base of width 2a, and
a top of width 2b, where β = b/a and 0 ≤ β ≤ 1.
The standard uncertainty for this distribution is

$u(x_i) = \sqrt{a^2 (1 + \beta^2) / 6}$

Depending on the value of β = b/a, the rectangular
and triangular distributions are special cases of
the trapezoidal distribution:

For β = 1, it is the Rectangular Distribution

For β = 0, it is the Triangular Distribution
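The two special cases can be checked numerically from the trapezoidal formula u(xi) = √[a²(1+β²)/6]. This is a minimal sketch; the function name `u_trapezoidal` and the half-width value are illustrative, and a/√3 and a/√6 are the usual rectangular and triangular results for limits ±a.

```python
import math

def u_trapezoidal(a, beta):
    """Standard uncertainty of a symmetric trapezoidal distribution.

    a    : half-width of the base (limits are ±a)
    beta : ratio b/a of top half-width to base half-width, 0 <= beta <= 1
    """
    if a < 0 or not 0 <= beta <= 1:
        raise ValueError("require a >= 0 and 0 <= beta <= 1")
    return math.sqrt(a ** 2 * (1 + beta ** 2) / 6)

a = 0.5  # half-width of the limits (assumed value for illustration)

# beta = 1: the top is as wide as the base, so the shape is rectangular.
print(f"beta = 1.0: {u_trapezoidal(a, 1.0):.4f}  vs a/sqrt(3) = {a / math.sqrt(3):.4f}")

# beta = 0: the top shrinks to a point, so the shape is triangular.
print(f"beta = 0.0: {u_trapezoidal(a, 0.0):.4f}  vs a/sqrt(6) = {a / math.sqrt(6):.4f}")

# An intermediate shape lies between the two special cases.
print(f"beta = 0.5: {u_trapezoidal(a, 0.5):.4f}")
```

Because (1+β²) grows with β, the standard uncertainty rises smoothly from the triangular value at β = 0 to the rectangular value at β = 1.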
All the formulae given above for the various
distributions apply when the limits are
symmetric (±a). When the bounds are asymmetric, it
may be appropriate to apply a correction to the
estimate and calculate new symmetrical
bounds.
Thanks
