Week5 PDF

MECHANICAL
ENGINEERING SYSTEMS
LABORATORY
Group 02
Asst. Prof. Dr. E. İlhan KONUKSEVEN

STATISTICAL TREATMENT OF
EXPERIMENTAL DATA
DISCRETE FREQUENCY DISTRIBUTIONS
Assume that a total of n = 10 measurements, xi (i = 1, …, 10), are made as:
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
14 16 13 19 18 14 14 15 18 15
Note that the span of the measurements is 6, ranging from 13 to 19.
FREQUENCY ( nj )
IS THE NUMBER OF OCCURRENCES OF THE jth MEASUREMENT VALUE
In this example, frequencies are:
j 1 2 3 4 5 6 7
value 13 14 15 16 17 18 19
nj 1 3 2 1 0 2 1
RELATIVE FREQUENCY fj
IS THE NUMBER OF OCCURRENCES OF THE jth VALUE RELATIVE TO THE TOTAL NUMBER OF OCCURRENCES
$f_j = \frac{n_j}{n}$, with $\sum_{j=1}^{m} f_j = 1$ and $\sum_{j=1}^{m} n_j = n$
THERE ARE 7 GROUPS, i.e. m = 7
j      1    2    3    4    5    6    7
value  13   14   15   16   17   18   19
nj     1    3    2    1    0    2    1
fj     0.1  0.3  0.2  0.1  0.0  0.2  0.1
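The frequency table can be reproduced with a few lines of Python. This is only a minimal sketch: the variable names (`values`, `n_j`, `f_j`) are chosen here for illustration, and the groups are assumed to be every integer value across the span, so the empty group at 17 is kept.

```python
from collections import Counter

# The ten measurements from the example above.
x = [14, 16, 13, 19, 18, 14, 14, 15, 18, 15]
n = len(x)

counts = Counter(x)

# Assumption: one group per integer value across the span (13..19),
# so that groups with no occurrences (here 17) still appear.
values = list(range(min(x), max(x) + 1))
m = len(values)

n_j = [counts[v] for v in values]  # Counter gives 0 for absent values
f_j = [c / n for c in n_j]         # relative frequency f_j = n_j / n

for j, (v, c, f) in enumerate(zip(values, n_j, f_j), start=1):
    print(f"j={j}  value={v}  nj={c}  fj={f:.1f}")
```

The two sums in the definition serve as a quick check: the n_j add up to n and the f_j add up to 1.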
Frequency Graph: These measurements may be shown graphically on a histogram called a “Frequency Graph” as follows:
[Figure: frequency graph. Horizontal axis: measured value, 13 to 19. Left vertical axis: frequency nj, 0 to 4. Right vertical axis: relative frequency fj, 0.0 to 0.4. Each measurement x1, …, x10 is stacked above its value.]
MEASURES OF CENTRAL TENDENCY
ARITHMETIC MEAN (Average), $\bar{x}$
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
IT PROVIDES THE BEST ESTIMATE OF AN UNBIASED DISTRIBUTION OF DATA
$\bar{x}$ is the most commonly used measure of central tendency because it usually provides the “best estimate” of the most typical value in the distribution of data.
$\bar{x} = 15.6$ for the last example
BIAS:
In statistics, bias is systematic favoritism (a tendency to make systematic errors) present in the data collection, analysis or reporting of quantitative research.
MEDIAN
IT IS THE VALUE AT THE MIDDLE POSITION OF A DISTRIBUTION OF DATA
IT IS USUALLY USED WHEN THE DISTRIBUTION IS BIASED
The median is the middle value of the data arranged in ascending order. When the number of values is even, the median is the average of the two middle values.
13, 14, 14, 14, 15, 15, 16, 18, 18, 19
(It is 15 for the last example, the average of the fifth and sixth values, 15 and 15)
MODE
IT IS THE VALUE HAVING THE HIGHEST FREQUENCY IN THE SAMPLE DISTRIBUTION
( It is not very meaningful unless n is very large )
(It is 14 for the last example)
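The three measures above can be checked against the example data with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative only.

```python
import statistics

# The ten measurements from the example above.
x = [14, 16, 13, 19, 18, 14, 14, 15, 18, 15]

mean = statistics.mean(x)      # arithmetic mean: sum(x) / n
median = statistics.median(x)  # even n: average of the two middle values
mode = statistics.mode(x)      # value with the highest frequency

print(f"mean   = {mean:.1f}")    # 15.6
print(f"median = {median:.1f}")  # 15.0
print(f"mode   = {mode}")        # 14
```

Here the mean lies above the median and the mode because the two readings of 18 and the single reading of 19 pull it upward.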

GEOMETRIC MEAN (Log - Mean)
$x_g = \left( \prod_{i=1}^{n} x_i \right)^{1/n}$
$\log(x_g) = \frac{1}{n} \sum_{i=1}^{n} \log(x_i)$
IT IS IMPORTANT WHEN DEALING WITH RATIOS OR PERCENTAGES
(It is 15.5 for the last example)

HARMONIC MEAN
$x_h = \frac{n}{\sum_{i=1}^{n} (1 / x_i)}$
(It is 15.4 for the last example)

QUADRATIC MEAN
(ROOT - MEAN - SQUARE )
$x_{rms} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2}$
It can be considered as the second moment of a set of data about its origin. (It is 15.7 for the last example)
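The geometric, harmonic and quadratic means quoted above can be verified as follows. This is a sketch only: `geometric_mean` and `harmonic_mean` come from Python's standard `statistics` module (Python 3.8 or later for the former), while the RMS is computed directly from its definition.

```python
import math
import statistics

# The ten measurements from the example above.
x = [14, 16, 13, 19, 18, 14, 14, 15, 18, 15]
n = len(x)

x_mean = statistics.mean(x)
x_g = statistics.geometric_mean(x)            # (product of x_i)^(1/n)
x_h = statistics.harmonic_mean(x)             # n / sum(1 / x_i)
x_rms = math.sqrt(sum(v * v for v in x) / n)  # sqrt(mean of x_i^2)

# The log form gives the same geometric mean.
x_g_log = math.exp(sum(math.log(v) for v in x) / n)

print(f"harmonic   = {x_h:.1f}")     # 15.4
print(f"geometric  = {x_g:.1f}")     # 15.5
print(f"arithmetic = {x_mean:.1f}")  # 15.6
print(f"rms        = {x_rms:.1f}")   # 15.7
```

For positive data the four means are always ordered harmonic ≤ geometric ≤ arithmetic ≤ quadratic, which matches the values 15.4, 15.5, 15.6 and 15.7.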
MEASURES OF DISPERSION OF DATA
VARIANCE
(MEAN SQUARE DEVIATION )
$VAR = \sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$
It is 3.84 for the last example

STANDARD DEVIATION
$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2} = \sqrt{\overline{(x_i^2)} - (\bar{x})^2}$
It is 1.96 for the last example

RANGE
IT IS THE DIFFERENCE BETWEEN THE LARGEST AND SMALLEST VALUES OF THE ENTIRE SET OF DATA
(It is 6 for the last example)

AVERAGE DEVIATION
$A.D. = \frac{1}{n} \sum_{i=1}^{n} |x_i - \bar{x}|$
It is 1.72 for the last example
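The dispersion measures can be checked numerically in the same way. In this sketch, `pvariance` and `pstdev` from the standard `statistics` module are used because they divide by n, as in the definitions above; the range and average deviation are written out directly.

```python
import math
import statistics

# The ten measurements from the example above.
x = [14, 16, 13, 19, 18, 14, 14, 15, 18, 15]
n = len(x)
x_mean = statistics.mean(x)

var = statistics.pvariance(x)  # (1/n) * sum((x_i - mean)^2)
sigma = statistics.pstdev(x)   # square root of the variance
rng = max(x) - min(x)          # largest minus smallest value
avg_dev = sum(abs(v - x_mean) for v in x) / n

# Equivalent form: mean of the squares minus the square of the mean.
sigma_alt = math.sqrt(sum(v * v for v in x) / n - x_mean ** 2)

print(f"variance = {var:.2f}")      # 3.84
print(f"sigma    = {sigma:.2f}")    # 1.96
print(f"range    = {rng}")          # 6
print(f"A.D.     = {avg_dev:.2f}")  # 1.72
```

The average deviation is smaller than the standard deviation here because squaring gives the larger deviations (13 and 19) more weight.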

UNBIASED ESTIMATES
If a “random sample” (x1, x2, … , xn) is drawn from a “population” (or “universe”) with mean μ and standard deviation σ, the following estimates apply.
A) THE SAMPLE MEAN $\bar{x}$ IS THE BEST AVAILABLE ESTIMATE OF THE UNKNOWN MEAN μ OF THE UNIVERSE
B) THE BEST AVAILABLE ESTIMATE OF THE UNKNOWN STANDARD DEVIATION σ OF THE UNIVERSE IS GIVEN BY
$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2} = \sqrt{\frac{n}{n-1} \left[ \overline{(x_i^2)} - (\bar{x})^2 \right]}$

THE USE OF THIS EXPRESSION BECOMES IMPORTANT ESPECIALLY WHEN n IS SMALL
FOR LARGE VALUES OF n, $s \approx \sigma_{sample}$
HOWEVER, $s > \sigma_{sample}$ ALWAYS
(For the last example, s = 2.07)
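The difference between the two standard deviations can be seen numerically. In this sketch, `statistics.stdev` divides by n − 1 and gives s, while `statistics.pstdev` divides by n and gives the sample value σ defined earlier; the names `s` and `sigma_sample` are illustrative.

```python
import math
import statistics

# The ten measurements from the example above.
x = [14, 16, 13, 19, 18, 14, 14, 15, 18, 15]
n = len(x)

sigma_sample = statistics.pstdev(x)  # divides by n
s = statistics.stdev(x)              # divides by n - 1

# The two are related by s = sigma_sample * sqrt(n / (n - 1)).
ratio = math.sqrt(n / (n - 1))

print(f"sigma_sample = {sigma_sample:.2f}")  # 1.96
print(f"s            = {s:.2f}")             # 2.07
print(f"s / sigma    = {s / sigma_sample:.4f}")
print(f"sqrt(n/(n-1)) = {ratio:.4f}")
```

Because the ratio sqrt(n / (n − 1)) exceeds 1 for every finite n and tends to 1 as n grows, s is larger than the sample value but the two converge for large samples.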

C) IF MORE THAN ONE ( SAY m ) EQUAL-SIZED RANDOM SAMPLES ARE DRAWN FROM THE SAME UNIVERSE, THEN THEIR RESPECTIVE MEANS AND STANDARD DEVIATIONS ARE EXPECTED TO BE APPROXIMATELY EQUAL TO EACH OTHER
$\bar{x}_1 \approx \bar{x}_2 \approx \dots \approx \bar{x}_m$
$s_1 \approx s_2 \approx \dots \approx s_m$
[Figure: Sample 1, Sample 2, …, Sample m drawn from the same population or universe]
It is also possible to treat $\bar{x}_j$ and $s_j$ as statistical quantities and define their standard deviations.
STANDARD ERROR OF THE MEAN
$s_{\bar{x}} = \frac{s}{\sqrt{n}}$
THIS QUANTITY REPRESENTS THE STANDARD DEVIATION OF $\bar{x}$ FROM μ
( For the last example, $s_{\bar{x}} = 0.653$; using the rounded value s = 2.07 gives 0.655 )

STANDARD ERROR OF THE STANDARD DEVIATION
$s_s = \frac{s}{\sqrt{2n}} = \frac{s_{\bar{x}}}{\sqrt{2}}$
THIS QUANTITY REPRESENTS THE STANDARD DEVIATION OF s FROM σ
( For the last example, $s_s = 0.462$; using the rounded value s = 2.07 gives 0.463 )
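Both standard errors follow directly from s and n. The sketch below evaluates them for the example data without intermediate rounding, which is why the third decimal can differ slightly from values obtained by first rounding s to 2.07; the names `s_mean` and `s_s` are illustrative.

```python
import math
import statistics

# The ten measurements from the example above.
x = [14, 16, 13, 19, 18, 14, 14, 15, 18, 15]
n = len(x)

s = statistics.stdev(x)     # unbiased estimate, divides by n - 1

s_mean = s / math.sqrt(n)   # standard error of the mean
s_s = s / math.sqrt(2 * n)  # standard error of the standard deviation

print(f"s      = {s:.3f}")
print(f"s_mean = {s_mean:.3f}")
print(f"s_s    = {s_s:.3f}")

# The same quantities computed from the rounded value s = 2.07.
print(f"from rounded s: {2.07 / math.sqrt(n):.3f}, {2.07 / math.sqrt(2 * n):.3f}")
```

Both errors shrink as 1/sqrt(n), so quadrupling the number of measurements only halves the uncertainty of the mean.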

CONTINUOUS DISTRIBUTIONS
IN ACTUAL EXPERIMENTS VALUES WILL BE LESS DISCRETE: 23.26 , 25.12 , etc
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
14 16 13 19 18 14 14 15 18 15
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
14.21 16.36 13.16 18.74 17.59 14.43 14.02 14.77 18.01 15.16
IF WE HAD A SET OF 100 DATA VALUES SUCH AS 23.26 , 25.12 ... , etc THEN THE FREQUENCY GRAPH WOULD PROBABLY HAVE VERY FEW VALUES THAT WERE THE SAME
[Figure: relative frequency fj, 0.0 to 0.2, versus measured value, 13 to 19, with each reading plotted as a dot]
THE ONLY APPARENT MEANINGFUL QUANTITY APPEARS TO BE THE DENSITY OF THE “DOTS”
LET US DIVIDE THE DATA RANGE INTO INCREMENTS
NOW LET US COUNT HOW MANY DATA POINTS ARE BETWEEN 22.51 AND 23.50
If all intervals of interest are plotted, the result would be a bar graph as:
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
14 16 13 19 18 14 14 15 18 15
[Figure: frequency graph of the ten integer readings. Horizontal axis: measured value, 13 to 19. Vertical axes: frequency nj, 0 to 4, and relative frequency fj, 0.0 to 0.4.]
IF MORE MEASUREMENTS WITH A MORE ACCURATE DEVICE WERE TAKEN
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
14.21 16.36 13.16 18.74 17.59 14.43 14.02 14.77 18.01 15.16
[Figure: relative frequency fj, 0.0 to 0.2, versus measured value, 13 to 19]
AND IF THE NUMBER OF DATA POINTS WERE INCREASED
[Figure: relative frequency fj, 0.00 to 0.10, versus measured value, 13 to 19]
When all intervals of interest are plotted, the result would be a bar graph as:
[Figure: bar graph of relative frequency fj, 0.00 to 0.08, versus measured value, 13 to 19, with the envelope of the bars marked]
THE INTERVAL MUST BE CHOSEN
* LARGE ENOUGH TO BE MEANINGFUL
* SMALL ENOUGH TO GIVE DETAIL
N = 5 log n for large n
N = 1 + 3.3 log n for n < 25 (Sturges' rule)
where n is the number of data points, N is the suggested number of class intervals, and log is the base-10 logarithm.
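The interval rules can be applied to the ten decimal readings given earlier. The sketch below makes several assumptions that the slides do not state: n = 25 is used as the switch between the two formulas, N is rounded to the nearest integer, and the intervals are of equal width between the smallest and largest reading. The function name `suggested_intervals` is hypothetical.

```python
import math

# The ten readings taken with the more accurate device.
x = [14.21, 16.36, 13.16, 18.74, 17.59, 14.43, 14.02, 14.77, 18.01, 15.16]
n = len(x)


def suggested_intervals(n):
    """Suggested number of class intervals (assumed switch at n = 25)."""
    if n < 25:
        return 1 + 3.3 * math.log10(n)  # Sturges' rule
    return 5 * math.log10(n)            # rule quoted for large n


# Assumption: round the suggestion to the nearest whole interval.
N = round(suggested_intervals(n))

# Assumption: equal-width intervals spanning min(x) .. max(x).
lo, hi = min(x), max(x)
width = (hi - lo) / N

n_j = [0] * N
for v in x:
    j = min(int((v - lo) / width), N - 1)  # put the maximum in the last bin
    n_j[j] += 1

f_j = [c / n for c in n_j]

for j in range(N):
    left, right = lo + j * width, lo + (j + 1) * width
    print(f"{left:6.3f} to {right:6.3f}: nj={n_j[j]}  fj={f_j[j]:.1f}")
```

With only ten points the rule suggests about four intervals, each roughly 1.4 units wide; a much finer division would leave most intervals with zero or one reading, which is the "density of dots" problem described above.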
