
Engineering Mathematics 2

STATISTICS
• Topic Outline
Elementary sampling theory for normal population
Central limit theorem
Statistical inference of means
Proportion and variances
Power and operating characteristics of tests
Chi-square test of goodness of fit
Simple linear regressions

Mathematics for Technology II (CT034-3-2) Statistics Slide 2 (of 59)


Learning Outcomes

At the end of this topic, you should be able to:

• Understand the concepts of sampling theory and the Central Limit Theorem.
• Perform statistical inference using estimation and hypothesis testing.



Key Terms you must be able to use

If you have mastered this topic, you should be able to use the following terms correctly in your assignments and final exam:



Central Limit Theorem

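If $X_1, X_2, X_3, \ldots, X_n, \ldots$ is a sequence of independent random variables with

\[ E(X_i) = \mu_i, \qquad \operatorname{var}(X_i) = \sigma_i^2, \qquad i = 1, 2, 3, \ldots, n, \]

and if $S_n = X_1 + X_2 + X_3 + \cdots + X_n$, then under certain general conditions $S_n$ is asymptotically normally distributed with mean and variance

\[ \mu = \sum_{i=1}^{n} \mu_i \quad \text{and} \quad \sigma^2 = \sum_{i=1}^{n} \sigma_i^2 \]

as $n$ tends to infinity.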
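
The theorem can be checked numerically. The following is a minimal simulation sketch and not part of the original slides: it assumes the X_i are independent uniform variables, 15 of them on [0, 1] and 15 on [0, 2] (an arbitrary choice for illustration), and compares the mean and variance of the simulated sums with the sums of the individual means and variances. The function name, the seed and the number of trials are illustrative.

```python
import random
import statistics

def simulate_sums(widths, n_trials, seed=0):
    """Draw n_trials values of S_n = X_1 + ... + X_n, with X_i ~ Uniform(0, widths[i])."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.uniform(0.0, w) for w in widths) for _ in range(n_trials)]

# Assumed setup: n = 30 terms, 15 uniform on [0, 1] and 15 uniform on [0, 2].
widths = [1.0] * 15 + [2.0] * 15

# Uniform(0, w) has mean w/2 and variance w^2/12.
mu = sum(w / 2 for w in widths)            # sum of the mu_i      = 22.5
sigma2 = sum(w * w / 12 for w in widths)   # sum of the sigma_i^2 = 6.25

sums = simulate_sums(widths, n_trials=20000)
sample_mean = statistics.fmean(sums)
sample_var = statistics.pvariance(sums)

# For a normal distribution about 68.3% of values lie within one sigma of the mean.
sigma = sigma2 ** 0.5
within_one_sigma = sum(abs(s - mu) <= sigma for s in sums) / len(sums)

print(f"theory : mean={mu:.2f}, variance={sigma2:.2f}")
print(f"sample : mean={sample_mean:.2f}, variance={sample_var:.2f}")
print(f"fraction within one sigma: {within_one_sigma:.3f}")
```

The sample figures should land close to the theoretical ones, and the fraction within one standard deviation should be close to the normal value, even though no single X_i is normally distributed.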



Sampling Theory: Population or Universe

An aggregate of objects (animate or inanimate) under study is called a population or universe. It is thus a collection of individuals, or of their attributes, or of results of operations which can be numerically specified.

A universe containing a finite number of individuals or members is called a finite universe. For example, the universe of the weights of the students in a particular class.
Sampling Theory

A universe of concrete objects is an existent universe.

The collection of all possible ways in which a specified event can happen is called a hypothetical universe.

For example, the universe of heads and tails obtained by tossing a coin an infinite number of times is a hypothetical universe.



Sampling Theory

The statistician is often confronted with the problem of discussing a universe of which he cannot examine every member, i.e. of which complete enumeration is impracticable.

For example, if we want an idea of the average per capita income of the people of India, enumerating every earning individual in the country is a very difficult task.



Sampling Theory

Naturally the question arises: what can be said about a universe of which we can examine only a limited number of members?

This question is the origin of the theory of sampling.



Sampling Theory

Sample: A finite subset of the universe is called a sample. A sample is thus a small portion of the universe.

Sample size: The number of individuals in a sample is called the sample size.

Sampling: The process of selecting a sample from a universe is called sampling.



Sampling Theory

The theory of sampling is the study of the relationship between a population and the samples drawn from it.

The fundamental object of sampling is to get as much information as possible about the whole universe by examining only a part of it.

An attempt is thus made through sampling to give the maximum information about the parent universe with the minimum effort.
Sampling Theory

Sampling is quite often used in our daily life.

For example, in a shop we assess the quality of sugar, rice or any other product by taking only a small amount from the bag, and then decide whether to purchase it or not.



Sampling Theory

A housewife normally tests cooked food by taking a spoonful of it, to find out whether it is properly cooked and contains the proper quantity of salt or sugar.



Random Sampling

The selection of an individual from the universe in such a way that each individual of the universe has the same chance of being selected is called random sampling.

A sample obtained by random sampling is called a random sample.

The simplest method normally used for random sampling is the lottery system.
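
A lottery draw can be imitated in a few lines. The following is a minimal sketch under assumed names: the universe of roll numbers 1 to 50, the sample size and the seed are all hypothetical, and the standard-library `random.Random.sample` call stands in for drawing tickets from a box.

```python
import random

def lottery_sample(universe, sample_size, seed=None):
    """Draw a random sample without replacement; every member is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(universe, sample_size)

# Hypothetical universe: roll numbers 1..50 of the students in a class.
universe = list(range(1, 51))
sample = lottery_sample(universe, sample_size=5, seed=42)
print(sample)
```

Passing a seed only makes the draw repeatable for checking; leaving it as `None` gives a fresh draw each time.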



Random Sampling

Sampling of attributes: The sampling of attributes may be regarded as the drawing of samples from a universe whose members possess either the attribute A or the attribute B (not A).

The universe is thus divided into two mutually exclusive and collectively exhaustive classes: one class possessing the attribute A and the other class not possessing the attribute A.
Random Sampling

The presence of a particular attribute in a sampled unit may be termed a success and its absence a failure.



Random Sampling

By simple sampling we mean random sampling in which each event has the same probability of success and the probability of success in any trial is independent of the success or failure of events in the preceding trials.

Thus simple sampling is a special case of random sampling in which the trials are independent and the probability of success is constant.
Random Sampling

For example, counting the number of successes in throwing a die or tossing a coin is a case of simple sampling, since the probability of getting heads with a coin is unaffected by the previous trials and remains constant irrespective of the number of trials made, provided the coin remains unbiased.



Random Sampling

Sampling, though random, need not be simple.

Random sampling from an infinite universe is always simple, but random sampling from a finite universe may or may not be simple, according as the members drawn are replaced or not.
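
A small worked case shows why replacement matters. The sketch below assumes a hypothetical finite universe of 10 members, 3 of which possess attribute A, and computes the probability of a success on the second draw given a success on the first. The function name is illustrative.

```python
from fractions import Fraction

def p_second_success(n_success, n_total, replace):
    """P(second draw is a success | first draw was a success)."""
    if replace:
        # The member drawn first is put back, so the universe is unchanged.
        return Fraction(n_success, n_total)
    # Without replacement one success has left the universe.
    return Fraction(n_success - 1, n_total - 1)

# Hypothetical finite universe: 10 members, 3 of which possess attribute A.
p_first = Fraction(3, 10)
print(p_first, p_second_success(3, 10, replace=True))   # 3/10 3/10
print(p_first, p_second_success(3, 10, replace=False))  # 3/10 2/9
```

With replacement the probability stays at 3/10, so the sampling is simple. Without replacement it changes to 2/9, so the trials are no longer independent with constant probability.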



Random Sampling

The conditions of simple sampling, viz. constant probability p and independent events, satisfy the basic assumptions of the Binomial Distribution.

The binomial probability distribution thus determined is called the sampling distribution of the number of successes in the sample.
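
As an illustration, the sampling distribution of the number of successes can be tabulated from the standard binomial formula P(k) = C(n, k) p^k (1 - p)^(n - k). The sketch below is not from the slides; the function names and the example of 4 coin tosses are assumptions made for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def sampling_distribution(n, p):
    """Probabilities of 0, 1, ..., n successes in a simple sample of size n."""
    return [binomial_pmf(k, n, p) for k in range(n + 1)]

# Hypothetical example: 4 tosses of an unbiased coin, success = heads.
dist = sampling_distribution(4, 0.5)
print(dist)  # [0.0625, 0.25, 0.375, 0.25, 0.0625]
```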
Student’s ‘t’ test

Gosset, who wrote under the pen name "Student", derived a theoretical distribution which has come to be known as Student’s ‘t’ distribution.

The quantity ‘t’ is defined as

\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]

where
$\bar{x}$ is the sample mean,
$\mu$ is the population mean,
$s$ is the sample standard deviation,
$n$ is the number of observations in the sample.
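
The calculation of t can be sketched as follows. This is an illustrative example, not taken from the slides: the data and the hypothesised mean are made up, and the sample standard deviation is assumed to use the n - 1 divisor (as `statistics.stdev` does).

```python
import math
import statistics

def t_statistic(sample, mu):
    """t = (sample mean - mu) / (s / sqrt(n)), with s using the n - 1 divisor."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)   # sample standard deviation (divisor n - 1)
    return (x_bar - mu) / (s / math.sqrt(n))

# Hypothetical data: five observations, hypothesised population mean 10.
sample = [10, 12, 9, 11, 13]
t = t_statistic(sample, mu=10)
print(f"t = {t:.3f} on {len(sample) - 1} degrees of freedom")  # t = 1.414 on 4 degrees of freedom
```

The calculated value is then compared with the table value of t for n - 1 degrees of freedom at the chosen level of significance.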



Application of Student’s ‘t’ test

To test the significance of a sample mean.

To test the significance of the difference between two sample means.

To test the significance of the coefficient of correlation.



SNEDECOR’S F-test for Equality of Population Variances

Let $x_i$ ($i = 1, 2, \ldots, n_1$) and $y_j$ ($j = 1, 2, \ldots, n_2$) be the values of two independent random samples drawn from normal populations with the same variance.

Let $\bar{x}$ and $\bar{y}$ be the sample means, and let

\[ S_x^2 = \frac{1}{n_1 - 1} \sum_{i=1}^{n_1} (x_i - \bar{x})^2, \qquad S_y^2 = \frac{1}{n_2 - 1} \sum_{j=1}^{n_2} (y_j - \bar{y})^2 . \]

Then

\[ F = \frac{S_x^2}{S_y^2} \quad \text{if } S_x^2 > S_y^2, \qquad \text{or} \qquad F = \frac{S_y^2}{S_x^2} \quad \text{if } S_y^2 > S_x^2 . \]
SNEDECOR’S F-test for Equality of Population Variances

Taking the hypothesis that the two samples have been drawn from normal populations with the same variance, we compare the calculated value of F with its table value.

If the calculated value of F > the table value of F at the 5% level of significance, the ratio is significant and the hypothesis may be rejected.

If the calculated value of F < the table value of F at the 5% level of significance, the hypothesis may be accepted.
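
The procedure can be sketched in code. This is a minimal illustration with made-up samples, not the slides' own example. The helper names are hypothetical, and the table value must be looked up for the relevant degrees of freedom and supplied by the user; 6.39 is the 5% table value of F for (4, 4) degrees of freedom.

```python
import statistics

def f_ratio(x, y):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    s2_x = statistics.variance(x)   # divisor n1 - 1
    s2_y = statistics.variance(y)   # divisor n2 - 1
    if s2_x >= s2_y:
        return s2_x / s2_y, len(x) - 1, len(y) - 1
    return s2_y / s2_x, len(y) - 1, len(x) - 1

def reject_equal_variances(f_value, table_value):
    """Decision rule from the slide: reject when calculated F exceeds the table value."""
    return f_value > table_value

# Hypothetical samples.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
f_value, df1, df2 = f_ratio(x, y)
print(f_value, df1, df2)   # 4.0 4 4

# 6.39 is the 5% table value of F for (4, 4) degrees of freedom.
print(reject_equal_variances(f_value, table_value=6.39))   # False
```

Here the calculated F of 4.0 is below the table value, so the hypothesis of equal variances is not rejected for these samples.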
Chi Square Test

When a coin is tossed 200 times, theoretical considerations lead us to expect 100 heads and 100 tails. In practice these results are rarely achieved exactly.

The quantity $\chi^2$ ($\chi$ is the Greek letter chi, so the quantity is read "chi-square") describes the magnitude of the discrepancy between theory and observation:

\[ \chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} \]

where
$O_i$ ($i = 1, 2, \ldots, n$) is the observed frequency,
$E_i$ ($i = 1, 2, \ldots, n$) is the expected frequency.
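
For the coin example the statistic can be computed directly. The sketch below assumes a hypothetical outcome of 115 heads and 85 tails in 200 tosses; the observed counts and the function name are illustrative and not taken from the slides.

```python
def chi_square(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical outcome of 200 tosses: 115 heads and 85 tails.
observed = [115, 85]
expected = [100, 100]
print(chi_square(observed, expected))  # 4.5
```

With two classes there is one degree of freedom, and the 5% table value of chi-square for one degree of freedom is 3.841, so a value of 4.5 would be judged significant at that level.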



Chi Square Test

Degrees of freedom = n - 1

The chi-square test is one of the simplest and most general tests known.

It is applicable to a very large number of problems in practice, which can be summed up under the following heads:
Chi-square test of goodness of fit.
Chi-square test of independence of attributes.



Chi Square Test

Conditions for applying the chi-square test:

The following conditions should be satisfied before the chi-square test is applied.

N, the total frequency, should be large.

It is difficult to say what constitutes largeness, but as an arbitrary figure we may say that N should be at least 50, however few the cells.



Chi Square Test

No theoretical cell frequency should be small.

5 should be regarded as the very minimum, and 10 is better.

It is important to remember that the number of degrees of freedom is determined with the number of classes after regrouping.
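
The slides mention regrouping without giving a procedure. One common approach, assumed here, is to pool adjacent classes until every expected frequency reaches the minimum. The sketch below is only an illustration: the function name, the left-to-right pooling order and the data are hypothetical.

```python
def pool_small_classes(observed, expected, minimum=5.0):
    """Pool adjacent classes, left to right, until each expected frequency >= minimum."""
    pooled_o, pooled_e = [], []
    acc_o = acc_e = 0.0
    for o, e in zip(observed, expected):
        acc_o += o
        acc_e += e
        if acc_e >= minimum:
            pooled_o.append(acc_o)
            pooled_e.append(acc_e)
            acc_o = acc_e = 0.0
    if acc_e > 0:
        # A leftover tail that is still too small joins the last pooled class.
        if pooled_e:
            pooled_o[-1] += acc_o
            pooled_e[-1] += acc_e
        else:
            pooled_o.append(acc_o)
            pooled_e.append(acc_e)
    return pooled_o, pooled_e

# Hypothetical frequencies: the first and the last two expected values are below 5.
observed = [2, 8, 20, 15, 4, 1]
expected = [3, 9, 18, 14, 4, 2]
pooled_o, pooled_e = pool_small_classes(observed, expected)
print(pooled_o, pooled_e)   # [10.0, 20.0, 15.0, 5.0] [12.0, 18.0, 14.0, 6.0]
print(len(pooled_e) - 1)    # 3
```

Six classes become four after regrouping, so the degrees of freedom are counted as 4 - 1 = 3 rather than 6 - 1 = 5.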



Regression

Regression is the estimation or prediction of unknown values of one variable from known values of another variable.

After establishing the fact of correlation between two variables, it is natural to want to know the extent to which one variable varies in response to a given variation in the other variable.



Regression

Regression also tells us about the nature of the relationship between the two variables.

Regression measures the nature and extent of correlation.



Linear Regression

If two variables x and y are correlated, i.e. there exists an association or relationship between them, then the points of the scatter diagram will be more or less concentrated round a curve.

This curve is called the curve of regression. The relationship is said to be expressed by means of curvilinear regression.



Linear Regression

When the curve is a straight line, it is called a line of regression and the regression is said to be linear.

A line of regression is the straight line which gives the best fit, in the least-squares sense, to the given data.



Regression Lines

If the line of regression is so chosen that the sum of squares of deviations parallel to the axis of y is minimized, it is called the line of regression of y on x.

It gives the best estimate of y for any given value of x.



Regression Lines

If the line of regression is so chosen that the sum of squares of deviations parallel to the axis of x is minimized, it is called the line of regression of x on y.

It gives the best estimate of x for any given value of y.



Lines of Regression

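Y on X

\[ y - \bar{y} = b_{yx}\,(x - \bar{x}), \qquad b_{yx} = r\,\frac{\sigma_y}{\sigma_x} \]

X on Y

\[ x - \bar{x} = b_{xy}\,(y - \bar{y}), \qquad b_{xy} = r\,\frac{\sigma_x}{\sigma_y} \]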
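
The two lines can be computed from paired data as sketched below. This is an illustrative example with made-up observations: the function names are hypothetical, and the standard deviations are assumed to be population values (divisor n), which is consistent with computing r from the covariance with the same divisor.

```python
import math
import statistics

def regression_lines(x, y):
    """Return (x_bar, y_bar, r, b_yx, b_xy) for paired data x, y."""
    n = len(x)
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sigma_x, sigma_y = statistics.pstdev(x), statistics.pstdev(y)
    covariance = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / n
    r = covariance / (sigma_x * sigma_y)
    b_yx = r * sigma_y / sigma_x   # slope of the line of y on x
    b_xy = r * sigma_x / sigma_y   # slope of the line of x on y
    return x_bar, y_bar, r, b_yx, b_xy

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
x_bar, y_bar, r, b_yx, b_xy = regression_lines(x, y)

def estimate_y(x_value):
    """Line of y on x: y - y_bar = b_yx (x - x_bar)."""
    return y_bar + b_yx * (x_value - x_bar)

def estimate_x(y_value):
    """Line of x on y: x - x_bar = b_xy (y - y_bar)."""
    return x_bar + b_xy * (y_value - y_bar)

print(f"r = {r:.4f}, b_yx = {b_yx:.2f}, b_xy = {b_xy:.2f}")  # r = 0.7746, b_yx = 0.60, b_xy = 1.00
print(f"estimate of y at x = 6: {estimate_y(6):.2f}")         # estimate of y at x = 6: 5.80
```

Both lines pass through the point of means (3, 4) for this data, and the product of the two regression coefficients equals r squared.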



Regression Lines

If r = 0, the two lines of regression become

\[ y = \bar{y}, \qquad x = \bar{x} . \]

These two straight lines are parallel to the X and Y axes respectively and pass through the means. They are mutually perpendicular.

If r = ±1, the two lines of regression coincide.



Properties of Regression coefficients

The correlation coefficient is the geometric mean between the regression coefficients.

If one of the regression coefficients is greater than unity, the other must be less than unity.

The arithmetic mean of the regression coefficients is at least as large in magnitude as the correlation coefficient.



Properties of Regression coefficients

Regression coefficients are independent of change of origin but not of scale.

The correlation coefficient and the two regression coefficients have the same sign.
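
Several of these properties follow directly from the definitions $b_{yx} = r\,\sigma_y/\sigma_x$ and $b_{xy} = r\,\sigma_x/\sigma_y$. A short derivation, added here as a sketch for illustration:

```latex
\[
b_{yx}\, b_{xy}
  = \left(r\,\frac{\sigma_y}{\sigma_x}\right)\left(r\,\frac{\sigma_x}{\sigma_y}\right)
  = r^2
\quad\Longrightarrow\quad
r = \pm\sqrt{b_{yx}\, b_{xy}} .
\]
```

Because $\sigma_x$ and $\sigma_y$ are positive, $b_{yx}$ and $b_{xy}$ each take the sign of $r$, and that sign is the one chosen for the square root. Since $r^2 \le 1$, the product $b_{yx}\, b_{xy}$ cannot exceed 1, so the two coefficients cannot both be greater than unity in magnitude.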



Angle between two Regression lines

The angle θ between the two regression lines is given by

\[ \tan\theta = \left(\frac{1 - r^2}{r}\right)\left(\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}\right) \]
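
The formula can be cross-checked against the slopes of the two lines. The sketch below is illustrative only: the function names and the input values are assumptions, and the slope-based version uses the usual expression (m2 - m1) / (1 + m1 m2) for the angle between two straight lines.

```python
import math

def tan_angle(r, sigma_x, sigma_y):
    """tan(theta) = ((1 - r^2) / r) * sigma_x sigma_y / (sigma_x^2 + sigma_y^2)."""
    if r == 0:
        return math.inf   # the lines are perpendicular
    return (1 - r * r) / r * (sigma_x * sigma_y) / (sigma_x**2 + sigma_y**2)

def tan_angle_from_slopes(r, sigma_x, sigma_y):
    """Same angle from the slopes of the two lines in the (x, y) plane; needs r != 0."""
    m_yx = r * sigma_y / sigma_x     # slope of the line of y on x
    m_xy = sigma_y / (r * sigma_x)   # slope of the line of x on y, i.e. 1 / b_xy
    return (m_xy - m_yx) / (1 + m_xy * m_yx)

# Hypothetical values chosen so that r^2 = 0.6.
r, sigma_x, sigma_y = math.sqrt(0.6), math.sqrt(2.0), math.sqrt(1.2)
print(round(tan_angle(r, sigma_x, sigma_y), 6))              # 0.25
print(round(tan_angle_from_slopes(r, sigma_x, sigma_y), 6))  # 0.25
print(tan_angle(1.0, sigma_x, sigma_y))                      # 0.0
```

The two limiting cases match the earlier statements: r = 0 gives an infinite tangent (perpendicular lines) and r = ±1 gives zero (coincident lines).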



Quick Review Question



Summary of Main Teaching Points

• Topic Outline
Central Limit theorem
Testing of hypothesis
Correlation
Regression



Question and Answer Session

Q&A



Next Session

Matrices and Determinants

