
Chapter Five

Sampling and Sampling Distribution

Introduction
The main objective of statistical analysis is to determine the actual values of different parameters
of a given population. One way of obtaining these parameters is to conduct a census.
A census means complete enumeration of the entire population and determining the value of
the parameter of interest. In most cases, however, a census is not practically feasible
because of cost, time, labor and other constraints. As an alternative to a census, one can use
a sampling approach to determine the same thing. Sampling is the process of selecting a sample
from a population. That is, random samples of a given size are taken from the population,
and the characteristics of these samples are analyzed to infer the characteristics of the
population from the samples taken.
When random samples of a certain size are repeatedly drawn from a given population and a
sample statistic is computed, the value of the sample statistic (e.g. the sample mean) will
differ from sample to sample. Since a sample statistic is based on a sample of a certain size,
it is a random variable, and each statistic follows a probability distribution of its own called a
sampling distribution.
A sampling distribution has its own properties, upon which the rules for generalizing about a
population from a sample drawn from it are based. In this chapter, we will study the
properties of some statistics in depth, along with widely used sampling distributions such
as the t, F, and χ² distributions.

Objective of the chapter


After this chapter, the student will be able to:
- be familiar with concepts such as statistic, parameter, and random variable
- define a sampling distribution
- identify the distribution of different sample statistics (sample mean, sample variance)
- describe the properties of the sampling distribution of a sample statistic
- state some of the properties of standard probability distributions: the t, F, and χ²
distributions.
5.1. Statistic, Random Variable, Parameter and Sampling Distribution
Before we start to study the properties of sampling distributions and their applications, let us
become familiar with the following concepts.

Random variable
A variable is a random variable if its value is determined by a random experiment. If a variable X
is said to be a random variable, it represents a phenomenon of interest in which the observed
outcome of an activity is determined entirely by chance. It is unpredictable and varies
depending upon the particular outcome of the experiment measured. For example, suppose you toss
a die and let X be the number observed on the upper face. The variable X can take on
any of the six values 1, 2, 3, 4, 5 and 6, depending on the random outcome of the experiment. Since
the value of X cannot be determined before the experiment, X is a random
variable. X can also be the occurrence of an event, such as the number of telephone calls received
during a given time.
Population (or universe): the aggregate of statistical data forming the subject of an
investigation.
Statistic
A statistic is a numerical descriptive measure calculated from a sample. In other words, it
represents a summary measure that describes a characteristic of a sample. In most cases,
it refers to the sample mean and the sample variance. If X1, X2, . . ., Xn is a random sample, then

X̄ = (ΣXi)/n is called the sample mean and S² = Σ(Xi − X̄)²/(n − 1) is called the sample variance.

The values of X̄ and S² represent statistics.


For example: Consider a population consisting of five observations: 3, 6, 9, 12, and 15. If a
random sample of size n = 3 is selected without replacement, find the sample mean and sample
variance (the statistics for the sample drawn).
Solution: Suppose the sample drawn from the population is 3, 6, 9. Then

X̄ = (ΣXi)/n = (3 + 6 + 9)/3 = 6

Xi      Xi − X̄      (Xi − X̄)²
3       −3          9
6       0           0
9       3           9
                    Σ(Xi − X̄)² = 18

S² = Σ(Xi − X̄)²/(n − 1) = 18/2 = 9

Values such as X̄ = 6 and S² = 9 represent statistics, summary values of the sample.
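The computation above can be checked with Python's standard statistics module, which uses the same n − 1 divisor for the sample variance; a minimal sketch:

```python
# Sketch: computing the sample mean and sample variance for the sample
# {3, 6, 9} drawn from the population {3, 6, 9, 12, 15}.
from statistics import mean, variance

sample = [3, 6, 9]

x_bar = mean(sample)    # sample mean: (3 + 6 + 9) / 3 = 6
s2 = variance(sample)   # sample variance with n - 1 divisor: 18 / 2 = 9

print(x_bar, s2)        # 6 9
```

Note that statistics.variance implements the n − 1 (sample) divisor; statistics.pvariance would divide by n instead.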

Parameter
A parameter is a numerical descriptive measure that characterizes a population; in other
words, a summary measure that describes a characteristic of the population. Since it is
determined from observations on the whole population, the values of parameters are unknown in the
case of a large population. Parameters include the population mean and variance, among others.
The mean and the variance of the above population are parameters of that population. That is,

μ = (3 + 6 + 9 + 12 + 15)/5 = 9

is a parameter of the population, its population mean. Here we can determine the
population parameter since the population under study is finite.
Sampling distribution
The sampling distribution provides the basis for determining the level of confidence or reliability
with which a particular value of a sample statistic can be used as an estimate of a
parameter. It also serves as the necessary ground for evaluating a hypothesis stated
with reference to a parameter. Both of these processes require a clear understanding of the
various sampling distributions and their properties, which define the relationships between a given
sample statistic and the corresponding population parameter. Therefore, let us first
describe what a sampling distribution means and examine the properties of different
statistical sampling distributions.
As stated in the introductory part, sampling is used as an alternative to a census to determine the
characteristics of a population. That is, a random sample of a given size is taken from the
population, and on the basis of that sample we estimate the parameters of the population. However,
when samples are drawn repeatedly from a population, a given sample may or may not be
representative. In other words, sample statistics such as the sample mean and variance
are random variables, because different samples can lead to different values of the sample
statistics.
Since each value of a statistic across different samples has its own number of occurrences
(frequency), the probability of obtaining a given value of the statistic can be determined from
these frequencies. A sample statistic together with its probabilities of occurrence
represents a sampling distribution.
Definition: The sampling distribution of a statistic is the probability distribution of the
possible values of the statistic when random samples of size n are repeatedly
drawn from the population.
Example: Consider a population consisting of N = 5 numbers: 3, 6, 9, 12, 15. If a random sample
of size n = 3 is selected without replacement, find the sampling distribution of the sample
mean, X̄.
Solution: There are 10 possible random samples of size n = 3, and each sample is equally
likely to be drawn, with probability 1/10. These samples, along with the calculated values of X̄,
are given as follows:

Sample   Sample values   X̄ (sample mean)

1        3, 6, 9         6
2        3, 6, 12        7
3        3, 6, 15        8
4        3, 9, 12        8
5        3, 9, 15        9
6        3, 12, 15       10
7        6, 9, 12        9
8        6, 9, 15        10
9        6, 12, 15       11
10       9, 12, 15       12

Sampling distribution of the sample mean:
X̄       P(X̄)
6        0.1
7        0.1
8        0.2
9        0.2
10       0.2
11       0.1
12       0.1

Exercise: Given the ages of 5 children as follows:

Child    Age
1        2
2        4
3        6
4        8
5        10
If random samples of size 2 are drawn without replacement from this population, construct the
sampling distribution of the sample mean.

5.2. Distribution of the Sample Mean


A) Sampling from Normal Distributions
Suppose X is normally distributed with mean μ and variance σ²; that is, X ~ N(μ, σ²). Let X1,
X2, . . ., Xn be a random sample from the random variable X. Then the sample mean is given as:

X̄ = (X1 + X2 + . . . + Xn)/n

This sample mean is a linear function of n normally distributed observations (where n is the
size of the sample). Hence, X̄ is also normally distributed. That is,

X̄ ~ N(μ, σ²/n)

Therefore, on average, the sample mean is equal to the population mean.

NB: As the sample size increases, the sample mean concentrates around the population mean. As n
approaches infinity, the sample mean coincides with the population mean.

Example 1: It is noted from past observation that the incomes of the households in a
certain village are approximately normally distributed with a mean weekly income of 30 Birr
and a variance of 36. Samples of 25 households are to be selected and their incomes
recorded.
a) Find the probability that the sample mean will fall within three birr of the population mean.
b) How many observations should be included in the sample if we wish the sample mean to
be within 2 Birr of the population mean with the probability of 0.95?
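A sketch of how the two parts of Example 1 can be worked out numerically, building the standard normal CDF from math.erf; the z-value 1.96 for 95% central probability is taken from the standard normal table:

```python
# Sketch of Example 1, with sigma = 6 (variance 36) and the standard
# normal CDF built from math.erf.
from math import erf, sqrt, ceil

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma, n = 30, 6, 25
se = sigma / sqrt(n)                  # standard error = 6/5 = 1.2

# a) P(|X-bar - mu| <= 3) = P(-2.5 <= Z <= 2.5)
p_within_3 = phi(3 / se) - phi(-3 / se)
print(round(p_within_3, 4))           # about 0.9876

# b) smallest n with P(|X-bar - mu| <= 2) = 0.95, using z = 1.96
z = 1.96
n_needed = ceil((z * sigma / 2) ** 2)  # (1.96 * 6 / 2)^2 = 34.57 -> 35
print(n_needed)                        # 35
```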

B) Sampling from a Non-normal Population

Suppose X is non-normally distributed with mean μ and variance σ². In this case we
approximate the distribution of the sample mean using the normal distribution, based on the
following theorem.

Central Limit Theorem (CLT): This is one of the most important theorems in statistics. In
selecting simple random samples of size n from a population, the sampling distribution of the
sample mean (X̄) can be approximated by a normal probability distribution as the sample size
becomes large.

CLT: If random samples of n observations are drawn from a population with any probability
distribution with mean μ and standard deviation σ, then, when n is large, the sampling
distribution of the sample mean X̄ is approximately normal with mean μ and variance σ²/n:

X̄ ~ N(μ, σ²/n)

Even if we are sampling from a non-normal population, we can use the normal distribution as
an approximation to the distribution of the sample mean if n ≥ 30.
Example 2: A soft drink vending machine is set so that the amount of drink dispensed is a
random variable with mean of 200 milliliters and standard deviation of 15 milliliters. What is
the probability that the mean amount dispensed in a random sample of size 36 is at least 204
milliliters?

Solution: The distribution of X̄ has mean μ = 200 and standard error

σ_X̄ = σ/√n = 15/√36 = 2.5

According to the central limit theorem, the sample mean is approximately normally distributed
and can be converted to a standard normal variable:

Z = (X̄ − μ)/SE(X̄) = (204 − 200)/2.5 = 1.6

The probability that the sample mean is at least 204 is P(X̄ ≥ 204) = P(Z ≥ 1.6). From the
standard normal Z-table, P(Z > 1.6) = 0.0548. From this result we can conclude that the
probability that the sample mean will be greater than 204 is 0.0548.
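The same probability can be checked numerically; a sketch using the standard normal CDF built from math.erf:

```python
# Sketch: checking P(X-bar >= 204) for Example 2 with the standard
# normal CDF built from math.erf.
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma, n = 200, 15, 36
se = sigma / sqrt(n)          # 15 / 6 = 2.5
z = (204 - mu) / se           # 1.6

print(round(1 - phi(z), 4))   # about 0.0548
```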

Exercise: A bulb manufacturer claims that the life of its bulbs is normally distributed with
mean 36,000 hours and standard deviation 4,000 hours. A random sample of 16 bulbs had
an average life of 34,500 hours. If the manufacturer's claim is correct, what is the probability
that the sample mean is smaller than 34,500 hours?

5.3. Distribution of the Sample Variance

The Chi-square Distribution

In the preceding section, we studied some of the properties of the sampling distribution of the
sample mean. In this section, we consider the sampling distribution of the variance, which is used
for inference about the population variance. Consider a random sample of n observations drawn
from a population with unknown mean μ and unknown variance σ². If the sample members are x1,
x2, . . ., xn:
The population variance σ² is defined as

σ² = E[(x − μ)²]

The sample variance S² is defined as

S² = (1/(n − 1)) Σ(Xi − X̄)²

and its square root is termed the sample standard deviation.
Here we divide by n − 1 for a random sample of n observations because, having computed the
sample mean, we are left with only n − 1 values that can vary freely.
Given the above definition of the sample variance, let us derive its mean and distribution.
The mean (expected value) of the sample variance equals the population variance:

E(S²) = σ²

Proof: Starting from S² = Σ(Xi − X̄)²/(n − 1) and using results from the chapter on expectation:

Σ(Xi − X̄)² = Σ[(Xi − μ) − (X̄ − μ)]²
           = Σ[(Xi − μ)² − 2(X̄ − μ)(Xi − μ) + (X̄ − μ)²]
           = Σ(Xi − μ)² − 2(X̄ − μ)Σ(Xi − μ) + n(X̄ − μ)²
           = Σ(Xi − μ)² − 2n(X̄ − μ)² + n(X̄ − μ)²
           = Σ(Xi − μ)² − n(X̄ − μ)²

Taking the expectation:

E[Σ(Xi − X̄)²] = Σ E[(Xi − μ)²] − n E[(X̄ − μ)²]

The expectation of (Xi − μ)² is the population variance σ², and the expectation of (X̄ − μ)²
is the variance of the sample mean, that is, σ²/n. Hence we have

E[Σ(Xi − X̄)²] = nσ² − n(σ²/n) = (n − 1)σ²

E(S²) = E[(1/(n − 1)) Σ(Xi − X̄)²]
      = (1/(n − 1)) E(Σ(Xi − X̄)²)
      = (1/(n − 1)) (n − 1)σ² = σ²

So ⇒ E(S²) = σ²

This implies that the sample variance S² is an unbiased estimator of the population variance σ².
This means that, in repeated sampling, the average of all your sample estimates will equal the
target parameter σ².
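Unbiasedness can also be illustrated by simulation: averaging the sample variance over many repeated samples should come close to σ². A minimal sketch with illustrative parameter values (population N(0, 4), samples of size 5):

```python
# Sketch: a small Monte Carlo check that E(S^2) = sigma^2, i.e. that the
# n - 1 divisor makes the sample variance an unbiased estimator.
import random
from statistics import variance

random.seed(0)
mu, sigma, n = 0.0, 2.0, 5          # population N(0, 4), small samples
reps = 100_000

avg_s2 = sum(
    variance([random.gauss(mu, sigma) for _ in range(n)])
    for _ in range(reps)
) / reps

print(round(avg_s2, 2))             # close to sigma^2 = 4
```

Replacing statistics.variance with statistics.pvariance (the n divisor) in the same simulation would give an average visibly below 4, which is exactly the bias the n − 1 divisor removes.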
As we have seen in the preceding topics, identifying the sampling distribution of a sample
statistic is essential for making inferences about a population parameter.
Therefore, let us identify the distribution of the sample variance.
Consider the distribution of S² in repeated random sampling from a normal distribution.
Since a variance cannot be negative, the sampling distribution of the sample variance
starts from S² = 0. Its shape is non-symmetric and changes with the sample size
and the value of σ². Just as we standardize random variables to form the Z-distribution, the
sample variance can be standardized to form a distribution called the chi-square distribution.
Given a random sample of n observations from a normally distributed population whose
variance is σ² and whose resulting sample variance is s², the quantity

(n − 1)s²/σ²

has a chi-square (χ²) distribution with n − 1 degrees of freedom.

[Figure: the chi-square (χ²) distribution curve f(χ²) for v degrees of freedom]

When the population mean μ is not known, a particular sample mean X̄ based on a random
sample of size n may be used as an unbiased estimate of μ. Therefore, we can define χ²v as

χ²v = Σ((Xi − X̄)/σ)² = Σ(Xi − X̄)²/σ²

Since the sample variance is defined as

s² = Σ(Xi − X̄)²/(n − 1), or Σ(Xi − X̄)² = (n − 1)s²

we can define the χ² distribution as

χ²v = (n − 1)s²/σ²
The chi-square distribution has many important applications. Some of its uses are:
- tests of independence of attributes
- tests of goodness of fit
- tests for the equality of population variances and tests of homogeneity

The calculated value of χ² is compared with the critical value at a particular level of
significance and degrees of freedom. If χ²cal > χ²critical, then the null hypothesis is rejected in
favor of the alternative hypothesis.

The chi-square distribution has several important mathematical properties. Some of them are
the following.
1. If X1, X2, . . ., Xn are independent random variables having standard normal
distributions, then

Y = Σ Xi²

has the chi-square distribution with v = n degrees of freedom.
2. If X1, X2, . . ., Xn are independent random variables having chi-square distributions with
v1, v2, . . ., vn degrees of freedom, then

Y = Σ Xi

has the chi-square distribution with v1 + v2 + . . . + vn degrees of freedom.
3. The mean and variance of the chi-square distribution are equal to the number of degrees of
freedom and twice the number of degrees of freedom, respectively:

E(χ²v) = v and Var(χ²v) = 2v

where v is the degrees of freedom. That is,

E(χ²) = E[(n − 1)S²/σ²] = ((n − 1)/σ²) E(S²) = ((n − 1)/σ²) σ² = n − 1, since E(S²) = σ²
To obtain the variance of S²:

Var((n − 1)S²/σ²) = ((n − 1)/σ²)² Var(S²) = 2(n − 1)

⇒ Var(S²) = 2(n − 1)σ⁴/(n − 1)² = 2σ⁴/(n − 1)
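These two moment results can be illustrated by simulation: the standardized quantity (n − 1)S²/σ², computed from repeated normal samples, should have mean close to n − 1 and variance close to 2(n − 1). A sketch with illustrative values n = 8, σ = 3:

```python
# Sketch: Monte Carlo check that (n-1)S^2/sigma^2 has mean n - 1 and
# variance 2(n - 1), as the chi-square distribution with n - 1 d.f.
# predicts.
import random
from statistics import mean, variance, pvariance

random.seed(1)
mu, sigma, n = 0.0, 3.0, 8
reps = 50_000

chi2_vals = [
    (n - 1) * variance([random.gauss(mu, sigma) for _ in range(n)]) / sigma**2
    for _ in range(reps)
]

print(round(mean(chi2_vals), 1))       # close to n - 1 = 7
print(round(pvariance(chi2_vals), 1))  # close to 2(n - 1) = 14
```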

For many applications involving the population variance, we need values of the cumulative
distribution of χ², especially in the upper and lower tails of the distribution. To make
inferences about the population variance, the calculated value of χ² is compared with the
tabulated value of χ² for the given degrees of freedom. For convenience of interpretation, the
χ² value listed under a column headed by a specific value of α is denoted χ²α,v. It means the
probability is α that a random sample of size n produces a χ² value greater than the tabulated
value χ²α,v for d.f. v = n − 1.

For example, the tabulated χ² value for v = n − 1 = 10 d.f. under the column heading α = 0.05
is χ²0.05,10 = 18.3. It means the probability is α = 0.05 that the χ² value computed from a
sample of size n = 11 is greater than χ²0.05,10 = 18.3.
[Figure: the χ² density curve f(χ²) for v = 10 d.f., with the area α = 0.05 to the right of
χ²0.05,10 = 18.3 shaded. The shaded area is the probability that a χ² value based on a sample
of size n = 11 exceeds 18.3.]
The probability above can be stated as

P(χ²10 > χ²0.05,10) = P(χ² > 18.3) = α = 0.05
P(χ² ≤ 18.3) = P(0 ≤ χ² ≤ 18.3) = 1 − α = 0.95

In general, χ²α,v is the value such that the area to its right under the chi-square curve with v
degrees of freedom equals α. Both tails may be used:

P(χ² > K_U) = 0.05 (upper tail)
P(χ² < K_L) = 0.05 (lower tail)

where K_U is the upper-tail critical value and K_L is the lower-tail critical value. For v = 10:

P(χ²10 < 3.94) = 0.05 and P(χ²10 > 18.31) = 0.05

That is, χ²α,v is such that if X is a random variable having a chi-square distribution with v
degrees of freedom, then

P(X > χ²α,v) = α
Example
A cement manufacturer claims that concrete prepared from his product has a relatively stable
compressive strength, and that the strength, measured in kilograms per square centimeter, lies
within a range of 40 kg/cm². A sample of n = 10 measurements produced a mean and a
variance equal to X̄ = 312 and s² = 195. Do these data present sufficient evidence to reject
the manufacturer's claim that the population variance is equal to 100?
The claim of the manufacturer can be rejected if the calculated value of chi-square exceeds
the critical value from the table, χ²0.05,9 = 16.919:

χ² = (n − 1)s²/σ² = 9(195)/100 = 1755/100 = 17.55

Since the observed chi-square value 17.55 is greater than the critical value, we can
reject the manufacturer's claim.
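The arithmetic of this test is short enough to script; a sketch (the critical value is taken from a χ² table, as in the text):

```python
# Sketch of the chi-square test statistic for the cement example:
# H0: sigma^2 = 100, with n = 10 and s^2 = 195. The critical value
# chi2(0.05, 9) = 16.919 is taken from a chi-square table.
n, s2, sigma2 = 10, 195, 100
critical = 16.919                        # chi2 table, alpha = 0.05, 9 d.f.

chi2_stat = (n - 1) * s2 / sigma2        # 9 * 195 / 100 = 17.55
print(chi2_stat, chi2_stat > critical)   # 17.55 True
```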

5.4. The F-distribution


There are cases where statistical analysis involves comparing two population variances. You
might need to compare the precision of one measuring device with that of another, the
stability of one manufacturing process with that of another, or even the variability in the
grading procedure of one college instructor with that of another.

One way to compare two population variances, σ1² and σ2², is to use the ratio of the sample
variances, s1²/s2². When independent random samples are drawn from two normal
populations with equal variances (σ1² = σ2²), then s1²/s2² has a probability distribution in
repeated sampling that is termed the F-distribution.
F-distribution.
The F-distribution is the sampling distribution of the ratio of two independent random variables
with chi-square distributions, each divided by its respective degrees of freedom. If U and V are
independent random variables having chi-square distributions with v1 and v2 degrees of
freedom, then

F = (U/v1)/(V/v2)

is a random variable having the F-distribution, whose values vary with every set of two samples
of sizes n1 and n2.

Substituting the values of χ1² = (n1 − 1)s1²/σ1² and χ2² = (n2 − 1)s2²/σ2² in the above equation:

F = [((n1 − 1)s1²/σ1²)/(n1 − 1)] / [((n2 − 1)s2²/σ2²)/(n2 − 1)]

F = (s1²/σ1²)/(s2²/σ2²) = σ2²s1²/(σ1²s2²)

If s1² and s2² are the variances of independent random samples of sizes n1 and n2 from
normal populations with variances σ1² and σ2², then

F(v1, v2) = σ2²s1²/(σ1²s2²)

is a random variable having the F-distribution with n1 − 1 and n2 − 1 degrees of freedom.
The critical values of the F-distribution are tabulated as in the case of the chi-square and Z-
distributions. Fα,v1,v2 represents the value such that the area to its right under the curve of the
F-distribution with v1 and v2 degrees of freedom equals α. That is, Fα,v1,v2 is defined by
P(F > Fα,v1,v2) = α.
Example: if v1 = 10, v2 = 20 and α = 0.05, then

F0.05,10,20 = 2.35, that is, P(F10,20 > 2.35) = 0.05
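The tabulated value can be illustrated by simulation: when σ1² = σ2², the ratio s1²/s2² computed from normal samples of sizes n1 = 11 and n2 = 21 follows F with (10, 20) degrees of freedom, so its empirical 95th percentile should land near 2.35. A Monte Carlo sketch:

```python
# Sketch: a Monte Carlo check of the tabulated value F(0.05; 10, 20) = 2.35.
# With sigma1 = sigma2, the ratio s1^2/s2^2 from samples of sizes
# n1 = 11 and n2 = 21 follows F with (10, 20) degrees of freedom.
import random
from statistics import variance

random.seed(2)
n1, n2 = 11, 21                      # v1 = 10, v2 = 20 degrees of freedom
reps = 20_000

ratios = sorted(
    variance([random.gauss(0, 1) for _ in range(n1)])
    / variance([random.gauss(0, 1) for _ in range(n2)])
    for _ in range(reps)
)

q95 = ratios[int(0.95 * reps)]       # empirical 95th percentile
print(round(q95, 2))                 # near 2.35
```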

To test whether the variances of two populations are equal, compare the calculated value
of F with the critical value of F.
Example: The research staff of an investor was interested in determining whether there is a
difference in the variances of the maturities of AA-rated industrial bonds and CC-rated
industrial bonds. A random sample of AA-rated bonds resulted in a sample variance
s1² = 123.35, and an independent random sample of 11 CC-rated bonds resulted in a sample
variance s2² = 8.02. Test whether the population variances of the two populations are equal.

5.5. The t-distribution


In the previous topics, we have seen that for random samples from a normal population with
mean μ and variance σ², the sample mean X̄ has a normal distribution with mean μ and
variance σ²/n. In other words,

Z = (X̄ − μ)/(σ/√n)

has a standard normal distribution if the sampled population is normal or the sample size
used is large. However, if the sample size used is small, making inferences about the
population mean based on the Z-distribution as a test statistic involves two types of problems.

1. The shape of the sampling distribution of X̄ and/or the Z statistic depends on the shape
of the population sampled. We can no longer assume that the distribution of X̄ is
approximately normal, because the central limit theorem ensures normality only for
samples that are sufficiently large. (The sampling distribution of the sample mean
is, however, normal whenever the sampled population is normal.)
2. The population standard deviation is usually unknown. Even though it is possible to
estimate the population standard deviation with the sample standard deviation s, s is a
poor approximation of the population standard deviation when the sample size is
small.
In the case where the population standard deviation is unknown, the standard normal statistic
cannot be used. It is natural to replace the unknown σ by the sample standard deviation s.
This gives a distribution called Student's t-distribution, after Gosset, who developed the
probability distribution of the statistic

t = (X̄ − μ)/(s/√n)

Given a random sample of n observations with mean X̄ and standard deviation s from a
normally distributed population with mean μ, the random variable t follows the Student's t-
distribution with (n − 1) degrees of freedom. The shape of the Student's t-distribution is rather
similar to that of the standard normal distribution. Both distributions have mean zero, and the
probability density functions of both are symmetric about their means. However, the density
function of the Student's t-distribution has a larger dispersion (variability) than the standard
normal distribution. The actual amount of variability in the sampling distribution of t
depends on the size of the sample n.
As the number of degrees of freedom increases (the sample size increases), the Student's t-
distribution becomes increasingly similar to the standard normal distribution. This is
intuitively reasonable, and follows from the fact that for a large sample size the sample
standard deviation is a very precise estimator of the population standard deviation. In
particular, the smaller the degrees of freedom associated with the t-statistic, the more
variable its sampling distribution will be.

If x1, . . ., xn are n sample values drawn from a normal population with mean μ and variance σ²,
the standard normal random variable can be defined as

Z = (X̄ − μ)/(σ/√n)

which follows a normal distribution with mean 0 and variance 1. For the same n sample values,
the standardized squared deviations give a χ² variable, which follows a χ² distribution with
n − 1 degrees of freedom:

Y = Σ(Xi − X̄)²/σ²

A sample statistic t can then be defined as the ratio of the standard normal Z to the square root
of the chi-square variable Y divided by its degrees of freedom:

t = Z/√(Y/(n − 1)), where Z = (X̄ − μ)/(σ/√n)

t = [(X̄ − μ)/(σ/√n)] / √(Σ(Xi − X̄)²/(σ²(n − 1))) = (X̄ − μ)/(√(Σ(Xi − X̄)²/(n − 1))/√n)

t = (X̄ − μ)/(S/√n)

t-values in repeated sampling follow a t-distribution with n − 1 degrees of freedom.


As n tends to be large, the sample standard deviation approaches the population standard
deviation; that is, when n becomes large, the t-statistic approaches the standard normal
variable. If n > 30, then s ≈ σ, so that

t = (X̄ − μ)/(s/√n) ≈ Z = (X̄ − μ)/(σ/√n)

In order to base inferences about the population mean on Student's t-distribution, critical values
are tabulated for different degrees of freedom. tα,v represents the value such that the area to its
right under the curve of the t-distribution with v degrees of freedom equals α. That is, if t is a
random variable having the t-distribution with v degrees of freedom, then P(t > tα,v) = α. Since
the density function is symmetric, t1−α,v = −tα,v.
The tabulated t values are denoted by tα. The area under the t-distribution curve above tα is α,
and the area below tα is 1 − α.
[Figure: tabulated tα and t1−α = −tα values on the t-distribution curve]
The areas under the t-distribution curve can be interpreted in terms of probabilities by taking,
say, a t-distribution based on a sample of size n = 15. For v = n − 1 = 14, the t value above
which the area under the t-curve is α = 0.05 is t0.05 = 1.76. It means the probability that a
t value computed from a random sample of size n = 15 exceeds t0.05 = 1.76 is α = 0.05, and it
may be stated as

P(t > t0.05) = P(t > 1.76) = 0.05

Similarly, the t value below which the area under the t-distribution curve is α = 0.05 is
t0.95 = −t0.05 = −1.76. It means the probability that a t value based on a random sample of size
n = 15 is less than −t0.05 = −1.76 is α = 0.05. It is stated as P(t < −t0.05) = P(t < −1.76) = 0.05.

=0.05 =0.05

-1.76 1.76
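The tail probability can be illustrated by simulation: computing the t-statistic from many normal samples of size n = 15 and counting how often it exceeds 1.76 should give a proportion near 0.05. A Monte Carlo sketch:

```python
# Sketch: a Monte Carlo check that P(t > 1.76) is about 0.05 for the
# t-statistic computed from normal samples of size n = 15 (14 d.f.).
import random
from statistics import mean, stdev
from math import sqrt

random.seed(3)
mu, sigma, n = 0.0, 1.0, 15
reps = 50_000

def t_stat(sample, mu):
    """t = (X-bar - mu) / (s / sqrt(n)) for one sample."""
    return (mean(sample) - mu) / (stdev(sample) / sqrt(len(sample)))

exceed = sum(
    t_stat([random.gauss(mu, sigma) for _ in range(n)], mu) > 1.76
    for _ in range(reps)
)

print(round(exceed / reps, 3))   # near 0.05
```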

There are various uses of the t-distribution. A few of them are:

- hypothesis testing for the population mean;
- hypothesis testing for the difference between two population means with
independent samples;
- hypothesis testing for the difference between two population means with dependent
samples;
- hypothesis testing for an observed coefficient of correlation, including partial and
rank correlations;
- hypothesis testing for an observed regression coefficient.
