
CIVL101

Lecture-27-28

Mathematical Statistics: Test of Significance
Unit 6: Applications of Probability and Statistics

Learning Outcomes:

To know about the concepts of Mathematical Statistics.


Population: A population is the entire group that you want to draw
conclusions about. In research, a population doesn’t always refer to people.
It can mean a group containing elements of anything you want to study,
such as objects, events, organizations, countries, species, organisms, etc.
Population can be finite or infinite.

Sample: A sample is the specific group that you will collect data from. The
size of the sample is always less than the total size of the population. The
number of individuals in the sample is called the sample size. The process
of selecting these samples is called sampling.
Parameters and Statistics

In statistics, we often deal with the terms parameter and statistic, which
play a vital role in the determination of the sample size. A parameter is a
summary description of a characteristic of the target population. In
contrast, a statistic is a summary value computed from a subset of the
population, i.e. the sample.

A parameter is derived from measurements of the units in the population,
whereas a statistic is derived from measurements of the elements of the
sample.
Using calculated Statistics to estimate Parameters
Comparison between Statistic and Parameter:
Polling Quiz

Which of the following cannot be measured or calculated exactly?

(A) Sample size.

(B) Sample mean.

(C) Sample standard deviation.

(D) Population mean.


Sampling distribution of a Statistic

Consider a population of size N. Suppose r samples, each of size n, are
drawn from it.

For each of the r samples, we compute some statistic, say the mean 𝑥̄ or
the variance 𝑠².

The set of values of the statistic obtained in this way is called the
sampling distribution of the statistic.

The standard deviation of this sampling distribution is called the standard
error (SE) of the statistic.
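
The construction above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is a hypothetical list of 10,000 uniform random values, the statistic is the sample mean, and the helper name sampling_distribution_of_mean is invented for illustration. The comparison value σ/√n is the standard textbook expression for the standard error of the mean and is not taken from the slides.

```python
import random
import statistics


def sampling_distribution_of_mean(population, n, r, seed=0):
    """Draw r random samples of size n and return the r sample means."""
    rng = random.Random(seed)
    # rng.sample draws n distinct elements (no replacement within a sample)
    return [statistics.fmean(rng.sample(population, n)) for _ in range(r)]


# Hypothetical finite population of size N = 10,000 (an assumption for the demo)
pop_rng = random.Random(42)
population = [pop_rng.uniform(0, 100) for _ in range(10_000)]

# r = 2000 samples, each of size n = 50
means = sampling_distribution_of_mean(population, n=50, r=2000)

# Standard error = standard deviation of the sampling distribution
se_empirical = statistics.stdev(means)

# Textbook approximation for the mean: sigma / sqrt(n)
se_theoretical = statistics.pstdev(population) / 50 ** 0.5

print(f"empirical SE   = {se_empirical:.3f}")
print(f"sigma/sqrt(n)  = {se_theoretical:.3f}")
```

The two printed values should be close, although they will not coincide exactly because the sampling distribution is itself estimated from a finite number of samples.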
Generally, if the sample size satisfies 𝑛 ≥ 30, the sample is considered a
large sample.

If the sample size is large, the following assumptions are made:

1. The sampling distribution of a statistic is approximately normal.

Note that the distribution of the population itself may or may not be normal.

2. Sample statistics approximate the corresponding population parameters.
Two Types of Problems in Sampling Theory of Large Samples

1. Estimation. Some characteristic of the population in which we are
interested is not known. We choose a random sample and obtain
information about this characteristic, which is taken as an estimate
of the characteristic of the population. This is called the problem of
estimation.

2. Testing of Hypothesis. Some information about a characteristic of
the population is assumed to be known. We wish to know whether this
information can be accepted. We choose a random sample and obtain
information about this characteristic. Based on the sample, we conclude
whether the available information about the characteristic of the
population can be accepted or must be rejected. If it can be accepted, we
also wish to know with what degree of confidence. This is called the
problem of testing of hypothesis.
Test of Significance

Let 𝜃 be a parameter of the population and 𝜃0 be the corresponding sample
statistic. Since 𝜃0 is based on a random sample, there will generally be
some deviation (difference) between 𝜃 and 𝜃0 because of sampling
fluctuations. A deviation may also arise because the selection of the sample
is not completely random. If the difference is large, we say that the
difference is significant. Similarly, if 𝜃1 is the statistic obtained from a
second random sample, we wish to know whether the difference between 𝜃
and 𝜃1 is significant. The methods used to decide whether such a difference
is significant or not are called tests of significance.
Polling Quiz

A particular characteristic of a population is studied by using:

(A) Calculated Parameter.

(B) Calculated Statistic.


Null Hypothesis and Alternate Hypothesis

A population is given to us and we wish to have information about a
characteristic of the population. We start with the assumption that there
is no significant difference between the sample statistic and the
corresponding population parameter, or between two sample statistics.

This assumption that there is no significant difference is called the Null
hypothesis and is denoted by 𝐻0 .

A hypothesis that is different from the null hypothesis is called an
Alternate hypothesis and is denoted by 𝐻1 .

The methods used to decide whether to accept or reject a null hypothesis
against an alternate hypothesis are called tests of hypothesis.

Let 𝜃 be a parameter of the population and 𝜃0 be the corresponding sample
statistic.

Then we define the null hypothesis as 𝐻0 : 𝜃 = 𝜃0

The alternate hypothesis takes one of the following forms:

I. 𝐻1 : 𝜃 ≠ 𝜃0 (Two tailed)

II. 𝐻1 : 𝜃 > 𝜃0 (Right tailed)

III. 𝐻1 : 𝜃 < 𝜃0 (Left tailed)


Test Statistic

In large sample theory, SE forms the basis of the testing of hypothesis.

For a large sample, if 𝑡 is any statistic, then it approximately follows a
normal distribution.

The mean of this distribution is 𝐸(𝑡) and its standard deviation is 𝑆𝐸(𝑡).

For large samples: 𝑍 = (𝑡 − 𝐸(𝑡)) / 𝑆𝐸(𝑡)

This quantity 𝑍, which approximately follows the standard normal
distribution, is called the test statistic.
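
As a worked illustration of the test statistic, the sketch below evaluates 𝑍 for a sample mean. All numbers are hypothetical (hypothesised mean 50, known population standard deviation 10, sample of size 100 with mean 52), and the function name z_statistic is invented for this example. The standard error of the mean is taken as σ/√n, a standard result that the slides do not state explicitly.

```python
import math


def z_statistic(t, expected, standard_error):
    """Large-sample test statistic Z = (t - E(t)) / SE(t)."""
    return (t - expected) / standard_error


# Hypothetical data (assumptions for illustration only)
h0_mean = 50.0      # E(t) under the null hypothesis
sigma = 10.0        # assumed known population standard deviation
n = 100             # large sample (n >= 30)
sample_mean = 52.0  # observed statistic t

se = sigma / math.sqrt(n)            # SE of the sample mean = 10 / 10 = 1
z = z_statistic(sample_mean, h0_mean, se)
print(z)  # 2.0
```

Here the observed mean lies two standard errors above the hypothesised mean; whether that counts as significant depends on the critical value chosen for the test.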


Critical Region

Suppose that whenever the sample statistic 𝑡 lies in a certain region 𝑅 of
the sample space, we decide that the difference between the parameter of
the population and the sample statistic is significant (that is, the null
hypothesis is rejected). This region 𝑅 is called the critical region or the
region of rejection.

The complementary region 𝑅̄ is called the region of acceptance.


Level of Significance

The probabilities that a random value of the statistic lies in the region 𝑅
when 𝐻0 is true, and in the region 𝑅̄ when 𝐻1 is true, are written as:
𝑃(𝑡 ∈ 𝑅 | 𝐻0) = 𝛼, 𝑃(𝑡 ∈ 𝑅̄ | 𝐻1) = 𝛽

We call 𝛼 the level of significance.

Critical Value

The value of the test statistic 𝑍 that separates the region of rejection
from the region of acceptance is called the critical value or the significant
value of 𝑍.

The critical value of 𝑍 for a single tailed test at the level of significance
𝛼 is the same as the critical value of 𝑍 for a two tailed test at the level of
significance 2𝛼.
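
The relation between single tailed and two tailed critical values can be checked numerically. The sketch below uses NormalDist from Python's standard statistics module. The helper name critical_value and the levels 0.05 and 0.10 are illustrative choices, not part of the lecture.

```python
from statistics import NormalDist


def critical_value(alpha, two_tailed=True):
    """Critical value of Z for the standard normal distribution.

    Two tailed: alpha is split equally between both tails.
    Single tailed: the whole of alpha lies in one tail.
    """
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


z_two_5 = critical_value(0.05, two_tailed=True)    # approximately 1.96
z_one_5 = critical_value(0.05, two_tailed=False)   # approximately 1.645
z_two_10 = critical_value(0.10, two_tailed=True)   # same as z_one_5

print(round(z_two_5, 3), round(z_one_5, 3), round(z_two_10, 3))
```

For a two tailed test the null hypothesis is rejected when |𝑍| exceeds the critical value. For a right tailed or left tailed test only the corresponding sign of 𝑍 is considered.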
Errors in testing of hypothesis

As mentioned earlier, in sampling theory, we draw conclusions about the
population parameters on the basis of investigations of random samples.

Also the level of significance is fixed in advance, which may make the
region of rejection larger or smaller.

Because of these reasons, two types of errors can arise in the testing of
hypothesis.
Type I Error:

We reject the null hypothesis 𝐻0 , when it is true.

That is, we reject a consignment of items, when the items were good.

This type of error is called the Producer’s risk.

Type II Error:

We accept the null hypothesis 𝐻0 , when it is not true.

That is, we accept a consignment of items, when the items are not good.

This type of error is called the Consumer’s risk.
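
The two types of error can be made concrete with a simulation. This is a rough sketch under assumptions that do not come from the slides: a normal population with known standard deviation 10, a two tailed test of 𝐻0 : mean = 50 at the 5% level, samples of size 36, and a hypothetical true mean of 53 for the case where 𝐻0 is false. The function name rejection_rate is invented for illustration.

```python
import random
from statistics import NormalDist, fmean


def rejection_rate(true_mean, h0_mean=50.0, sigma=10.0, n=36,
                   alpha=0.05, trials=4000, seed=1):
    """Fraction of simulated samples for which a two tailed Z test rejects H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        sample_mean = fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        z = (sample_mean - h0_mean) / se
        if abs(z) > z_crit:   # statistic falls in the critical region
            rejections += 1
    return rejections / trials


# H0 is true: every rejection is a Type I error, so the rate should be near alpha
type_1_rate = rejection_rate(true_mean=50.0)

# H0 is false (true mean 53): every acceptance is a Type II error
type_2_rate = 1 - rejection_rate(true_mean=53.0)

print(f"Type I error rate  ~ {type_1_rate:.3f}")
print(f"Type II error rate ~ {type_2_rate:.3f}")
```

Lowering the level of significance shrinks the critical region, which reduces Type I errors but, for a fixed sample size, increases Type II errors.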
