Chapter 3 - Sampling - Part I

Chapter 3: Sampling, Sampling Distributions and Testing

Sampling
• 3.1 Introduction
• 3.2 Population And Sample-Universe of Population
• 3.3 Types Of Population- Sample, Advantages of Sampling
• 3.4 Sampling Theory- Law of Statistical Regularity- Principle Of Inertia of Large Numbers
• 3.5 Terms used in sampling theory
• 3.6 Types Of Sampling
• 3.7 Central Limit Theorem
• Testing Of Hypothesis In case of Large and Small Samples
• 3.8 Introduction
• 3.9 Testing Hypothesis
• 3.10 Selecting a Significance Level
• 3.11 Two Tailed Test with case study
• 3.12 Classification Of Test statistics
• 3.13 Testing of Hypothesis in case of Small samples
• 3.14 t distribution, uses of ‘t’ test
• Chi-Square Test
• 3.15 Introduction
• 3.16 χ² Test – Degrees of Freedom
• 3.17 Practical Applications
• 3.18 Chi-square distribution
• 3.19 Uses, Application of Chi Square
• 3.20 Test for independence of attributes
Scope
• Central limit theorem, principle of inertia of large numbers (only statement)
• Sampling: probability and non-probability sampling methods
• Theory related to testing, and only the t test (small sample) in all 3 forms: test of a single mean, equality of means, paired t test
• Chi-square test (test of goodness of fit and independence of attributes)
• Problems based on these 5 tests; no derivation required, only application, i.e. problems
• Population/Universe: The population is the aggregate of all possible units. It need not be a human population; it may be a population of plants, insects, fruits, etc.
• Population: Any group of people or objects that forms the subject of study in a particular survey and whose members are similar in one or more ways. OR The aggregate of people/subjects to which one wishes to generalize research findings.
– Finite population: When the number of units can be counted and is definite, the population is known as a finite population.
• No. of plants in a plot.
• No. of farmers in a village.
• All the fields under a specified crop.
– Infinite population: When the number of units in a population is so large that we cannot count all of them, it is known as an infinite population.
• The plant population in a region
• The population of insects in a region
• Census: An examination of each and every element of the population. It is also known as a population survey or complete enumeration survey. Under a census survey, information is collected from each and every unit of the population or universe.
• Frame: A list of all units of a population is known as a frame.
• Sampling Frame: The sampling frame comprises all the elements of a population, with proper identification, that are available to us for selection at any stage of sampling.
Population and Sample
[Figure: a population of numbered elements, with Sample 1 drawn from it; each individual member is labelled an element.]
• Parameter: A summary measure that describes a given characteristic of the population is known as a parameter. Populations are described in terms of measures such as the mean, standard deviation, etc. These measures of the population are called parameters and are usually denoted by Greek letters. For example, the population mean is denoted by μ, the standard deviation by σ and the variance by σ².
• Sample: A portion or small number of units of the total population is known as a sample. OR It is a subset of the population: a group of elements or subjects selected from the population for study.
– All the farmers in a village (population) and a few farmers (sample)
– All plants in a plot form a population of plants.
– A small number of plants selected out of that population is a sample of plants.
• Sampling Unit/Element: A single member of the population; the unit of study (a person or subject). OR It is a single member of the sample.
• Statistic: A summary measure that describes a characteristic of the sample is known as a statistic. Thus the sample mean, sample standard deviation, etc. are statistics. A statistic is usually denoted by a Roman letter.
– x̄ – sample mean
– s – sample standard deviation
– A statistic is a random variable because it varies from sample to sample.
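The distinction between a parameter and a statistic can be sketched in a few lines of Python. The plant heights below are hypothetical values chosen only for illustration; the point is that μ and σ are fixed for the population, while x̄ and s depend on which sample happens to be drawn.

```python
import random
import statistics

# Hypothetical population: heights (cm) of all 12 plants in a plot.
population = [52, 48, 55, 60, 47, 51, 58, 49, 53, 56, 50, 54]

# Parameters describe the whole population (Greek letters).
mu = statistics.mean(population)        # population mean
sigma = statistics.pstdev(population)   # population standard deviation (divides by N)

# A statistic describes one sample and changes from sample to sample.
rng = random.Random(1)                  # fixed seed so the run is repeatable
sample = rng.sample(population, 4)      # 4 plants drawn without replacement
x_bar = statistics.mean(sample)         # sample mean
s = statistics.stdev(sample)            # sample standard deviation (divides by n - 1)

print(f"mu = {mu}, sigma = {sigma:.2f}")
print(f"sample = {sample}, x_bar = {x_bar}, s = {s:.2f}")
```

Changing the seed draws a different sample and so gives a different x̄ and s, while μ and σ stay the same.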
• Sampling: The method of selecting samples from a population is known as sampling.
• Sampling: It is the process of selecting a sufficient number of elements from the population, so that studying the sample and understanding its properties or characteristics makes it possible to generalize those properties or characteristics to the population.
• Sampling technique: There are two ways in which information is collected during a statistical survey. They are
– Census survey
– Sample survey
• Sample survey: A sample is a part of the population. Information is collected from only a few units of the population and not from all the units. Such a survey is known as a sample survey.
• The sampling technique is universal in nature; consciously or unconsciously it is adopted in everyday life.
• For example:
– A handful of rice is examined before buying a sack.
– We taste one or two fruits before buying a bunch of grapes.
– To measure the root length of plants, only a portion of the plants is selected from a plot.
Sampling Vs. Non-Sampling Error
• Sampling Error: This error arises when a sample is not representative of the population.
• Non-sampling Error: This error arises not because the sample is unrepresentative of the population but for other reasons, such as:
– Respondents not giving correct answers
– Errors made while transferring data from the questionnaire to a spreadsheet
– Errors at the time of coding, tabulation and computation
– The population of the study not being defined properly
– Chosen respondents being unavailable or refusing to answer
– Sampling frame error
Need for sampling
• Sampling methods have been used extensively for a variety of purposes and in a great diversity of situations.
• In practice it may not be possible to collect information on all units of a population, for reasons such as:
• Lack of resources in terms of money, personnel and equipment.
• The experimentation may be destructive in nature, e.g. finding the germination percentage of seed material or evaluating the efficiency of an insecticide.
• The data may be of little use if they are not collected within a time limit. A census survey takes longer than a sample survey, so sampling is preferred for getting quick results. Moreover, a sample survey is less costly than complete enumeration.
• Sampling remains the only way when the population contains infinitely many units.
• Greater accuracy.
Principles of Sampling
1. Principle of “Statistical Regularity”: This principle lays down that a moderately large number of items chosen at random from a large group are almost sure, on average, to possess the characteristics of the large group.
2. Principle of “Inertia of Large Numbers”: This principle is a corollary of the above principle. It states that, other things being equal, the larger the size of the sample, the more accurate the results are likely to be.
Sampling Methods
I. Probability II. Non-probability
Sampling sampling
1. Random Sampling
1. Convenience/
– With replacement
Accidental
– Without
replacement 2. Quota Sampling
2. Systematic Sampling 3. Purposive/
3. Stratified Sampling judgmental
– Proportionate Stratified 4. Snowball Sampling
– Disproportionate
Stratified
4. Cluster Sampling
Probability Sampling

• One can specify, for each element of the population, the relative likelihood that it will be included in the sample.
• In probability sampling, each and every element of the population has a known chance of being selected in the sample.
• It relies on a random selection of elements.
• It is used in case of ‘Finite Population’.
• It is used in conclusive research.
• A known chance does not mean an equal chance.
Random sampling
• Random sampling: Under this method, every unit of the population has an equal chance of selection at each stage, or each unit is drawn with a known probability. It helps to estimate the mean, variance, etc. of the population.
• Under probability sampling there are two procedures
– Sampling with replacement (SWR)
– Sampling without replacement (SWOR)
• When successive draws are made after placing back the units selected in the preceding draws, it is known as sampling with replacement. When no such replacement is made, it is known as sampling without replacement. When the population is finite, sampling with replacement is adopted; otherwise SWOR is adopted.
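As a minimal sketch of the two procedures, Python's standard `random` module offers `choices` (draws with replacement) and `sample` (draws without replacement). The population of ten numbered units and the seed are assumptions made only for this illustration.

```python
import random

rng = random.Random(7)        # fixed seed so the run is repeatable
units = list(range(1, 11))    # hypothetical population: units numbered 1..10

# SWR: each drawn unit is placed back, so the same unit may appear again.
swr = rng.choices(units, k=5)

# SWOR: a drawn unit is not placed back, so every selected unit is distinct.
swor = rng.sample(units, k=5)

print("with replacement:   ", swr)
print("without replacement:", swor)
```

Under SWR the sample size may even exceed the population size, whereas under SWOR it can never be larger than N.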
1.Simple Random Sampling
• A method of sampling that relies on a random or chance selection method so that every element of the sampling frame has a known probability of being selected.
• Simple random sampling is not used in consumer research because the population is usually very large, which creates problems in preparing a sampling frame.
• Merits
– There is less chance for personal bias.
– Sampling error can be measured.
– This method is economical as it saves time, money and labour.
• Demerits
– It cannot be applied if the population is heterogeneous.
– It requires a complete list of the population, but such up-to-date lists are not available in many enquiries.
– If the sample is small, it will not be representative of the population.
1. Simple Random sampling(SRS)
• The basic probability sampling method is simple random sampling. It is the simplest of all the probability sampling methods and is used when the population is homogeneous. When the units of the sample are drawn independently with equal probabilities, the sampling method is known as Simple Random Sampling (SRS). Thus if the population consists of N units, the probability of selecting any unit at a given draw is 1/N.
• A theoretical definition of SRS: Suppose we draw a sample of size n from a population of size N. There are $\binom{N}{n}$ possible samples of size n. If all possible samples have an equal probability $1/\binom{N}{n}$ of being drawn, the sampling is said to be simple random sampling.
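The theoretical definition can be checked on a small case. The sketch below assumes a hypothetical population of N = 5 labelled units and samples of size n = 2, counts the possible samples with `math.comb`, and lists them with `itertools.combinations`.

```python
import math
from itertools import combinations

N, n = 5, 2
units = ["A", "B", "C", "D", "E"]   # hypothetical population of N = 5 units

num_samples = math.comb(N, n)                 # number of possible samples, C(5, 2)
all_samples = list(combinations(units, n))    # every sample of size n, order ignored
prob_each = 1 / num_samples                   # probability of any one sample under SRS

# Share of samples that contain a given unit, here "A"; this works out to n/N.
share_with_a = sum("A" in s for s in all_samples) / num_samples

print(num_samples, prob_each, share_with_a)
```

With these numbers there are 10 possible samples, each with probability 0.1, and any particular unit appears in 4 of the 10.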
1.Simple Random Sampling
• Simple random sampling with replacement
– Under this scheme, a list of all the elements of the population from which the sample is to be drawn is prepared.
– Select the sample by using slips
– Select the sample by using random number tables
• Simple random sampling without replacement
– Same as simple random sampling with replacement, the only difference being that selected units are not replaced before the next draw.
2.Systematic Sampling

• It takes care of a limitation of simple random sampling: that the sample may not be a representative one.
• In systematic sampling, the first unit of the sample is selected at random; once it is chosen, there is no control over the subsequent units of the sample. For this reason it is called mixed sampling.
• It is very simple.
• It is the easiest and cheapest way to select a sample.
• A complete sampling frame is not required to draw a systematic sample.
Systematic Sampling
• In this sampling the entire population is arranged in a particular order (by calendar date, ascending, descending, alphabetical order, etc.).
• First, a sampling interval K = N/n is calculated, where N = size of population and n = size of sample (K should be an integer).
• A random number is selected from 1 to K. Let us call it C.
• The first unit selected is C, then C+K, C+2K, … until the required sample size is reached.
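The steps above can be sketched as a short function. The name `systematic_sample`, the population of 50 numbered plants and the seed are hypothetical; the sketch also assumes, as the slide does, that K = N/n is an integer.

```python
import random

def systematic_sample(population, n, rng=random):
    """Return every K-th unit of an ordered population, starting at a random C."""
    N = len(population)
    if N % n != 0:
        raise ValueError("this sketch assumes K = N/n is an integer")
    K = N // n                      # sampling interval
    C = rng.randint(1, K)           # random start between 1 and K inclusive
    # 1-based positions C, C+K, C+2K, ... converted to 0-based list indices.
    return [population[C - 1 + i * K] for i in range(n)]

plants = list(range(1, 51))         # hypothetical ordered population, N = 50
sample = systematic_sample(plants, 5, random.Random(3))
print(sample)                       # five plant numbers spaced K = 10 apart
```

Only the starting point C is random; once it is fixed, the remaining units follow automatically.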
2.Systematic Sampling
• It involves drawing every Kth element in the population, starting with a randomly chosen element between 1 and K.
• It is used for market surveys, consumer attitude surveys, etc.
• In such cases a telephone directory frequently serves as the population frame for this sampling design.
3.Stratified Random Sampling
• When the population is heterogeneous with respect to the characteristic in which we are interested, we adopt stratified sampling. When a heterogeneous population is divided into homogeneous sub-populations, the sub-populations are called strata. From each stratum a separate sample is selected using simple random sampling. This sampling method is known as stratified sampling.
– We may stratify by size of farm, type of crop, soil type, etc.
• A method of sampling in which sample elements are selected separately from population strata that are identified in advance by the researcher.
• It involves dividing the whole population into strata that are mutually exclusive and collectively exhaustive.
• It is more efficient than simple random sampling, as dividing the population into strata increases the representativeness of the sample.
• The criteria for stratification should be related to the objectives of the study.
• Example: Avg. monthly sales of cell phones in large, medium and small stores.
• Stratified random sampling, as its name implies, involves a process of stratification or segregation, followed by random selection of elements from each stratum.
• The population is first divided into mutually exclusive groups that are relevant, appropriate and meaningful in the context of the study.
3.Stratified Random Sampling
• The number of units to be selected may be uniform in all strata or may vary from stratum to stratum. There are four types of allocation of the sample to strata:
1. Equal allocation
2. Proportional allocation
3. Neyman’s allocation
4. Optimum allocation
• If the number of units to be selected is uniform in all strata, it is known as equal allocation of samples. If the number of units to be selected from a stratum is proportional to the size of the stratum, it is known as proportional allocation of samples. When the allocation takes into account a cost per unit that varies from stratum to stratum, it is known as optimum allocation. When the costs for different strata are equal, it is known as Neyman’s allocation.
• Merits
– It is more representative.
– It ensures greater accuracy.
– It is easy to administer as the universe is sub-divided.
• Demerits
– Dividing the population into homogeneous strata requires more money, time and statistical experience, which makes it difficult.
– If proper stratification is not done, the sample will be biased.
3.Stratified Random Sampling

e.g.
• Stratifying customers on the basis of life stages, income levels and the like to study buying patterns.
• Stratifying companies according to size, industry, profits and so forth to study stock market reactions.
Types of Stratified Random Sampling
• Proportionate Stratified Sampling
– Sampling method in which elements are selected from strata in exact proportion to their representation in the population.
• Disproportionate Stratified Sampling
– Sampling in which elements are selected from strata in proportions different from those in which they appear in the population.
Proportionate Stratified Sampling
Table 1 : Distribution of students according to year in college

Year        Population    Proportion of each class
BSW I           50              .25
BSW II          40              .20
BSW III         30              .15
MSW I           40              .20
MSW II          40              .20
Total          200             1.00
Proportionate Stratified Sampling
Table 2 : Distribution of students (Proportionate)

Year                Sample Break-up    Proportion
BSW I                     15              .25
BSW II                    12              .20
BSW III                    9              .15
MSW I                     12              .20
MSW II                    12              .20
Total Sample (n)          60             1.00

Disproportionate Stratified Sampling
Table 3 : Distribution of students (Disproportionate, equal number from each class)

Year                Population Break-up    Sample Break-up
BSW I                     50                    12
BSW II                    40                    12
BSW III                   30                    12
MSW I                     40                    12
MSW II                    40                    12
Total Sample (n)         200                    60
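The arithmetic behind the student tables can be reproduced with a short sketch. It takes the class sizes from Table 1 and a total sample of n = 60, then computes a proportional allocation (each class contributes n × its share of the population) and an equal allocation. The variable names are illustrative only.

```python
# Class sizes from Table 1 and the total sample size used in the tables.
strata = {"BSW I": 50, "BSW II": 40, "BSW III": 30, "MSW I": 40, "MSW II": 40}
n = 60
N = sum(strata.values())            # total population

# Proportional allocation: sample size in each stratum is n * (stratum size / N).
# round() is used because in general the result need not be a whole number,
# and rounded allocations may then need a small adjustment to sum to n.
proportional = {name: round(n * size / N) for name, size in strata.items()}

# Equal allocation: the same number from every stratum, regardless of its size.
equal = {name: n // len(strata) for name in strata}

print(proportional)
print(equal)
```

Proportional allocation gives 15, 12, 9, 12 and 12 students, while equal allocation gives 12 from every class, which over-represents the smallest class (BSW III) and under-represents the largest (BSW I).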


Proportionate Stratified Sampling
Table 1 : Distribution of employees according to job level

Job Level                   Population    Proportionate sample (20% of each class)
Top Management                  10               2
Middle-level Management         40               8
Lower-level Management          50              10
Supervisors                    100              20
Clerks                         500             100
Total                          700             140
Disproportionate Stratified Sampling

Table 2 : Distribution of employees (Disproportionate)

Job Level                   Population    Disproportionate sample
Top Management                  10               7
Middle-level Management         40              13
Lower-level Management          50              20
Supervisors                    100              30
Clerks                         500              70
Total                          700             140
4.Cluster Sampling
• In cluster sampling, the entire population is divided into clusters in such a way that the elements within each cluster are heterogeneous but the clusters are homogeneous with one another.
• It is sampling in which elements are selected in two or more stages, with the first stage being the random selection of naturally occurring clusters and the last stage being the random selection of elements within clusters.
• In practice a cluster may not contain heterogeneous elements, so the applicability of cluster sampling in research is sometimes questionable.
• It is useful when the populations under survey are widely dispersed and drawing a simple random sample may be impractical.
4.Cluster Sampling…….
Topic : Food Habits of Youth in Pune.
Population : 50,000
Sample : 500
Cluster Sampling : Procedure
Stage I : Selection of One Ward from each circle/zone
of Pune Municipality.
Stage II : Selection of 1000 households from selected
wards.
Stage III : Selection of 500 youth from 1000 Households
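The three-stage procedure can be sketched in Python. The frame below is entirely hypothetical (4 zones, 5 wards per zone, 400 households per ward, 1 to 3 youths per household) and is generated only so that the code runs; just the stage structure and the figures of 1000 households and 500 youths come from the example.

```python
import random

rng = random.Random(11)   # fixed seed so the run is repeatable

# Hypothetical frame: zone -> ward -> list of households -> list of youth IDs.
zones = {
    f"zone{z}": {
        f"z{z}-ward{w}": [
            [f"z{z}w{w}h{h}y{y}" for y in range(rng.randint(1, 3))]
            for h in range(400)
        ]
        for w in range(5)
    }
    for z in range(4)
}

# Stage I: one ward chosen at random from each zone.
chosen_wards = [rng.choice(sorted(wards)) for wards in zones.values()]

# Stage II: 1000 households chosen at random from the selected wards.
ward_lookup = {name: hh for wards in zones.values() for name, hh in wards.items()}
households = [hh for ward in chosen_wards for hh in ward_lookup[ward]]
chosen_households = rng.sample(households, 1000)

# Stage III: 500 youths chosen at random from those households.
youths = [y for hh in chosen_households for y in hh]
sample = rng.sample(youths, 500)

print(chosen_wards)
print(len(sample))
```

A sampling frame is needed only for the selected wards and households, not for all youth in the city, which is what makes the design practical for a dispersed population.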
Selection of Sample Units
• Tippett’s Random Numbers
• Lottery Method
• Sequential List
Tippett’s Random Numbers

-----------------------------------------------------------------------------------
29935 03 97 163175 52579 10478
15114 07 82 651890 77787 75510
03870 43 22 510589 87629 22039 3 7
79390 39 68 840756 45259 65959 9
30035 09 91 579196 54428 64819 8
29039 99 86 128759 79802 68531 11
78196 08 10 824107 49777 09599 22
15847 85 49 391442 91391 80130 10
36614 62 24 849194 97209 92587 49
40549 54 88 491465 43862 35541 24
40878 11 54 714286 09982 90308 54
-----------------------------------------------------------------------------------
Lottery method
• This is the most popular and simplest method. In this method all the items of the universe are numbered on separate slips of paper of the same size, shape and color. They are folded and mixed up in a drum, a box or a container. A blindfold selection is made. The required number of slips is selected for the desired sample size. The selection of items thus depends on chance.
• For example, if we want to select 5 plants out of 50 plants in a plot, we number the 50 plants first. We write the numbers from 1 to 50 on slips of the same size, roll them and mix them. Then we make a blindfold selection of 5 plants. This method is also called unrestricted random sampling because units are selected from the population without any restriction. This method is mostly used in lottery draws. If the population is infinite, this method is inapplicable. There is a lot of scope for personal prejudice if the size and shape of the slips are not identical.
Random number table method
• As the lottery method cannot be used when the population is infinite, the alternative is to use a table of random numbers. There are several standard tables of random numbers, but the credit for this technique goes to Prof. L.H.C. Tippett (1927), whose random number table consists of 10,400 four-figure numbers. There are various other random number tables: Fisher and Yates (1938), comprising 15,000 digits arranged in twos; Kendall and B.B. Smith (1939), consisting of 1,00,000 digits grouped into 25,000 sets of 4-digit random numbers; Rand Corporation (1955), consisting of 2,00,000 random numbers of 5 digits each; etc.
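One common way of using such a table, sketched below under stated assumptions, is to read the digits in groups as wide as the largest unit number, skip any group that falls outside 1..N or repeats an earlier selection, and stop once n units are chosen. The function name `select_from_table` and the row of digits are hypothetical; the digits are made up for this illustration and are not quoted from any published table.

```python
def select_from_table(digits, population_size, n, width=2):
    """Pick n distinct unit numbers by reading `width`-digit groups from a digit string."""
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        number = int(digits[i:i + width])
        # Keep the number only if it labels a real unit and was not already picked.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == n:
            break
    return chosen

# Hypothetical row of table digits (illustrative only).
table_row = "83471962055177234409281690356148"

# Select 5 plants out of 50, reading the row two digits at a time.
picked = select_from_table(table_row, population_size=50, n=5)
print(picked)   # [47, 19, 5, 23, 44]
```

Here the groups 83, 62, 51 and 77 are skipped because no plant carries those numbers, and reading stops as soon as five plants have been selected.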
Suggest appropriate sampling design
• The director of human resources of a manufacturing firm
wants to offer stress management seminars to the personnel
who experience high levels of stress. He conjectures that three
groups are most prone to stress: the workmen who constantly
handle dangerous chemicals, the foremen who are held
responsible for production quotas, and the counselors who, day
in and day out, listen to the problems of the employees,
internalize them and offer them counsel.
Sampling : methods
II. Non-probability sampling
1. Convenience/ Accidental
2. Quota Sampling
3. Purposive/ judgmental
4. Snowball Sampling
Non-Probability Sampling
• In a non-probability sampling design, the elements of the population do not have any known chance of being selected in the sample.
• It is not possible to specify, for each element of the population, the relative likelihood that it will be included in the sample.
• It is used in exploratory research.
• Random selection of elements is not necessary.
• It is used in case of ‘Infinite Population’.
1.Convenience Sampling
• Convenience sampling is often used in the pre-test phase
of a research study such as pre-testing of questionnaire.
• Convenience sampling is used to obtain information
quickly and inexpensively.
• The only criterion for selecting sampling units in this
scheme is the convenience of the researcher or the
investigator.
• It is commonly used in exploratory research.
• Example: Interview conducted by TV channel of people
coming out of cinema hall, to seek their opinion about the
movie.
2.Quota Sampling
• In quota sampling, the sample includes a minimum
number from each specified subgroups in the
population.
• In quota sampling, sample is selected on the basis of
certain demographic characteristics such as age, gender,
occupation, education etc.
• It does not require a sampling frame, is economical and does not take much time to set up.
• It may look similar to stratified sampling, but selection is on the basis of the convenience or judgment of the researcher and the results cannot be generalized.
3.Purposive/Judgmental Sampling
• Under judgmental sampling, experts in a particular
field choose what they believe to be the best sample
for the study in question.
• In this sampling, the judgment of an expert is used to
identify a representative sample.
• This sampling calls for special efforts to locate and gain access to the individuals who have the required information.
• It is used when the required information is possessed by a limited number or category of people.
4.Snowball Sampling
• It is generally used when it is difficult to
identify the members of the desired
population.
• Example: families with triplets, people using
walking sticks etc.
• Under this method, a respondent is interviewed and then asked to identify one or more other members of the desired population.
Suggest appropriate sampling design

• A company is considering operating an on-site kindergarten facility. But before taking further steps, it wants to get the reactions of four groups to the idea: employees who are parents of kindergarten-age children and where both parents work outside the home; employees who are parents of kindergarten-age children but where one parent does not work outside the home; single parents with kindergarten-age children; and all those without children of kindergarten age. Suggest a suitable sampling design.
Sampling Theory- Law of Statistical Regularity- Principle Of Inertia of
Large Numbers

• Basic Statistical Laws:
1. Law of Statistical Regularity: It states that a reasonably large number of items selected at random from a large group of items will, on average, represent the characteristics of the group.
2. Law of Inertia of Large Numbers: It states that large groups of data show a high degree of stability because there is a greater possibility that extremes on one side are compensated by extremes on the other side.
Central Limit Theorem
• Definition: The Central Limit Theorem states that when a large number of simple random samples are selected from the population and the mean is calculated for each, the distribution of these sample means will approximate the normal probability distribution, provided each sample is sufficiently large.
• In other words, the sample means will be approximately normally distributed when the population has a given mean and standard deviation and large random samples are selected from it, irrespective of whether the population is normal or skewed.
• Central Limit Theorem: If x1, x2, x3, …, xn is a random sample of size n drawn from any population having mean μ and variance σ², then the sample mean x̄ is approximately normally distributed with mean μ and variance σ²/n, provided n is sufficiently large (the approximation improves as n→∞).
Central Limit Theorem
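• Symbolically the central limit theorem can be explained as:
• When n independent random variables are given, each having the same distribution with mean μ and variance σ², and
• X = X1 + X2 + X3 + … + Xn, the mean and variance of X will be:

$$E(X) = n\mu, \qquad \mathrm{Var}(X) = n\sigma^2$$

• For large n, the standardized sum $(X - n\mu)/(\sigma\sqrt{n})$ is approximately standard normal.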
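A small simulation can make the theorem concrete. The sketch below is only an illustration under assumed settings: it uses a skewed exponential population with μ = 1 and σ² = 1, draws 4000 samples of size n = 50, and checks that the sample means centre on μ with variance close to σ²/n. The seed and sizes are arbitrary choices.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
n = 50                    # size of each sample
num_samples = 4000        # number of samples drawn

# Skewed population: exponential with rate 1, so mu = 1 and sigma^2 = 1.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)      # expected to be near mu = 1
var_of_means = statistics.pvariance(sample_means)   # expected to be near sigma^2 / n = 0.02

# Share of sample means within one standard error of mu (roughly 68% under normality).
se = (1 / n) ** 0.5
within_one_se = sum(abs(m - 1) <= se for m in sample_means) / num_samples

print(f"mean of sample means:     {mean_of_means:.3f}")
print(f"variance of sample means: {var_of_means:.4f}")
print(f"share within one SE:      {within_one_se:.2f}")
```

Even though individual exponential observations are strongly skewed, the sample means cluster symmetrically around μ, which is the behaviour the theorem describes.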


Sampling Distribution
• Definition: The sampling distribution helps in determining the degree to which the sample means from different samples differ from each other and from the population mean, and hence how close a particular sample mean is likely to be to the population mean.
• In other words, the sampling distribution constitutes the theoretical basis of inferential statistics, which involves determining the extent to which sample statistics vary from each other and from the population parameter. Here, the sample statistic is the sample mean, and the population parameter is the population mean.
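For a very small population the sampling distribution of the mean can be listed in full. The sketch below assumes a hypothetical population of five values and enumerates every possible sample of size 2 drawn without replacement.

```python
from itertools import combinations
from statistics import mean

population = [2, 4, 6, 8, 10]     # hypothetical population, N = 5
mu = mean(population)             # population mean (the parameter)

# Mean of every possible sample of size 2: this list is the sampling distribution.
sample_means = [mean(s) for s in combinations(population, 2)]

print(sorted(sample_means))       # [3, 4, 5, 5, 6, 6, 7, 7, 8, 9]
print(mean(sample_means) == mu)   # True
```

Individual sample means range from 3 to 9, so any one sample may miss the population mean of 6, but the sample means average out to exactly the population mean.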
