
Collection of Data

Statistics
Class - XI
The students will be able to:

➢ Define the key terms
➢ Differentiate between primary and secondary data
Synopsis
1. Key terms
2. Sources of data
3. Surveys
4. Census or complete Enumeration
5. Sample survey
6. Questionnaire
7. Types of questions
8. Open ended Vs Close ended
9. Mode of data collection
10. Pilot survey
11. Methods of sampling
12. Important agencies
Key Terms

Variable: A quantity whose value changes, such as the production of food grains
per annum or the temperature of a city. Variables are represented by the
letters X, Y or Z.

Observation: The value of a variable.


Data: Observations corresponding to different variables.

Statistical Investigation: A search for information conducted by
using statistical methods.
Key Terms
Investigator: The person who
conducts a statistical investigation.

Enumerator: A person who helps the
investigator in the collection of data.

Respondents: The persons from
whom statistical information is
collected.

Statistical Unit: The items on which
measurements are taken. Example:
weight in kg.
Key Terms
➢ Population or the Universe: The
totality of the items under
study.
➢ Sample: A group or
section of the population from
which information is to be
obtained.
➢ Good sample: It is smaller than
the population and is capable of
providing accurate information
about the population at lower
cost and in less time.
Classification of data
Statistical data may be classified as primary and secondary.

Types:
➢ Primary Data
➢ Secondary Data


Choice of methods

The investigator must choose between the
two methods:

➢ Primary Data

➢ Secondary Data
DEFINITIONS

“Data originally collected in the process of
investigation are known as primary data.” - Wessel

“Data collected by other persons are called
secondary data.” - Wessel
Sources of Data
1. Primary Data: Data collected by the enumerator by
conducting an enquiry or an investigation. They are based on
first-hand information.
For example, you may have to question a large number
of school students to collect
the desired information.
A primary source of data yields primary data.

2. Secondary Data: Data that have already been collected and
processed by some other agency. They are based on second-hand
information.
For example, information obtained from published sources
such as government reports and newspapers.
A secondary source of data yields secondary data.
DIFFERENCES BETWEEN
PRIMARY AND SECONDARY
DATA

• Difference in originality
• Difference in objective
• Difference in cost of collection
1. Differentiate between primary and secondary
data.
Day 2 – 2 periods
The students will be able to:
➢ Explain the methods of collecting primary data.
The method of data collection depends upon a
number of factors:

➢ Object and nature of enquiry


➢ Availability of financial resources
➢ Availability of time
➢ Accuracy required
➢ Collecting agencies
Primary Data

✓ Primary data are collected from the field under the control
and supervision of an investigator.

✓ Such data are generally fresh and collected for
the first time.

✓ They are original in character.

For the collection of primary data, the investigator must
choose one of the following methods:

➢ Direct personal observation


➢ Indirect oral interview
➢ Information through agencies
➢ Mailed questionnaires
➢ Schedules sent through enumerator
➢ Telephonic Interviews
➢ Direct personal observation:

✓ The data are collected by the investigator personally. He/she
must be a keen observer, tactful and courteous in behavior.
✓ The investigator establishes direct contact with the
persons from whom the information is to be obtained.
✓ The investigator should be diligent, efficient, impartial and
tolerant.
✓ Example: Direct contact with workers of an industry to
obtain information about their economic conditions.
➢ Direct personal observation:

Suitability:

Direct personal observation is adopted in the following cases:

✓ Where greater accuracy is needed


✓ Where the field of enquiry is not large
✓ Where confidential data are to be collected
✓ Where sufficient time is available
✓ When direct contact with the respondents is required
➢ Direct personal observation:
Merits:
✓ Original data are collected
✓ True and reliable data can be had
✓ Response will be more encouraging, because of personal
approach

✓ A high degree of accuracy can be aimed at

Demerits:

✓ It is unsuitable where the area is large


✓ It is expensive and time-consuming
✓ An untrained investigator will not bring good results
✓ One has to collect information according to the
convenience of the informant
➢ Indirect oral interview

✓ The investigator approaches witnesses or third parties who
are in touch with the informant.
✓ The enumerator interviews people who are directly or
indirectly connected with the problem under study.
✓ This method is generally employed by enquiry
committees and commissions.
✓ The police department generally adopts this method to get clues
about thefts, riots, murders, etc.
➢ Indirect oral interview

Suitability:

✓ It is more suitable when the area to be studied is large.

✓ It is used when direct information cannot be obtained.

✓ This system is generally adopted by governments.


➢ Indirect oral interview
Merits
✓ It is simple and convenient.
✓ It saves time, money and labor.
✓ It can be used in the investigation of a large area.
✓ Adequate information can be had.
Demerits
✓ The information cannot be fully relied upon because of the absence of
direct contact.
✓ Interviewing an unsuitable person will spoil the results.
✓ To get the real position, a sufficient number of people must
be interviewed.
✓ A careless attitude on the part of the informant will affect the degree
of accuracy.
➢ Information through agencies

✓ Local agents or correspondents are appointed. They
collect the information and transmit it to the office or
person concerned.

✓ They work according to their own ways and tastes.

✓ This system is adopted by newspapers, agencies, etc.,
when information is needed in different fields.
✓ The informants are generally called correspondents.

Suitability:
In those cases where the information is to be obtained at
regular intervals from a wide area
➢ Information through agencies
Merits

✓ Extensive information can be had.


✓ It is the cheapest and most economical method.
✓ Speedy information is possible.
✓ It is useful where information is needed regularly.

Demerits
✓ The information may be biased.
✓ Degree of accuracy cannot be maintained.
✓ Uniformity cannot be maintained.
✓ Data may not be original.
Explain the methods of collecting primary data with
examples.
Day 2 – 2 periods
➢ The students will be able to :
➢ To explain the qualities of a good
questionnaire.
➢ To frame/ design a questionnaire
Questionnaire
The most common type of instrument used in surveys for
collecting primary data is the questionnaire.
While preparing a questionnaire, the following points should be kept in mind:

• It should not be too long


• The series of questions should move from general to specific
• The questions should be precise and clear
• The questions should not be ambiguous
• The question should not use double negatives
• The question should not be a misleading question
• The question should not indicate alternatives to the answer
Types of Questions in Questionnaire

1. Closed-ended or structured questions: These may be
two-way questions or multiple-choice questions. When
there are only 2 possible answers, it is called a two-way
question. When more than 2 answers are possible,
it is called a multiple-choice question.
2. Open-ended or unstructured questions: These are
descriptive in type and give the respondent a chance to talk
more about a topic.
Closed-Ended versus Open-Ended Questions

Closed-Ended Question         | Open-Ended Question
Easier to compare responses   | Detailed and qualified responses
Quicker and easier answers    | Unlimited possible answers
Easy to interpret             | Difficult to interpret
Easy to score                 | Difficult to score
Easy to codify for analysis   | Difficult to codify for analysis

Example (closed-ended): Do you smoke? Do you like to play cricket?
Example (open-ended): What is your view about globalization?
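Closed-ended answers are easy to codify because every response falls into a known category. The short sketch below illustrates this by tallying replies to a two-way question. The list of responses is hypothetical and is used only for illustration.

```python
from collections import Counter

# Hypothetical replies to the two-way question "Do you smoke?"
responses = ["Yes", "No", "Yes", "Yes", "No"]

# Each reply belongs to one of two known categories, so counting is enough.
tally = Counter(responses)

print("Yes:", tally["Yes"])
print("No:", tally["No"])
```

Open-ended replies cannot be counted this way until someone reads them and assigns each one to a category, which is why they are harder to score.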
ENUMERATOR’S METHOD
• Enumerator approaches the respondent with questionnaire.
• The questionnaires which are filled by the enumerators themselves by
putting questions are called schedules.
SUITABILITY
• Field of investigation is large
• The investigation needs specialized and skilled investigators
• The investigators are well versed in the local language and cultural
norms of the respondents
MERITS
• Wide coverage
• Accuracy
• Personal contact
• Impartiality
• Completeness
DEMERITS
• Expensive
• Trained enumerators may not be available
• Time consuming
• Not suitable for private investigation (only used by Govt.)
• Enumerators may be partial (biased)
➢ Mailed questionnaires
✓ In this method, a questionnaire consisting of a list of questions
pertaining to the enquiry is prepared.
✓ The questionnaire is sent to the respondents, with blank
spaces for answers.
✓ A covering letter is also sent along with the questionnaire,
requesting the respondents to extend their full cooperation by giving
correct replies.
✓ This method is adopted by research workers, private individuals,
non-official agencies and State and Central Governments.

Suitability:
This method is appropriate in cases where informants are
spread over a wide area
➢ Mailed questionnaires
Merits
✓ Of all the methods, the mailed questionnaire is the most
economical.
✓ It can be widely used, when the area of investigation is
large.

✓ It saves money, labor and time.


Demerits
✓ We cannot be sure about the accuracy and reliability of the
data.

✓ There is often a long delay in receiving the duly filled-in questionnaires.


Pilot Survey/ Pre- Testing
It is a trial survey which helps to test the effectiveness of
the questionnaire on a small group.

Importance:
1. It helps in pre-testing the questionnaire, so as to
know the shortcomings and drawbacks of the
questions.

2. It also helps in assessing the suitability of questions,
clarity of instructions, performance of enumerators, and
the cost and time involved.
Mode of Data Collection
Design a questionnaire on the popularity
of veg noodles.
Day 4 – 2 periods
➢ The students will be able to :
➢ To explain the meaning of sampling and census
➢ To list the sources of sampling
Secondary Data

Secondary data are those data which have been already collected
and analysed by some earlier agency for its own use and later the
same data are used by a different agency.

Sources of
Secondary Data

Published Sources Unpublished Sources


Published sources:
Various governmental, international and local agencies publish
statistical data, and chief among them are:
✓ International publications: Publications of international bodies such as the U.N.O., the I.M.F., etc.
✓ Official publications of Central and State Govts.: Reserve Bank
of India Bulletin, Census of India, Indian Trade Journal, etc.

✓ Semi-official publications: Semi-Govt. institutions like
Municipal Corporations, District Boards, Panchayats, etc. publish
reports.
✓ Publications of research institutions: The Indian Statistical
Institute (I.S.I.), the Indian Council of Agricultural Research
(I.C.A.R.), etc. publish the findings of their research programmes.
✓ Journals and newspapers: Current and important material on
statistics and socio-economic problems can be obtained from
journals and newspapers like Economic Times, Commerce,
Indian Finance, etc.
Unpublished sources:
There are various sources of unpublished data. They are
the records maintained by various government and private
offices, the researches carried out by individual research
scholars in the universities or research institutes.

We must take extra care when using secondary data.

According to Prof. Bowley, “It is never safe to take published
statistics at their face value without knowing their meaning and
limitations and it is always necessary to criticize arguments that can
be based on them.”
Precautions in the use of Secondary Data:

Before using the secondary data, the investigators should


consider the following factors:

✓ Suitability of the data

✓ Adequacy of the data

✓ Reliability of data
Census or Complete Enumeration
A census is a survey which includes every element of the population. It covers
every individual unit in the entire population. An example is the
Census of India, which is carried out every ten years. Such
surveys collect demographic data on births, deaths, literacy, etc.

Advantages                                              | Disadvantages
Results are absolutely correct, accurate and reliable   | A lot of time, energy and money is required to collect data
Less chance of bias                                     | Suitable only for certain specific cases
Data related to each element is collected               | A large number of enumerators are required for collecting data
Sample Survey
In this method, a sample from the population is surveyed. The first step in
selecting a sample is to identify the population. Then a
representative sample is selected, as it is difficult to study the entire population.

Example: Population, research, etc.

Advantages                                            | Disadvantages
Economical as only some units are studied             | Partial investigation of the universe
Not time consuming                                    | Not easy to select a sample which represents the whole population
Less effort is required as a small portion is studied | It is a complicated and difficult process
Differentiate between Census and Sampling.
The students will be able to :
➢ Explain with examples the different methods of sampling
➢ To list the agencies of secondary data.
Methods Of Sampling
1. Random Sampling: A method in which sample units are selected at
random. In this method, every individual unit has an equal
chance of being selected. Methods under random sampling:
a) Lottery Method: All the items in the
population are assigned distinct numbers, which are
written on identical pieces of paper and put in a bowl.
The sample is then drawn at random.
b) Table of Random Numbers: Random numbers are
arranged in rows and columns, and numbers are read from the table
according to the population size to pick the sample units.
c) Exit Polls: These are used to predict election results. In this
technique, a random sample of voters leaving the
polling booths is asked whom they voted for.
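The lottery method can be imitated on a computer. The sketch below is only an illustration: the population of 20 numbered units, the sample size of 5 and the function name `lottery_sample` are assumptions, not part of the lesson.

```python
import random

def lottery_sample(population, sample_size, seed=None):
    """Draw units at random without replacement, like slips from a bowl."""
    # The seed is an assumption added so that a run can be repeated exactly.
    rng = random.Random(seed)
    return rng.sample(population, sample_size)

# Hypothetical population: units numbered 1 to 20.
population = list(range(1, 21))
sample = lottery_sample(population, 5, seed=42)
print(sorted(sample))
```

Because every slip is equally likely to be drawn, each unit has the same chance of entering the sample, which is the defining property of random sampling.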
Methods Of Sampling
2. Non-Random Sampling: In this all the units of the
population do not have an equal chance of being selected.
Methods under this are:
a) Judgement/Purposive/Deliberate Sampling: Here
sample units are selected consciously by the investigator
on the basis of his judgement. This method is subject to
personal bias of investigator.
b) Quota Sampling: The population is divided into different
groups or classes according to different characteristics of
the population. The investigator selects the fixed number
of items from each group to frame a sample.
c) Convenience Sampling: Here the investigator collects the
sample units on the basis of his convenience.
➢ Systematic Sampling
➢ In this type of sampling, the first individual
is selected randomly and the others are selected
using a fixed ‘sampling interval’. Let’s take a
simple example to understand this.

➢ Say our population size is x and we have to
select a sample of size n. Then the next
individual that we select will be
x/n positions away from the first
individual. We can select the rest in the
same way. It is a short-cut method of random
sampling.
Suppose we have a population of 20, begin with person number 3, and want a
sample size of 5. The next individual that we select
will be at an interval of (20/5) = 4 from the 3rd person,
i.e. person 7 (3+4), and so on.

3, 3+4=7, 7+4=11, 11+4=15, 15+4=19

The sample is therefore 3, 7, 11, 15, 19.

Systematic sampling is more convenient than simple
random sampling. However, it might also lead to bias if
there is an underlying pattern in which we are selecting
items from the population (though the chances of that
happening are quite rare).
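The arithmetic of the systematic sampling example (interval 20/5 = 4, starting at person 3) can be sketched as a small function. The function name and the check on the starting point are assumptions made for illustration. In practice the starting point would be chosen at random from the first interval.

```python
def systematic_sample(population_size, sample_size, start):
    """Pick every k-th unit, where k = population_size // sample_size."""
    interval = population_size // sample_size
    # The first unit should lie within the first interval (1 to k).
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the sampling interval")
    return [start + i * interval for i in range(sample_size)]

# Population of 20, sample of 5, starting at person number 3.
print(systematic_sample(20, 5, start=3))  # [3, 7, 11, 15, 19]
```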
Stratified Sampling
In this type of sampling, we divide the
population into subgroups (called strata)
based on different traits like gender, category,
etc. We then select the sample from each of
these subgroups.
For example, suppose we first divide our population into subgroups based on
the colors red, yellow, green and blue. Then, from each color,
we select individuals in proportion to their numbers in the
population.
We use this type of sampling when we want representation from
all the subgroups of the population. However, stratified sampling
requires proper knowledge of the characteristics of the population.
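Proportional selection from strata can be sketched as follows. The four color groups follow the example in the text, but the group sizes (8, 6, 4 and 2 units), the sample size of 10 and the function name are assumptions chosen so that the shares come out as whole numbers.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Draw from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)  # seed is only for repeatable runs
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Rounding can make the shares differ slightly from sample_size
        # when the proportions are not whole numbers.
        share = round(sample_size * len(units) / total)
        sample[name] = rng.sample(units, share)
    return sample

# Hypothetical population of 20 units split into four color strata.
strata = {
    "red": list(range(1, 9)),      # 8 units
    "yellow": list(range(9, 15)),  # 6 units
    "green": list(range(15, 19)),  # 4 units
    "blue": list(range(19, 21)),   # 2 units
}
sample = stratified_sample(strata, 10, seed=1)
for name, units in sample.items():
    print(name, len(units))
```

With these sizes the sample contains 4 red, 3 yellow, 2 green and 1 blue unit, so every subgroup is represented in the same proportion as in the population.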
Cluster Sampling
In a clustered sample, we use the subgroups of the
population as the sampling unit rather than
individuals. The population is divided into subgroups,
known as clusters, and a whole cluster is randomly
selected to be included in the study.
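The idea that whole clusters, not individuals, are drawn can be sketched like this. The four clusters of five units each, the choice of two clusters and the function name are assumptions made for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every unit inside them."""
    rng = random.Random(seed)  # seed is only for repeatable runs
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical population of 20 units grouped into four clusters.
clusters = {
    "A": [1, 2, 3, 4, 5],
    "B": [6, 7, 8, 9, 10],
    "C": [11, 12, 13, 14, 15],
    "D": [16, 17, 18, 19, 20],
}
selected = cluster_sample(clusters, 2, seed=7)
for name, units in selected.items():
    print(name, units)
```

Unlike stratified sampling, some subgroups are left out entirely, while every member of a chosen cluster is studied.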
Types of Non-Probability Sampling
Convenience Sampling
This is perhaps the easiest method of sampling because
individuals are selected based on their availability and
willingness to take part.

Here, let’s say individuals numbered 4, 7, 12, 15 and 20 want to
be part of our sample, and hence we include them in the
sample.
Convenience sampling is prone to significant bias, because the
sample may not be representative of specific
characteristics of the population, such as religion or
gender.
Quota Sampling
In this type of sampling, we choose items based on
predetermined characteristics of the
population. Consider that we have to select
individuals having a number in multiples of four for
our sample.
Therefore, the individuals numbered 4, 8, 12, 16, and 20 are already
reserved for our sample.
In quota sampling, the chosen sample might not be the best representation
of the characteristics of the population that weren’t considered.
Judgment Sampling
It is also known as selective sampling. It
depends on the judgment of the experts when
choosing whom to ask to participate.

Suppose our experts believe that people numbered 1, 7, 10, 15
and 19 should be considered for our sample, as they may help us to
infer the population in a better way. As you can imagine, judgment
sampling is also prone to bias by the experts and may not
necessarily be representative.
Sampling and Non Sampling Errors
Error in statistics is used to denote the difference between the true
value and the estimated value. Errors can be classified as:
1. Sampling Errors: The difference between the actual value of a
parameter of the population (which is not known) and its
estimate from the sample (which is known). It is possible to reduce sampling error by
increasing the size of the sample.
2. Non-Sampling Errors: These include:
• Errors in Data Acquisition: These arise from recording incorrect
responses.
• Non-Response: It occurs when the interviewer is unable to
contact a person listed in the sample.
• Sampling Bias: It occurs when the sampling plan is such that some
members of the target population could not be included.
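The claim that a larger sample reduces sampling error can be checked with a small simulation. Everything in the sketch below is an assumption made for illustration: the artificial population of 10,000 values between 140 and 190, the sample sizes 25 and 400, and the 200 repeated draws. In a real survey the true mean is unknown, so the error could not be computed this way.

```python
import random
import statistics

def mean_abs_sampling_error(population, sample_size, trials, rng):
    """Average gap between the sample mean and the true population mean."""
    true_mean = statistics.mean(population)
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)
        errors.append(abs(statistics.mean(sample) - true_mean))
    return statistics.mean(errors)

rng = random.Random(0)  # fixed seed so that the run can be repeated
# Hypothetical population, e.g. heights in cm of 10,000 students.
population = [rng.randint(140, 190) for _ in range(10_000)]

small = mean_abs_sampling_error(population, 25, 200, rng)
large = mean_abs_sampling_error(population, 400, 200, rng)
print("average error with n = 25: ", round(small, 2))
print("average error with n = 400:", round(large, 2))
```

The average error for the sample of 400 comes out smaller than for the sample of 25. Non-sampling errors, by contrast, are not reduced by taking a larger sample.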
Important Agencies of Secondary Data

1. Census of India: It provides the
most important and complete
demographic record of the population.
It is conducted every 10
years. The census officials collect
information on various aspects of the
population such as sex ratio,
literacy, migration, etc.
These data are interpreted and
analysed to understand many
economic and social issues in India,
and plans and policies
are formulated accordingly.
Important Agencies of Secondary Data
2. National Sample Survey
Organisation (NSSO): It was established
in 1950 under the Ministry of
Finance to conduct surveys and
collect data on estimates of
literacy, school enrolment,
maternity, PDS, etc., and to publish
the survey results through reports.
The NSSO conducts continuous
surveys on various problems in
successive rounds.
With examples explain the difference between
random and non-random sampling.
QUESTIONNAIRE TOPIC

You want to research the popularity of potato chips among children.
Design a suitable questionnaire for collecting this information.
Thank You
