
ECONOMICS

COLLECTION, PRESENTATION AND ORGANIZATION OF DATA

DIVYA DHAWAN
B. ED, II YEAR, SECTION-C, JMI
WHAT IS DATA?

Data refers to statistics and facts gathered and arranged in a systematic order for a study. It helps users find
the relevant information. Scientists, economists, statisticians, and other individuals often store data for later
examination and evaluation.

Data is a collection of discrete values that convey information, describing quantity, quality, fact, statistics, other
basic units of meaning, or simply sequences of symbols that may be further interpreted.
WHAT IS COLLECTION OF DATA?

The collection of data aims to gather evidence for reaching a sound and clear solution to a
problem. To understand fluctuations in output, we need data on production. Collection of data is a
process conducted to measure and gather information. Data is a tool that aids in the
comprehension of problems by providing information.

WHAT ARE THE SOURCES OF DATA?


Statistical data can be obtained from two sources:

 Primary data
 Secondary data

Primary Data
 The enumerator (the person who collects the data) may gather the data by conducting an
enquiry or investigation. Such data are called Primary Data, as they are based on first-hand
information.
 Primary data are original, do not require any modification, and are costly.

Secondary data
 If the data have been collected and processed by another agency, they are called Secondary Data.
Usually, published data are secondary.
 They are already in existence and therefore are not original.
 They need to be adapted to suit the aim of the study at hand.
 Secondary data are inexpensive.

Primary Data v/s Secondary Data

 The term primary data refers to data originated by the researcher for the first time.
Secondary data is already existing data, collected earlier by investigating agencies and
organisations.
 Primary data is real-time data, whereas secondary data relates to the past.
 Primary data is collected for addressing the problem at hand while secondary data is
collected for purposes other than the problem at hand.
How do we collect Data?

Collection of data is important. It is done in the following ways:

Surveys

 A survey aims to describe characteristics such as cost, worth and utility (in the case of a
product) or reputation, honesty and loyalty (in the case of a candidate).
 A survey is a method of gathering information from individuals; its objective is to collect data.

Preparation of Instrument

The most common type of instrument used in surveys is a questionnaire/interview schedule. The
questionnaire is either self-administered by the respondent or administered by the enumerator or a
trained investigator. While drawing up the questionnaire/interview schedule, the following points
should be kept in mind:

 The questionnaire should not be lengthy.
 The sequence of questions should move from general to specific.
 Questions should not be ambiguous.
 Questions should not use double negatives.
 Questions should not be leading.
 Questions should not suggest alternatives to the answer.

Mode of Data Collection

The purpose of asking questions in a survey is to acquire data. There are three ways of collecting data:

1. Personal Interviews
2. Mailing (questionnaire) Surveys
3. Telephone Interviews
Personal Interviews

In this method, the researcher plays the main role, conducting the interviews face to face with
the respondents. Personal interviews are preferred for several reasons:

 Highest response rate
 Allows use of all types of questions
 Better for using open-ended questions
 Allows clarification of ambiguous questions

The personal interview has some demerits too:

 Most expensive
 Possibility of influencing respondents
 More time-consuming

Mailing Questionnaire

In such a method, the data is collected through mail. The questionnaire is mailed to each person and
a request is attached to complete and return it on time.

The advantages of this method are:

 Least expensive
 The only method to reach remote areas
 No influence on respondents
 Maintains anonymity of respondents
 Best for sensitive questions

The disadvantages of mail survey are:

 Cannot be used by illiterates
 Long response time
 Does not allow an explanation of ambiguous questions
 Reactions cannot be watched
Telephone Interviews

In telephone interviews, the investigator asks questions over the telephone.

The advantages of telephone interviews are:

 Relatively low cost
 Relatively less influence on respondents
 Relatively high response rate

The disadvantages of this method are:

 Limited use
 Reactions cannot be watched
 Possibility of influencing respondents
ORGANIZATION OF DATA

Tests, experiments and survey studies in education and psychology provide us with valuable data, mostly in
the shape of numerical scores. These data, in their original form, have little meaning for the
investigator or reader. To understand their meaning and derive useful conclusions, the data have to
be organized or arranged in some systematic way. The organization and arrangement of original or
computed statistics in a proper way for deriving useful interpretation is termed organization of data. In
general, this task can be carried out in the following ways:

1. Organization in the form of statistical tables

2. Organization in the form of Rank order

3. Organization in the form of Frequency Distribution

Classification
Classification is the process of arranging things in groups or classes according to their
resemblances and affinities and gives expression to the unity of attributes that may exist amongst a
diversity of individuals.

Objectives of Classification
 Simplification and Briefness
 Utility
 Distinctiveness
 Comparability
 Scientific arrangement
 Attractive and effective

Characteristics of a Good Classification

 Comprehensiveness
 Clarity
 Homogeneity
 Suitability
 Stability
 Elasticity

Basis of Classification

 Geographical Classification: This classification of data is based on the geographical or
locational differences of the data.
 Chronological Classification: When data are classified on the basis of time, it is known as
chronological classification.
 Qualitative Classification: This classification is according to qualities or attributes of the
data. It may be of two types:
o Simple classification
o Manifold classification
 Quantitative or Numerical Classification: Data are classified into classes or groups on the
basis of their numerical values. Quantitative classification is also called classification by
variables.
 Concept of Variable: A characteristic or phenomenon that is capable of being
measured and changes its value over time is called a variable.
A variable may be either discrete or continuous:
o Discrete Variable: Variables that increase in jumps or in complete numbers are called
discrete variables.
o Continuous Variable: Variables that assume a range of values, or increase
not in jumps but continuously or in fractions, are called continuous
variables.
 Raw Data: A mass of data in its crude form is called raw data.
Statistical Series :-

Systematic arrangement of statistical data

I. Can be on the basis of individual units :- The data can be individually presented in two forms:

i] Raw data : Data collected in original form.

ii] Individual Series : The arrangement of raw data individually. It can be expressed in two ways.
a] Alphabetical arrangement : Alphabetical order
b] Array : Ascending or descending order.

II. Can be on the basis of Frequency Distribution :-Frequency distribution refers to a table in which
observed values of a variable are classified according to their numerical magnitude.

1. Discrete Series :-

A variable is called discrete if the variable can take only some particular values.

2. Continuous Series :-

A variable is called continuous if it can take any value in a given range. In constructing continuous series
we come across terms like:

a] Class : Each given interval is called a class, e.g., 0-5, 5-10.
b] Class limit : Each class has two limits, the upper limit and the lower limit.
c] Class interval : Difference between the upper limit and the lower limit of a class.
d] Range : Difference between the highest and the lowest value of the series.
e] Mid-point or Mid Value : Half the sum of the upper limit and the lower limit of a class.
f] Frequency : Number of items [observations] falling within a particular class.
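
As a rough illustration of the terms above, the following Python sketch computes the class interval (width) and the mid-point of a class from its two limits. The function names are hypothetical, and the mid-point is assumed to be the average of the two limits, which is the usual convention.

```python
def class_width(lower, upper):
    """Class interval: difference between the upper and lower class limits."""
    return upper - lower


def mid_point(lower, upper):
    """Mid value: average of the two class limits (assumed convention)."""
    return (lower + upper) / 2


# The classes 0-5 and 5-10 are the ones used as examples in the text.
for lower, upper in [(0, 5), (5, 10)]:
    width = class_width(lower, upper)
    middle = mid_point(lower, upper)
    print(f"{lower}-{upper}: class interval = {width}, mid-point = {middle}")
```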

i] Exclusive Series : All items of a class are included in it except those equal to its upper limit,
which fall in the next class. E.g.,

Marks                0-10   10-20   20-30   30-40
Number of Students   2      5       2       1

ii] Inclusive Series : Upper class limits of classes are included in the respective classes. E.g.,

Marks                0-9   10-19   20-29
Number of Students   2     5       2
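
The difference between the two kinds of series lies in how a value equal to an upper class limit is treated. The sketch below is only illustrative: the function names are hypothetical, and the class limits are the ones from the two examples in the text.

```python
def find_class_exclusive(value, classes):
    """Exclusive series: lower limit included, upper limit excluded."""
    for lower, upper in classes:
        if lower <= value < upper:
            return (lower, upper)
    return None


def find_class_inclusive(value, classes):
    """Inclusive series: both the lower and the upper limit are included."""
    for lower, upper in classes:
        if lower <= value <= upper:
            return (lower, upper)
    return None


exclusive_classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
inclusive_classes = [(0, 9), (10, 19), (20, 29)]

# A mark of 10 equals the upper limit of 0-10, so it is counted in 10-20.
print(find_class_exclusive(10, exclusive_classes))
# A mark of 19 equals the upper limit of 10-19 and stays in that class.
print(find_class_inclusive(19, inclusive_classes))
```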

Open End Classes :

The lower limit of the first class and the upper limit of the last class are not given. E.g.,

Marks                Below 20   20-30   30-40   40-50   50 and above
Number of Students   7          6       12      5       3

iii] Cumulative Frequency Series : It is obtained by successively adding the frequencies of the
classes in a definite order.
a] ‘Less than’ Cumulative Frequency Distribution : The frequencies of the class intervals are added
successively, starting from the lowest class.
b] ‘More than’ Cumulative Frequency Distribution : The more than cumulative frequency is obtained by
finding the cumulative totals of frequencies starting from the highest value of the variable to the lowest
value.
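
The two cumulative distributions can be sketched in Python as running totals. This is a minimal illustration, assuming the class frequencies 2, 5, 2, 1 from the exclusive series example (marks 0-10 to 30-40); the variable names are hypothetical.

```python
from itertools import accumulate

classes = ["0-10", "10-20", "20-30", "30-40"]
frequencies = [2, 5, 2, 1]

# 'Less than': running total from the lowest class upwards.
less_than = list(accumulate(frequencies))

# 'More than': running total from the highest class downwards,
# reversed again so that it lines up with the classes.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for label, f, lt, mt in zip(classes, frequencies, less_than, more_than):
    print(f"{label:>6}  frequency={f}  less-than={lt}  more-than={mt}")
```

Here the 'less than' column reads against the upper limit of each class (7 students scored less than 20), while the 'more than' column reads against the lower limit (8 students scored 10 or more).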
PRESENTATION OF DATA

Once the data collection is over, the investigator needs to present the data in a meaningful,
efficient and easily understood way, so that the main features of the data can be identified at a glance.
Generally, data can be presented in three different forms: the textual method, the tabular method and
the graphical method.

Presentation of Data Examples

Now, let us discuss how to present the data in a meaningful way with the help of examples.

Example 1:

Consider the marks given below, which are obtained by 10 students in Mathematics: 36, 55, 73, 95, 42,
60, 78, 25, 62, 75.

Find the range for the given data.

Solution:

Given Data: 36, 55, 73, 95, 42, 60, 78, 25, 62, 75.

The data given is called the raw data.

First, arrange the data in the ascending order: 25, 36, 42, 55, 60, 62, 73, 75, 78, 95.

Therefore, the lowest mark is 25 and the highest mark is 95.

We know that the range of the data is the difference between the highest and the lowest value in the
dataset.

Therefore, Range = 95-25 = 70.
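
The steps of this solution can be sketched in Python as follows. It is only an illustration of the calculation, and the variable names are hypothetical.

```python
marks = [36, 55, 73, 95, 42, 60, 78, 25, 62, 75]  # raw data from Example 1

ordered = sorted(marks)                 # arrange in ascending order
lowest, highest = ordered[0], ordered[-1]
data_range = highest - lowest           # range = highest value - lowest value

print(ordered)
print(f"Range = {highest} - {lowest} = {data_range}")
```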

Note: Presentation of data in ascending or descending order can be time-consuming if we have a larger
number of observations in an experiment.

Now, let us discuss how to present the data when the number of observations in an experiment is
comparatively large.

Example 2:
Consider the marks obtained by 30 students in Mathematics (out of 100 marks):

10, 20, 36, 92, 95, 40, 50, 56, 60, 70, 92, 88, 80, 70, 72, 70, 36, 40, 36, 40, 92, 40, 50, 50,
56, 60, 70, 60, 60, 88.

Solution:

In this example, the number of observations is larger than in example 1, so presenting the
data in ascending or descending order is time-consuming. Instead, we can use an ungrouped
frequency distribution table, or simply a frequency distribution table. In this method, the data are
arranged in tabular form, with each distinct value listed alongside its frequency, i.e. the number of
times it occurs.

For example, 3 students scored 50 marks. Hence, the frequency of 50 marks is 3. Now, let us construct the
frequency distribution table for the given data.

Therefore, the presentation of data is given as below:

Marks   Frequency (Number of students)
10      1
20      1
36      3
40      4
50      3
56      2
60      4
70      4
72      1
80      1
88      2
92      3
95      1
Total   30
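
A frequency distribution table like this one can be built by counting how often each mark occurs. The sketch below uses `collections.Counter` from the Python standard library on the marks of Example 2; it is one possible way of doing the count, not the only one.

```python
from collections import Counter

# Marks of the 30 students from Example 2.
marks = [10, 20, 36, 92, 95, 40, 50, 56, 60, 70, 92, 88, 80, 70, 72,
         70, 36, 40, 36, 40, 92, 40, 50, 50, 56, 60, 70, 60, 60, 88]

frequency = Counter(marks)  # maps each distinct mark to its frequency

print("Marks  Frequency")
for mark in sorted(frequency):
    print(f"{mark:>5}  {frequency[mark]:>9}")
print(f"Total  {sum(frequency.values()):>9}")
```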

The following example shows the presentation of data for the larger number of observations in an
experiment.
Example 3:

Consider the marks obtained by 100 students in Mathematics (out of 100 marks):

95, 67, 28, 32, 65, 65, 69, 33, 98, 96, 76, 42, 32, 38, 42, 40, 40, 69, 95, 92, 75, 83, 76, 83,
85, 62, 37, 65, 63, 42, 89, 65, 73, 81, 49, 52, 64, 76, 83, 92, 93, 68, 52, 79, 81, 83, 59, 82,
75, 82, 86, 90, 44, 62, 31, 36, 38, 42, 39, 83, 87, 56, 58, 23, 35, 76, 83, 85, 30, 68, 69, 83,
86, 43, 45, 39, 83, 75, 66, 83, 92, 75, 89, 66, 91, 27, 88, 89, 93, 42, 53, 69, 90, 55, 66, 49,
52, 83, 34, 36.

Solution:

Now we have 100 observations to present, more than in example 1 and example 2. These data can be
arranged in a tabular form called the grouped frequency table. We group the given data into
20-29, 30-39, 40-49, …, 90-99 (as our data range from 23 to 98). Each group is called a
“class interval” or “class”, and the size of the class is called the “class-size” or “class-width”.

In this case, the class size is 10. Each class has a lower-class limit and an upper-class limit. For
example, if the class interval is 30-39, the lower-class limit is 30 and the upper-class limit is 39.
Thus, the least number in the class interval is called the lower-class limit and the greatest number in
the class interval is called the upper-class limit.

Hence, the presentation of data in the grouped frequency table is given below:

Class Interval (Marks)   Frequency (Number of students)
20 – 29                  3
30 – 39                  14
40 – 49                  12
50 – 59                  8
60 – 69                  18
70 – 79                  10
80 – 89                  23
90 – 99                  12
Total                    100

Hence, presenting the data in this form simplifies them and enables the
observer to understand the main features of the data at a glance.
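
The grouping in Example 3 can be sketched in Python by mapping every mark to its class of width 10 and counting the marks in each class. The helper name `class_limits` is hypothetical, and the sketch assumes inclusive classes such as 20-29 and 30-39, as used in the example.

```python
from collections import Counter

# Marks of the 100 students from Example 3.
marks = [
    95, 67, 28, 32, 65, 65, 69, 33, 98, 96, 76, 42, 32, 38, 42, 40, 40, 69, 95, 92,
    75, 83, 76, 83, 85, 62, 37, 65, 63, 42, 89, 65, 73, 81, 49, 52, 64, 76, 83, 92,
    93, 68, 52, 79, 81, 83, 59, 82, 75, 82, 86, 90, 44, 62, 31, 36, 38, 42, 39, 83,
    87, 56, 58, 23, 35, 76, 83, 85, 30, 68, 69, 83, 86, 43, 45, 39, 83, 75, 66, 83,
    92, 75, 89, 66, 91, 27, 88, 89, 93, 42, 53, 69, 90, 55, 66, 49, 52, 83, 34, 36,
]

CLASS_WIDTH = 10


def class_limits(mark, width=CLASS_WIDTH):
    """Return (lower limit, upper limit) of the inclusive class containing mark."""
    lower = (mark // width) * width
    return lower, lower + width - 1


grouped = Counter(class_limits(mark) for mark in marks)

print("Class Interval  Frequency")
for lower, upper in sorted(grouped):
    print(f"{lower} - {upper}        {grouped[(lower, upper)]}")
print(f"Total          {sum(grouped.values())}")
```

Changing `CLASS_WIDTH` gives a coarser or finer table from the same raw data.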
CONCLUSION

Data collection tools are key to analyzing the data that have been collected over the course of a test or
trial phase. Using a mix of quantitative and qualitative metrics will give a broad range of data to be
considered. The tools selected will autonomously calculate a continuous metric result for spot checking.
Defining the resources needed is critical to the success of any effort to systematically improve the
processes at an organization. Building and selecting tools that are designed with the data in mind will
prove very valuable during the data collection and analysis phase of the test.
