Organisation of Data

ORGANISATION OF DATA

OBJECTIVES -

Studying this chapter should enable you to:


1. Classify the data for further statistical analysis.
2. Distinguish between quantitative and qualitative classification.
3. Prepare a frequency distribution table.
4. Know the technique of forming classes.
5. Be familiar with the method of tally marking.
6. Differentiate between univariate and bivariate frequency distributions.
CLASSIFICATION OF DATA
Organisation of data means the arrangement of figures in such a form that comparison of the mass of similar data may be
facilitated and further analysis may be possible. An important method of organisation of data is to distribute these into
different classes on the basis of their characteristics. This process is known as classification of data.
Classification is the grouping of related facts into classes. Unorganised and shapeless data can neither be easily compared
nor interpreted. The technique of arranging the data in different homogeneous groups is known as classification.
According to Horace Secrist, "Classification is the process of arranging data into sequences and groups according to their
common characteristics or separating them into different but related parts."
OBJECTIVES OF CLASSIFICATION
• To make the data simple and brief, i.e. to condense the mass of data and arrange it in such a way that it is easily
understood.
• To bring out points of similarity and dissimilarity. Classification clearly reveals the points of similarity and
dissimilarity in statistical data, for example married and unmarried; employed and unemployed; passed and
failed.
• To facilitate comparison. Unorganised and shapeless data cannot be compared; classification makes comparison
possible. For example, once the weights of the students of two classes are classified, the two classes can be compared.
• To arrange data scientifically and critically. With the help of classification, data can be presented scientifically. For
example, the students of a school can be classified on the basis of their age and class.
• To provide the basis of tabulation. Classification provides the basis for tabulation; no tabulation is possible without
classification.
TYPES OR METHODS OR BASIS OF CLASSIFICATION
Facts in one class differ from those of another class with respect to some characteristic, called the basis of classification. The
usual bases of classification are as follows:
1. Geographical Classification.
2. Chronological Classification.
3. Conditional Classification.
4. Qualitative Classification.
5. Quantitative Classification.
GEOGRAPHICAL CLASSIFICATION (SPATIAL CLASSIFICATION)
When data are classified according to geographical location, the classification is called geographical
classification.
For example, when students are grouped according to the colleges to which they belong, the classification is geographical.
Likewise, when the quantity of wheat produced in different districts of a state is presented, the data are classified
geographically.

CHRONOLOGICAL CLASSIFICATION
When data are classified with respect to different periods of time, the classification is known as chronological
classification.
In such a classification the data are arranged in ascending or descending order of time, such as years, months or
weeks.

CONDITIONAL CLASSIFICATION
When data are classified with respect to a condition, the classification is known as conditional classification. For
example, the number of students may be presented faculty by faculty.

QUALITATIVE CLASSIFICATION
In qualitative classification, data are classified on the basis of some qualitative phenomenon (an attribute).
When people are grouped as employed and unemployed, with respect to the single attribute 'employment', the
classification is known as simple classification (also known as stage I).
When classification is done on the basis of two attributes, such as employment (employed and unemployed) and sex (males
and females), the classification is called two-fold or dichotomous classification (stage II).
If the data are divided further on the basis of more attributes so as to form several classes, the classification is called
manifold classification. The grouping under manifold classification can be laid out in the same way.
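
As a rough illustration, the sketch below counts a handful of made-up records first on one attribute (stage I) and then on two attributes (stage II). The records and the variable names are assumptions for the example only; they do not come from the chapter.

```python
from collections import Counter

# Hypothetical records: (employment status, sex) for eight people.
people = [
    ("employed", "male"), ("employed", "female"), ("unemployed", "male"),
    ("employed", "male"), ("unemployed", "female"), ("employed", "female"),
    ("unemployed", "female"), ("employed", "male"),
]

# Stage I: classification on the single attribute 'employment'.
by_employment = Counter(status for status, _sex in people)

# Stage II: classification on two attributes, employment and sex.
by_employment_and_sex = Counter(people)

for status, count in sorted(by_employment.items()):
    print(status, count)
for (status, sex), count in sorted(by_employment_and_sex.items()):
    print(status, sex, count)
```

Adding a third attribute to each record would extend the same counting to further classes.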

QUANTITATIVE CLASSIFICATION
When a statistical enquiry is conducted and the values of a variable such as height, weight, age or income are recorded
one after the other, we get a group of numbers. It is possible to arrange the data in ascending or descending order,
but this does not reduce the volume of the data. Quantitative classification groups the values into classes instead.
The quantitative classification of the number of students of different age groups in a school may be presented in the following
manner:

Age group    No. of Students
0 – 5        50
5 – 10       80
10 – 15      70
15 – 20      60
Total        260
VARIABLE
A characteristic which is capable of being measured and changes its value over time is called a variable. In other words, a
variable refers to a quantity which is subject to change and which can be measured in some unit, such as weight, age or
income. Variables can be divided into the following two groups:
(i) Discrete Variable (ii) Continuous Variable
(i) A discrete variable is one that takes only exact, whole-number values and is not expressed in fractions. For example, the
number of students in a class, the number of members in a family, the number of rooms in a house, etc.
A discrete variable changes in definite jumps. If a family has five children and another child is born, the
total becomes six, not 5¼, 5½ or 5¾.

(ii) A continuous variable, also known as a continuous random variable, is capable of taking every conceivable
fractional value within its range. For example, the height of the boys in a school may be 5'3", 5'4", 5'5" or any value in
between. A continuous variable may take any fractional value between two whole numbers.
ATTRIBUTES
The beauty of people, their intelligence and aptitude for art and music also change from one person to the other. They
cannot be measured numerically in the same way as heights and weights. Therefore, they are not called variables in the
statistical sense. They are called attributes.
In brief, it can be said that variable implies the quantitative character of an item, while attribute signifies the qualitative
character of an item.

RAW DATA
Raw data are the data collected by the investigator. They are in their original form and are highly
disorganised. The investigator has to organise them in a classified form.
STATISTICAL SERIES
Statistical series refer to those data which are presented in some order and sequence. Statistical series can be classified as:-
i. Individual Series
ii. Discrete Series
iii. Continuous or Class interval Series
INDIVIDUAL SERIES OR INDIVIDUAL OBSERVATION

A series of individual observations is a series in which items are listed individually.
It is a series without classes and frequencies.
If the marks of six students are given individually, they form a series of individual observations. In this series there is
no class of items.

Individual series may be represented in two ways:-

1. According to serial numbers: the data are arranged by serial number, for example by the roll numbers of the students
in a class.
2. As an array: when the data of an individual series are presented in ascending or descending order, the arrangement is
known as an array.
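
The two arrangements can be sketched in a few lines. The marks of the six students below are hypothetical, chosen only to show the difference between a listing by roll number and an array.

```python
# Hypothetical marks of six students, keyed by roll number.
marks_by_roll_number = {1: 45, 2: 30, 3: 62, 4: 55, 5: 30, 6: 71}

# Arrangement 1: according to serial (roll) numbers.
by_roll_number = [marks_by_roll_number[roll] for roll in sorted(marks_by_roll_number)]

# Arrangement 2: as an array, in ascending or descending order of the values.
ascending_array = sorted(marks_by_roll_number.values())
descending_array = sorted(marks_by_roll_number.values(), reverse=True)

print("By roll number:  ", by_roll_number)
print("Ascending array: ", ascending_array)
print("Descending array:", descending_array)
```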

DISCRETE SERIES
When items are arranged in groups showing definite breaks from one point to another and when they are exactly
measurable, they form a discrete series. There are no class intervals in such a series.
Formation of discrete frequency distribution:

Forming a discrete frequency distribution is very simple. The number of times an item repeats itself is called
its frequency.
To find the frequency of a particular item, we make use of tally bars. The marks of 30 students in a class are as follows:
2, 3, 5, 4, 8, 6, 4, 9, 5, 4, 6, 5, 3, 2, 3, 2, 2, 8, 7, 6, 5, 4, 6, 9, 7, 6, 4, 2, 3, 2
Marks obtained    Tally bars    Frequency (complete it)
2                 ||||/ |       6
3
4
5
6
7
8
9
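
The tally can be checked mechanically. The sketch below counts the 30 marks listed above with Python's `collections.Counter`; the helper `tally_bars` is a hypothetical name, and writing a bundle of five as `||||/` is only a plain-text convention assumed here.

```python
from collections import Counter

# Marks of the 30 students, as listed in the text.
marks = [2, 3, 5, 4, 8, 6, 4, 9, 5, 4, 6, 5, 3, 2, 3,
         2, 2, 8, 7, 6, 5, 4, 6, 9, 7, 6, 4, 2, 3, 2]

def tally_bars(count):
    """Render a count as tally bars, bundled in fives ('||||/' stands for five)."""
    groups = ["||||/"] * (count // 5)
    if count % 5:
        groups.append("|" * (count % 5))
    return " ".join(groups)

# Counter acts as the tally sheet: one count per distinct mark.
frequency = Counter(marks)

print(f"{'Marks':<8}{'Tally bars':<14}Frequency")
for mark in sorted(frequency):
    print(f"{mark:<8}{tally_bars(frequency[mark]):<14}{frequency[mark]}")
print("Total frequency:", sum(frequency.values()))
```

The total of the frequency column should equal the number of observations (30 here), which is a quick check on any hand-made tally.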

CONTINUOUS SERIES OR CLASS-INTERVAL SERIES

- When items are arranged in groups or classes but they are not exactly measurable, they form a continuous series.
- In a continuous series, the frequencies of the various classes are shown against them.

Marks Obtained    No. of students (frequency)
0 – 10            5
10 – 20           7
20 – 30           13
30 – 40           20
40 – 50           11
50 – 60           8
60 – 70           6

The differences between individual, discrete and continuous series are given below:
1. In an individual series each item is listed once, i.e. with a frequency of one, while in discrete and continuous series the
frequency of an item or class may be more than one.
2. In an individual series there is no column for frequency, while in discrete and continuous series there are columns for both
size and frequency.
3. In individual and discrete series values are given with definite breaks, while in continuous series values are given in the form
of groups.

FORMATION OF CONTINUOUS FREQUENCY DISTRIBUTION


Some important terms-
Frequency :- The number of observations corresponding to a particular class is called the frequency of that class.
Class-limits :- The class-limits are the lowest and highest values that can be included in the class.
L1 – Lower limit, L2 – Upper limit.
Magnitude of the class-interval or size of interval (i) :- The difference between the upper limit and the lower limit of a class is
called the magnitude (size) of the class-interval.
i = L2 – L1
Mid-point or mid-value or central-value :- The centre of the limits of a class is called the mid-point or mid-value of the class.
It is the value lying half-way between the lower and upper class limits of a class-interval. The mid-point of a class is found as
follows:
M.V. = (L1 + L2) / 2
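
A small worked sketch of the two formulas, using the class 10 – 20 from the marks table earlier in the chapter. The function names are illustrative assumptions. Note that the sum of the two limits is halved, not the upper limit alone.

```python
def class_size(lower_limit, upper_limit):
    """Magnitude of the class interval: i = L2 - L1."""
    return upper_limit - lower_limit

def mid_value(lower_limit, upper_limit):
    """Mid-value of a class: (L1 + L2) / 2."""
    return (lower_limit + upper_limit) / 2

# Class 10 - 20: L1 = 10, L2 = 20.
print("i    =", class_size(10, 20))
print("M.V. =", mid_value(10, 20))
```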
HOW TO PREPARE A FREQUENCY DISTRIBUTION?
While preparing a frequency distribution, the following four questions should be addressed:-
1. How many classes should we have?
2. What should be the size of each class?
3. How should we determine the class limits?
4. How should we get the frequency for each class?
How many classes should we have?
First find out how much the values of the variable vary. This variation is captured by the range.
The range is the difference between the largest and the smallest values of a variable.
Once the range is known and the class interval is decided, the number of classes follows: it is roughly the range divided by
the class interval.
The rule of thumb often used is that the number of classes should be between 5 and 15.
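
The calculation can be sketched as follows. The ten values and the class interval of 10 are hypothetical, and rounding the quotient up is an assumption made here to get a whole number of classes; one more class may be needed when the largest value falls exactly on an upper limit.

```python
import math

def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

def estimated_number_of_classes(values, class_interval):
    """Rough estimate: range divided by class interval, rounded up."""
    return math.ceil(value_range(values) / class_interval)

# Hypothetical marks of ten students.
values = [3, 12, 25, 31, 38, 44, 47, 52, 59, 68]
class_interval = 10

n_classes = estimated_number_of_classes(values, class_interval)
print("Range:", value_range(values))
print("Estimated number of classes:", n_classes)
print("Within the 5 to 15 rule of thumb:", 5 <= n_classes <= 15)
```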

OTHER STATISTICAL SERIES

EXCLUSIVE AND INCLUSIVE SERIES


(a) Exclusive series :-
* When the class-intervals are so fixed that the upper limit of one class-interval is the lower limit of
the next class-interval, it is called an exclusive series.
* Each class excludes the items equal to its upper limit; such items are counted in the next class.
Following are examples of exclusive series.
(i) Class-interval    Frequency
    0 – 5             2
    5 – 10            6
    10 – 15           9
    15 – 20           3
(ii) Exceeding 5 but not exceeding 10
(iii) 10 and above but below 20
(iv) 10 and under 20
(v) 10 – 19.9
    19.9 – 29.9
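
The exclusive rule can be sketched in code: a value equal to an upper limit is counted in the next class. The raw values below are hypothetical, picked so that the counts reproduce the frequencies of example (i); the function name is also an assumption.

```python
def exclusive_distribution(values, start, size, n_classes):
    """Count values into classes [lower, upper): the upper limit is excluded."""
    counts = [0] * n_classes
    for value in values:
        index = int((value - start) // size)
        if 0 <= index < n_classes:  # values outside the table are ignored
            counts[index] += 1
    return [(start + k * size, start + (k + 1) * size, counts[k])
            for k in range(n_classes)]

# Hypothetical raw data; note the values 5, 10 and 15 sitting on class limits.
values = [1, 4, 5, 5, 6, 7, 8, 9, 10, 10, 11, 12, 12, 13, 14, 14, 14, 15, 17, 19]

for lower, upper, frequency in exclusive_distribution(values, 0, 5, 4):
    print(f"{lower} - {upper}: {frequency}")
```

Here the two 5s are counted in 5 – 10, not in 0 – 5, which is exactly what "excludes all items corresponding to its upper limit" means.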

INCLUSIVE SERIES

An inclusive series is a series which includes all items up to and including its upper limit. Both the lower and the upper limit
of a class-interval are included in the class.
Weekly wages    No. of workers
40 – 49         7
50 – 59         17
60 – 69         25
70 – 79         10
* There is a gap between the upper limit of one class interval and the lower limit of the next class interval. The gap usually
ranges from 0.1 to 1.
CONVERSION OF INCLUSIVE SERIES INTO EXCLUSIVE SERIES
- First we find the difference between the upper limit of a class interval and the
lower limit of the next class interval, and halve it.
10 – 14
15 – 19        The difference between 15 and 14 is 1; 1 ÷ 2 = 0.5
20 – 24
- Half of that difference is deducted from the lower limit of each class interval and the remaining half is added to the upper
limit of each class interval.
9.5 – 14.5
14.5 – 19.5
19.5 – 24.5
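
The conversion steps can be sketched as a short function. It assumes the gap between consecutive classes is the same throughout the series, and the function name is hypothetical. Half the gap is deducted from every lower limit and added to every upper limit.

```python
def inclusive_to_exclusive(classes):
    """Convert inclusive class limits to exclusive ones.

    `classes` is a list of (lower, upper) pairs with a constant gap between
    the upper limit of one class and the lower limit of the next.
    """
    gap = classes[1][0] - classes[0][1]  # e.g. 15 - 14 = 1
    half_gap = gap / 2                   # e.g. 0.5
    return [(lower - half_gap, upper + half_gap) for lower, upper in classes]

inclusive = [(10, 14), (15, 19), (20, 24)]
for lower, upper in inclusive_to_exclusive(inclusive):
    print(f"{lower} - {upper}")
```

After conversion the upper limit of each class coincides with the lower limit of the next, as an exclusive series requires.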
NON CUMULATIVE OR SIMPLE SERIES AND CUMULATIVE SERIES
NON CUMULATIVE- In a non cumulative series the frequency corresponding to each class interval is shown separately and
individually.
Class interval    Simple frequency
0 – 10            3
10 – 20           5
20 – 30           10
30 – 40           7

CUMULATIVE SERIES – In a cumulative frequency series the frequencies are progressively totalled and the aggregates are
shown.

Note – Recognition of a cumulative series: if words like more than, above, below, over, under, up to, exceeding or not
exceeding appear before all the limits of the class intervals, then it is a cumulative series.
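
As an illustration of progressive totalling, the sketch below builds "less than" and "more than" cumulative frequencies from the simple series 0 – 10, 10 – 20, 20 – 30, 30 – 40 with frequencies 3, 5, 10, 7 given above. The variable names are assumptions.

```python
from itertools import accumulate

classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
simple = [3, 5, 10, 7]

# "Less than" series: running total from the first class downwards.
less_than = list(accumulate(simple))

# "More than" series: running total from the last class upwards.
more_than = list(accumulate(reversed(simple)))[::-1]

for (lower, upper), cf in zip(classes, less_than):
    print(f"Less than {upper}: {cf}")
for (lower, upper), cf in zip(classes, more_than):
    print(f"More than {lower}: {cf}")
```

The last "less than" figure and the first "more than" figure both equal the total frequency, 25.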
CONVERSION OF CUMULATIVE FREQUENCY SERIES INTO SIMPLE FREQUENCY SERIES
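
A minimal sketch of one common way to do this, assuming a "less than" cumulative series: the simple frequency of each class is its cumulative frequency minus the cumulative frequency of the class before it. The function name and the figures are illustrative assumptions.

```python
def cumulative_to_simple(less_than):
    """Recover simple frequencies from 'less than' cumulative frequencies."""
    simple = []
    previous = 0
    for cumulative in less_than:
        simple.append(cumulative - previous)  # frequency of this class alone
        previous = cumulative
    return simple

# Less than 10: 3, less than 20: 8, less than 30: 18, less than 40: 25.
less_than_series = [3, 8, 18, 25]
print(cumulative_to_simple(less_than_series))
```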

TO FORM CLASS INTERVAL WHEN MID VALUES ARE GIVEN

- If mid values and frequencies are given, the class intervals are formed on the basis of the exclusive method.
- The difference between the first and second mid values is taken and divided by 2.
- Deducting half of the difference from a mid value gives the lower limit of its class; adding half of the difference
to the mid value gives the upper limit.
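
The steps above can be sketched as follows. The mid values 5, 15, 25, 35 are hypothetical, the function name is an assumption, and the sketch assumes equally spaced mid values.

```python
def classes_from_mid_values(mid_values):
    """Form exclusive class intervals from equally spaced mid values."""
    half_difference = (mid_values[1] - mid_values[0]) / 2
    return [(mid - half_difference, mid + half_difference) for mid in mid_values]

mid_values = [5, 15, 25, 35]
for lower, upper in classes_from_mid_values(mid_values):
    print(f"{lower} - {upper}")
```

With these mid values the half-difference is 5, so the classes come out as 0 – 10, 10 – 20, 20 – 30 and 30 – 40.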
