
Chapter 3

Organisation of Data
Organisation of data is the second
statistical tool, under which data are
arranged in such a form that comparison
of masses of similar data is
facilitated and further analysis becomes
possible. The most popular way of
organising data is classification of
data.
Meaning of Classification of Data
Classification is the process of arranging data in various
groups or classes according to their characteristics.
There are two features of classification of data:
1- Data are classified in various groups or classes.
2- The basis of classification of data is their
characteristics (resemblances and affinities).
Objectives of Classification:
a] To simplify complex data

b] To facilitate understanding

c] To facilitate comparison

d] To make analysis and interpretation easy.

e] To arrange and put the data according to their common characteristics.


Statistical Series: Systematic
arrangement of statistical
data

Raw data: Data collected in original or
crude form.
Series: An arrangement of raw data in
different classes according to a given
order or sequence is called a series.
Conversion of Raw Data into
Series

1. Individual Series without frequency
2. Frequency series or Series with
frequencies.
1] Individual Series: The arrangement
of raw data individually without
frequency
TYPES OF CLASSIFICATION
OF DATA
⚫ Geographical classification of data
⚫ Chronological classification
⚫ Qualitative classification
⚫ Quantitative classification
GEOGRAPHICAL
CLASSIFICATION
When data are classified according to
geographical location or region (such as
states, cities, regions, zones, areas,
etc.), it is called geographical
classification. For example, the
production of food grains in India
may be presented state-wise in the
following manner.
Geographical / Spatial
Classification
When data are classified on the basis
of place, it is known as geographical
classification of data.
STATE-WISE ESTIMATES OF PRODUCTION OF
FOOD GRAINS

S.NO.   NAME OF STATE     TOTAL FOOD GRAINS (Thousand Tonnes)

1       ANDHRA PRADESH    1093.93
2       BIHAR             12899.89
3       KARNATAKA         1834.78
4       PUNJAB            21788.20
5       UTTAR PRADESH     41828.30


CHRONOLOGICAL
CLASSIFICATION
When data are observed over a period of time, the type
of classification is known as chronological classification
(on the basis of time of occurrence).
National income figures, annual output of wheat,
monthly expenditure of a household, daily consumption of
milk, etc. are some examples of chronological
classification. For example, we may present the figures
of population (or production, sales, etc.) as follows.
POPULATION OF INDIA 1941 TO
1991

S.NO.   YEAR   POPULATION IN CRORES

1       1941   31.87
2       1951   36.11
3       1961   43.91
4       1971   54.82
5       1981   68.33
Qualitative Classification-
When data are classified on the basis of
quality, it is known as qualitative classification
of data.
QUALITATIVE
CLASSIFICATION
We may first divide the population into
male and female on the basis of the
attribute 'sex'. Each of these classes may
be further subdivided into 'literate' and
'illiterate' on the basis of the attribute
'literacy'. Further classification can be
made on the basis of some other
attribute, say, employment.
QUANTITATIVE CLASSIFICATION
When data are classified on the basis of quantity, it is
known as quantitative classification of data.

WEIGHT (Kg)   NO. OF STUDENTS

40-50         60
50-60         50
60-70         28
70-80         20
80-90         12
90-100        170
Characteristics of a Good
Classification

1. Comprehensiveness
2. Clarity
3. Homogeneity
4. Suitability
5. Stability
6. Elasticity
Concept of Variable
Types of Variable

Discrete variable Continuous Variable


Concept of Variable
A characteristic or a phenomenon which is capable of being
measured and changes its value over time is called a variable.

A) Discrete Variable
Discrete variables are those variables that increase in jumps
or in complete numbers (no fraction is possible).
E.g. number of students in a class, number of cars in a
showroom, etc. (1, 2, 10 or 15, etc.)
B) Continuous Variables
Variables that assume a range of values, or increase not in
jumps but continuously or in fractions, are called continuous
variables.
E.g. height of boys (5'1", 5'3" and so on), marks in any
range (0-10, 10-15, 15-20).
Meaning of Discrete Variable
Variables which are measured in complete numbers,
like the number of students, teachers, office
staff, etc.

EMPLOYEE/STUDENTS    NO.

STUDENTS             500
TEACHERS             32
OFFICE STAFF         6
D-GROUP EMPLOYEE     4
Meaning of Continuous
Variable
Variables which are not always measured in complete
numbers, like height in metres,
weight in kg, etc.

Height in cm   No. of Students

110-120        10
120-130        12
130-140        11
140-150        8
Types of Statistical Series-
Types of Series

Individual Series        Frequency Distribution Series
Individual Series can be expressed in
two ways.
a] According to Serial Numbers

Roll no. Marks


1 30
2 25
3 15
4 30
5 25

b] Ascending or descending order.

In ascending order, the smallest number is placed first.
In descending order, the highest number is placed first.
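The two orderings can be illustrated with a minimal Python sketch. It uses the marks from the roll-number table above; the variable names are chosen here for illustration only.

```python
# Marks from the roll-number table above, in roll-number order.
marks = [30, 25, 15, 30, 25]

ascending = sorted(marks)                 # smallest value placed first
descending = sorted(marks, reverse=True)  # highest value placed first

print(ascending)   # [15, 25, 25, 30, 30]
print(descending)  # [30, 30, 25, 25, 15]
```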
Individual Series-
An individual series is a series in which each item
is recorded individually, that is, it occurs a single time.
Serial No Value

1 10
2 15
3 18
4 20
5 22
Frequency Distribution
Series-
Types of Frequency Distribution
Series-

Discrete Series        Continuous Series
Discrete Series or Frequency
Array-
A discrete series is one in which data are presented
in such a way that the exact measurement of each item
is clearly shown.
Value Frequency
10 4
11 6
12 6
13 4
14 3
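As a rough sketch of how such a frequency array could be built from raw observations, the Python snippet below counts how often each value occurs. The raw list is hypothetical: it is constructed here only so that the counts match the table above.

```python
from collections import Counter

# Hypothetical raw observations, invented so the counts match the table above.
raw_values = [10] * 4 + [11] * 6 + [12] * 6 + [13] * 4 + [14] * 3

# Count occurrences of each value and list them in ascending order of value.
frequency_array = sorted(Counter(raw_values).items())

for value, frequency in frequency_array:
    print(value, frequency)
```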
Continuous Series-
It is a series in which items cannot be
exactly measured; they are placed in
classes.

Class Interval   Frequency
0-10             10
10-20            15
20-30            20
30-40            18
40-50            15
50-60            9
Types of Continuous Series-
1- Exclusive Series
2- Inclusive Series
3-Open End Series
4- Mid Value Frequency Series
5- Cumulative frequency series
In constructing a continuous series we come across
terms like:
a] Class: Each given interval is called a class, e.g.
0-5, 5-10.
b] Class limit: There are two limits, the upper limit and
the lower limit.
c] Class interval: Difference between the upper limit and
the lower limit.
d] Mid-point or Mid Value: (Upper limit + Lower limit) / 2
e] Frequency: Number of items [observations] falling
within a particular class.
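The definitions of class interval and mid value can be checked with a small Python sketch. The function names are hypothetical helpers introduced here; the classes 0-5 and 5-10 are the ones used as examples above.

```python
def class_interval(lower, upper):
    """Class interval: difference between upper and lower limit."""
    return upper - lower


def mid_value(lower, upper):
    """Mid value: (upper limit + lower limit) / 2."""
    return (upper + lower) / 2


for lower, upper in [(0, 5), (5, 10)]:
    print(f"{lower}-{upper}: interval = {class_interval(lower, upper)}, "
          f"mid value = {mid_value(lower, upper)}")
```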
1 Exclusive Series
It is a type of statistical series in which the upper
limit of one class is the lower limit of the next
class.

Obtained Marks   No. of Students
00-10            8
10-20            9
20-30            10
30-40            9
40-50            8
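Because adjacent classes share a limit, a value equal to that limit needs a rule. The sketch below assumes the usual convention for exclusive series, namely that the upper limit is excluded from its class, so a mark of 10 falls in 10-20. The helper name `exclusive_class` is hypothetical.

```python
def exclusive_class(value, limits):
    """Return the (lower, upper) class with lower <= value < upper, else None."""
    for lower, upper in limits:
        if lower <= value < upper:
            return (lower, upper)
    return None


# Class limits from the table above.
limits = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]

print(exclusive_class(9.5, limits))  # (0, 10)
print(exclusive_class(10, limits))   # (10, 20): the upper limit of 0-10 is excluded
```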
2 Inclusive Series
It is a type of statistical series in which all items of a
class, including those equal to its upper limit, are included in the same class.

C.I.     Frequency
1-10     8
11-20    9
21-30    10
31-40    9
41-50    8
Conversion of Inclusive into
Exclusive Series

C.I.     Frequency     C.I.         Frequency

1-10     8             0.5-10.5     8
11-20    9             10.5-20.5    9
21-30    10            20.5-30.5    10
31-40    9             30.5-40.5    9
41-50    8             40.5-50.5    8
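The conversion shown in the table can be sketched in Python as follows. It assumes the common rule of halving the gap between one class's upper limit and the next class's lower limit (here 11 - 10 = 1, so the correction is 0.5), then subtracting it from each lower limit and adding it to each upper limit. The function name is hypothetical.

```python
def to_exclusive(inclusive_classes):
    """Convert inclusive class limits to exclusive ones (equal gaps assumed)."""
    # Gap between the upper limit of the first class and the lower limit of the second.
    gap = inclusive_classes[1][0] - inclusive_classes[0][1]
    correction = gap / 2
    return [(lower - correction, upper + correction)
            for lower, upper in inclusive_classes]


inclusive = [(1, 10), (11, 20), (21, 30), (31, 40), (41, 50)]
exclusive = to_exclusive(inclusive)

for lower, upper in exclusive:
    print(f"{lower}-{upper}")
```

The frequencies are unchanged by the conversion; only the class limits move.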
Open End Series
The lowest value or the highest value of
the distribution is not defined.

Obtained Marks   No. of Students

Below 10         8
10-20            9
20-30            10
30-40            9
Above 40         8
Mid Value Frequency Series
The class intervals are not given; only mid
values and their respective frequencies are
given.

Mid Value   Frequency
5           8
15          9
25          10
35          9
45          8
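When only mid values are given, the class limits can be recovered if the classes are assumed to be of equal width. The sketch below takes the width as the difference between consecutive mid values (an assumption, not something the series itself states) and places half of it on either side of each mid value.

```python
mid_values = [5, 15, 25, 35, 45]

# Assumption: equal class widths, so the width is the gap between mid values.
width = mid_values[1] - mid_values[0]

classes = [(m - width / 2, m + width / 2) for m in mid_values]

for lower, upper in classes:
    print(f"{lower:g}-{upper:g}")
```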
Cumulative
Series

Types of Cumulative Frequency Series

Less Than        More Than
Cumulative Frequency Series:
It is obtained by successively adding the frequencies of
the values or the classes according to a certain law.
a] 'Less than' Cumulative Frequency Distribution:
The frequencies of each class interval are added
successively.
b] 'More than' Cumulative Frequency Distribution:
The more than cumulative frequency is obtained by
finding the cumulative totals of frequencies starting
from the highest value of the variable to the lowest
value.
Less Than Cumulative
Frequency Series

C.I.     Frequency   Less Than   No. of Items

00-10    8           10          8
10-20    9           20          8+9=17
20-30    10          30          17+10=27
30-40    9           40          27+9=36
40-50    8           50          36+8=44
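The running totals in the table can be reproduced with a short Python sketch using `itertools.accumulate` from the standard library; the variable names are illustrative.

```python
from itertools import accumulate

upper_limits = [10, 20, 30, 40, 50]
frequencies = [8, 9, 10, 9, 8]

# Running total of frequencies from the lowest class upwards.
less_than = list(accumulate(frequencies))

for limit, total in zip(upper_limits, less_than):
    print(f"Less than {limit}: {total}")
```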
More Than Cumulative Frequency
Series

C.I.     Frequency   More Than   No. of Items

00-10    8           0           44
10-20    9           10          44-8=36
20-30    10          20          36-9=27
30-40    9           30          27-10=17
40-50    8           40          17-9=8
                     50          8-8=0
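A comparable sketch for the 'more than' series accumulates the frequencies from the highest class down to the lowest and then restores the original order. The names are illustrative; the final "more than 50" entry of 0 is left out because no class lies above 50.

```python
from itertools import accumulate

lower_limits = [0, 10, 20, 30, 40]
frequencies = [8, 9, 10, 9, 8]

# Accumulate from the highest class downwards, then flip back to class order.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for limit, total in zip(lower_limits, more_than):
    print(f"More than {limit}: {total}")
```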
LOSS OF INFORMATION
•The frequency distribution summarizes the raw data by
making it concise and comprehensible. However, it does not
show the details that are found in the raw data and so leads to a loss
of information.
•When the raw data are grouped into classes, an individual
observation has no significance in further statistical
calculations.
•For example, suppose class 10-20 contains 6 values: 12,
15, 16, 18, 14, 19. When such data are grouped as class 10-20,
the individual values have no significance;
only the frequency, i.e. 6, is recorded and not the actual values.
•Statistical calculations are based only on the values of the
class mark instead of the actual values. As a result, grouping leads to
considerable loss of information.
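The example above can be made concrete with a small Python sketch. It compares the mean of the six actual values with the mean obtained when every value is replaced by the class mark of 10-20; the variable names are illustrative.

```python
from statistics import mean

values = [12, 15, 16, 18, 14, 19]   # actual values in class 10-20
lower, upper = 10, 20

class_mark = (lower + upper) / 2    # mid value used after grouping
actual_mean = mean(values)          # uses the individual observations
grouped_mean = class_mark           # every value is treated as the class mark

print(round(actual_mean, 2))  # 15.67
print(grouped_mean)           # 15.0
```

The difference between the two figures is the information lost by recording only the frequency of the class.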
FREQUENCY ARRAY

•For a discrete variable, the classification of its data is known as a
frequency array.

Univariate frequency distribution –

When data are classified on the basis of a single variable, the distribution is
known as a univariate frequency distribution or one-way
frequency distribution.

Bivariate frequency distribution –

When data are classified on the basis of two variables, such as
height and weight, marks in statistics and economics, etc., the
distribution is known as a bivariate frequency distribution or
two-way frequency distribution.
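A two-way frequency distribution can be sketched in Python by counting pairs of classes. The (height, weight) observations below are hypothetical, invented purely for illustration, and the helper `class_of` with a class width of 10 is likewise an assumption.

```python
from collections import Counter

# Hypothetical (height in cm, weight in kg) pairs, invented for illustration.
students = [(112, 31), (118, 34), (125, 36), (127, 42), (133, 44), (138, 47)]


def class_of(value, width):
    """Return the exclusive class (lower, upper) of the given width containing value."""
    lower = (value // width) * width
    return (lower, lower + width)


# Count how many students fall into each (height class, weight class) cell.
two_way = Counter((class_of(h, 10), class_of(w, 10)) for h, w in students)

for (height_class, weight_class), frequency in sorted(two_way.items()):
    print(height_class, weight_class, frequency)
```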
