2 - Data

STATISTICS-I

Chap-02: Collection, Organization & Presentation of Data


Contents:
Data
Types of data
Methods of Primary data collection
Sources of secondary data
Organization of data
Presentation of data
Data:
Definition:
Data is the plural form of datum. The raw materials of
research are known as data. Specifically, the measurements
of the characteristics of units, objects or individuals
constitute data.

Example: Height, weight and body fitness of the cadets of MCC.
Types:
a) On the basis of nature of characteristics
i) Qualitative data
ii) Quantitative data(Discrete, Continuous)

b) According to source
i) Primary data
ii) Secondary data
Primary data:
Data that are collected directly from the population or
sample units are called primary data.
Example: If the cadets' personal information is collected
from the cadets directly, it constitutes primary data.
* Primary data are original in character and not well
organized, and they are expensive in terms of money, time
and labor.
Secondary data:
Data that have already been obtained by other persons
or organizations, and have already been published or used,
are called secondary data.

Example: If the cadets' personal information is collected
from the respective form master or the admin office, the
data are secondary.
Methods of primary data collection:

a) Direct personal interview (face-to-face interview or schedule method):
It is the most widely used method for primary data collection. In
this method the researcher asks the respondents questions directly
from a previously printed questionnaire.

* The data are more reliable and accurate
* The chance of getting wrong information is lower
* It is useful when the respondents are illiterate
# It is costly and more time consuming
b) Indirect oral inquiry:
In this method the interviewer does not question the respondent
directly; instead, he obtains the answers with the help of other
persons.

* It is needed when the respondents are dangerous, such as
terrorists, drug addicts or smugglers
# The answers or information may not be reliable
c) Mailed questionnaire method:
In this method the interviewer sends a questionnaire to the
respondent by email or by post. For postal mail, a self-addressed
stamped envelope should be sent with the questionnaire.

* It is useful when there are some sensitive questions
* It is less costly for a large-scale survey
# The chance of getting wrong information is higher
# Answers to all the questions may not be received
# It is applicable to educated respondents only
d) Through local agents or schedule:
In this method the enumerator appoints local agents or
correspondents in different areas to collect the information with a printed
questionnaire called a schedule. Govt. and non-govt. agencies, newspaper
agencies, and print and electronic media adopt this method.

* The chance of getting wrong information is lower
* It is useful when the area is large, as in a census
* It is less time consuming
# The information may be biased
# Skilled and well-trained enumerators are needed to get reliable data
e) Telephone interview:
In this method a conversation takes place between the
interviewer and the respondent through a telephone call.

* It consumes less time and cost
* It is applicable to a large number of respondents spread over a
wide geographical area
# Some information may be wrong
# The response rate is lower than in a direct personal interview
f) Online interview:
In this method a survey is conducted through Facebook,
Twitter, video calls, etc.

g) Through experiment:
In natural sciences such as physics, chemistry and astronomy, and in
biological sciences such as botany, zoology, biochemistry,
microbiology and pharmacy, data are obtained when
treatments are applied to experimental units in a
controlled laboratory.
Sample questionnaire:
Title: ‘Applicable or not for Math Olympiad’
Title: ‘Teaching Proficiency’
Title: ‘Cadet’s Personality & Behavior’

Sources of secondary data:

a) Published sources:
• International publications, articles
• Govt., semi-govt. and non-govt. publications/records
• Official statistics
• Trade and financial journals
• Diaries, books, newspapers, magazines
• Websites, blogs
• International organizations like World Bank, WHO, IMF, ILO, UNDP
• Local organizations like BBS, BRRI, BARDEM, BRAC, ICDDRB

b) Unpublished sources: Private firms or business houses, institutions.

Primary data vs secondary data:

• Primary data are more expensive than secondary data in
terms of time, money and labor.

• Secondary data may be less suitable, adequate and
reliable than primary data for the purpose of the
investigation.

• Primary data are original in character but less well organized
than secondary data.
Selection of appropriate method:

• One must always remember that each method of data collection
has its importance, and none is superior in all situations.

• Secondary data may be used if the researcher finds them
reliable, adequate and appropriate for the study.

• Finally, the most desirable approach to the selection of the
method depends on the nature of the particular problem, the time
and resources available, and the desired degree of accuracy.
Organization (Processing, Classification/tabulation of Data):

Processing: After collection, the data have to be processed
through editing and coding.

Editing:
Once a set of data has been collected, it must be processed
for proper presentation. Editing is the preparatory work
required before tabulation and statistical analysis
are carried out. It is quite a difficult job and requires a great
deal of skill and experience. While editing primary data, the
following considerations need attention:
• The data should be complete
• The data should be consistent
• The data should be accurate
• The data should be homogeneous

Coding:
When the data are to be processed by computer, they must be coded,
i.e. converted into a form the computer can handle. For qualitative
data, code numbers can be assigned to the answers. For example, for the
question “Do you smoke?” a code of 1 can be assigned to the
answer ‘yes’ and a code of 0 to the answer ‘no’.
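The coding step can be sketched in Python as a simple lookup. This is only an illustration: the list of answers is made up, and the names `answers`, `CODES` and `encode` are hypothetical, not part of the lecture.

```python
# Hypothetical survey answers to the question "Do you smoke?"
answers = ["yes", "no", "No", "yes", "no"]

# Code book from the text: 'yes' -> 1, 'no' -> 0
CODES = {"yes": 1, "no": 0}

def encode(responses, codes):
    """Replace each qualitative answer with its numeric code."""
    return [codes[r.strip().lower()] for r in responses]

coded = encode(answers, CODES)
print(coded)  # [1, 0, 0, 1, 0]
```

Normalising the case before the lookup is a design choice made here so that 'No' and 'no' receive the same code.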
Classification/tabulation of Data:
Classification/tabulation is the process of arranging the
available information into homogeneous groups, in
rows and columns, according to similarities or
common characteristics.

Construction of a table:
A good table consists of the following parts:
1) Table number, 2) Title of the table, 3) Row heading or stub,
4) Column heading or caption, 5) Body of the table, 6)
Footnote (if needed).
The data can be classified/tabulated into the following
groups:
a) Geographical classification/tabulation
b) Chronological classification/tabulation
c) Quantitative classification/tabulation (frequency distribution)
d) Qualitative classification/tabulation
Classification of Data:
a) Geographical classification:
In geographical classification the data are classified
on the basis of geographical areas.
Table-01: The number of COVID-19 patients in Bangladesh,
classified by division

Division         Dhaka  CTG    Sylhet  Mymensingh  Barishal  Rajshahi  Rangpur  Khulna
No. of patients  44000  30450  15200   23630       12325     25214     16482    23147
b) Chronological classification:
In chronological classification the data are classified
on the basis of time.
Table-02: The number of COVID-19 patients in Bangladesh
in August 2020, classified by date

Date             Aug 01  Aug 02  Aug 03  Aug 04  Aug 05  Aug 06  Aug 07  Aug 08
No. of patients  1005    1245    1348    1578    1658    1875    2312    1925
c) Quantitative classification/frequency distribution:
In quantitative classification the data are classified in
terms of magnitudes.
Table-03: The number of COVID-19 patients in Bangladesh on 11
August 2020, classified by age

Age interval     0-10  10-20  20-30  30-40  40-50  50-60  60-70  70-80
No. of patients  22    33     45     57     75     99     125    258
d) Qualitative classification:
In qualitative classification the data are classified in
terms of attributes or categories.
Table-04: The number of COVID-19 patients in Bangladesh,
classified by sex

Sex              Male    Female
No. of patients  145000  102000


*** Classification/tabulation may be one-way or two-way.

1) One-way classification/simple table:
The whole data are classified according to one characteristic.
Table-01: The number of COVID-19 patients in Bangladesh, classified
by sex
2) Two-way classification/complex table:
The whole data are classified according to two characteristics.
Table-02: The number of COVID-19 patients in Bangladesh, classified
by sex and religion

             Sex
Religion     Male    Female
Muslim       95000   75000
Hindu        35000   15000
Buddhist     13000   11000
Christian    2000    1000


Quantitative Classification/Frequency Distribution:

Frequency:
The number of times a value is repeated is called the frequency of
that value.
For example: The ages of 5 cadets are 24, 25, 24, 26, 25.
Here the frequencies of the three values 24, 25, 26 are 2, 2, 1
respectively.

*** Data with frequencies are called grouped data and data without
frequencies are called ungrouped data.
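As a small illustration, the frequencies in the example of the five cadets can be counted with Python's standard `collections.Counter`. The variable names are chosen here for the sketch only.

```python
from collections import Counter

# Ungrouped data: ages of the 5 cadets from the example
ages = [24, 25, 24, 26, 25]

# Grouped data: each distinct value together with its frequency
frequency = Counter(ages)

for value in sorted(frequency):
    print(value, frequency[value])
# 24 2
# 25 2
# 26 1
```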
Frequency distribution:
It is a statistical table which shows how the
whole data are distributed over different classes.

For example, the number of COVID-19 patients in Bangladesh on 11
August 2020, classified by age, is shown in Table-03 above.
Frequency Distribution:
Types of Frequency distribution:
According to the type of variable, frequency distributions are of two
types:
1) Discrete frequency distribution
2) Continuous frequency distribution

1) Discrete frequency distribution:
In this frequency distribution the whole data are
presented against the values of a discrete variable.
For example, if we consider the number of cars in 500
families of Tangail town, we may have the following
frequency distribution:

Number of cars   Number of families
0                250
1                125
2                75
3                50
Total            500
2) Continuous frequency distribution:
In this frequency distribution continuous data are
presented in terms of class intervals.

For example, the number of COVID-19 patients in Bangladesh on 11
August 2020, classified by age, is shown in Table-03 above.
A continuous frequency distribution can be constructed in
two ways:
i) Exclusive method
ii) Inclusive method

i) Exclusive method:
The upper limit of a class interval is not included in that class.
For example, with a class interval of 10 and classes 10-20, 20-30, ...,
the value 20 is counted in the class 20-30, not in 10-20.

ii) Inclusive method:
The upper limit of a class interval is included in that class.

Marks obtained         No. of cadets
(class interval 10)
10 – 19                5
20 – 29                7
30 – 39                22
40 – 49                10
50 – 59                6
Total                  50
Class intervals:
The difference between the upper limit and the lower limit of any class or
group is called the class interval.
There are three types of class interval:
a) Equal class interval (10-20, 20-30, 30-40)
b) Unequal class interval (10-20, 20-25, 25-37)
c) Open-end class interval

Daily income (TK)   No. of people
Less than 100       50
100-150             75
More than 150       100
Construction of continuous frequency distribution:
The following are the important steps in constructing a
continuous frequency distribution.
Step-1: Determination of range
Range, R = highest value – lowest value

Step-2: Determination of number of classes
The number of classes should be between 5 and 25.
Sturges suggested the following formula for determining the
approximate number of classes:

$K = 1 + 3.322 \log_{10} N$, where N = total number of observations,

and K is rounded to an integer value.

Step-3: Determination of class interval
The formula for determining the class interval or width of a class is

$C = R / K$ (should be an integer)

As far as possible one should avoid class intervals such as 3, 4, 7, 11, 26,
etc. Preferably, one should have class intervals of either 5 or multiples of 5.
Step-4: Determination of class limit
The starting point, i.e. the lower limit of the first class,
should be either zero, 5 or a multiple of 5. For example, if
the lowest value of the series is 63 and we have taken a class
interval of 10, then the first class should be 60-70 instead of
63-73.
Step-5: Tally and frequency
Take each item from the data one at a time and put a tally
mark (|) against the class to which the item belongs. Count
the tally marks and place this number against the class to
which the items belong. Count the total frequency and check
that it equals N.
Mathematical problem:
Construct a frequency table (exclusive method), using a suitable class
interval, from the following marks obtained by 30 students:

34 36 31 46 76 86 42 44 32 46 40 54 66 56 50
42 33 80 77 81 46 40 60 63 64 76 56 57 57 70

Solution:
Here,
The maximum value = 86
The minimum value = 31
So range, R = 86 – 31 = 55
According to Sturges' rule, the number of classes is
$K = 1 + 3.322 \log_{10} N$, with N = 30
$= 1 + 3.322 \log_{10}(30)$
$\approx 5.907$
$\approx 6$

Now the class interval,
$C = R / K = 55 / 6 = 9.1667$, which is rounded up to 10.
Table: Frequency distribution taking the class interval as 10

Class              Tally        Frequency
(Marks obtained)                f
30 – 40            |||||        5
40 – 50            ||||| |||    8
50 – 60            ||||| |      6
60 – 70            ||||         4
70 – 80            ||||         4
80 – 90            |||          3
Total                           N = Σf = 30
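The five steps can be sketched in Python for the same 30 marks. This is an illustrative sketch, not the lecture's own procedure: rounding the raw width up to the next multiple of 5 and aligning the first lower limit to a multiple of the width are assumptions made here to mirror Steps 3 and 4, and all variable names are hypothetical.

```python
import math

marks = [34, 36, 31, 46, 76, 86, 42, 44, 32, 46, 40, 54, 66, 56, 50,
         42, 33, 80, 77, 81, 46, 40, 60, 63, 64, 76, 56, 57, 57, 70]

n = len(marks)
data_range = max(marks) - min(marks)        # Step 1: R = 86 - 31 = 55
k = round(1 + 3.322 * math.log10(n))        # Step 2: Sturges' rule, about 5.9, so 6
raw_width = data_range / k                  # Step 3: 55 / 6, about 9.17
width = math.ceil(raw_width / 5) * 5        # assumption: round up to a multiple of 5
start = (min(marks) // width) * width       # Step 4: lowest value 31 gives lower limit 30

# Step 5: count the marks in each class [lower, upper), upper limit excluded
table = {}
for i in range(k):
    lower = start + i * width
    upper = lower + width
    table[(lower, upper)] = sum(lower <= m < upper for m in marks)

for (lower, upper), f in table.items():
    print(f"{lower} - {upper}: {f}")
print("Total:", sum(table.values()))
```

The half-open test `lower <= m < upper` is what makes this the exclusive method: a mark of 40 falls in 40 - 50, not in 30 - 40.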
* Calculation of less than and more than cumulative frequency (F), frequency density,
relative frequency and percentage frequency (k = class interval)

Class     Frequency  Less than CF  More than CF  Frequency    Relative   Percentage
          f          (upward) F    (downward) F  density f/k  frequency  frequency
                                                              RF = f/N   RF×100
30 – 40   5          5             30            5/10 = 0.5   0.17       17
40 – 50   8          5+8 = 13      25            0.8          0.27       27
50 – 60   6          19            17            0.6          0.20       20
60 – 70   4          23            11            0.4          0.13       13
70 – 80   4          27            3+4 = 7       0.4          0.13       13
80 – 90   3          30            3             0.3          0.10       10
Total     N = 30                                              1          100
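The columns of this table can be recomputed with a short Python sketch. The variable names are illustrative, and the rounding to two decimals (and to whole percentages) is chosen here to match the figures shown in the table.

```python
from itertools import accumulate

classes = [(30, 40), (40, 50), (50, 60), (60, 70), (70, 80), (80, 90)]
freq = [5, 8, 6, 4, 4, 3]
n = sum(freq)  # N = 30

# Less than (upward) CF: running total from the first class
less_than_cf = list(accumulate(freq))
# More than (downward) CF: running total from the last class
more_than_cf = list(accumulate(reversed(freq)))[::-1]
# Frequency density: frequency divided by class interval
density = [f / (hi - lo) for f, (lo, hi) in zip(freq, classes)]
# Relative and percentage frequency
relative = [round(f / n, 2) for f in freq]
percentage = [round(100 * f / n) for f in freq]

print(less_than_cf)  # [5, 13, 19, 23, 27, 30]
print(more_than_cf)  # [30, 25, 17, 11, 7, 3]
print(density)       # [0.5, 0.8, 0.6, 0.4, 0.4, 0.3]
print(relative)      # [0.17, 0.27, 0.2, 0.13, 0.13, 0.1]
print(percentage)    # [17, 27, 20, 13, 13, 10]
```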
Presentation of Data:

The organized data can be presented by diagrams or
graphs in two ways:

i) Quantitative presentation (stem-leaf plot, frequency curve,
frequency polygon, ogive curve, histogram)
ii) Qualitative presentation (bar diagram, pie chart, historigram)
i) Quantitative data presentation:

a) Stem-leaf plot:
It is a graphical technique for representing quantitative data that can be
used to examine the shape of the distribution, the range of the values and
the point of concentration of the values. Each numerical value is divided into
two parts, namely stem and leaf. Usually the stem is the leading digit or digits,
the integer part, or any other suitable part of the observed value, and the leaf
is the trailing digit or the decimal place.

Example: For the observed value 27, 2 is the stem and 7 is the leaf
For the observed value 127, 12 is the stem and 7 is the leaf
For the observed value 12.7, 12 is the stem and 7 is the leaf
Problem-1:
The following values are the marks obtained by 30 students:

34 36 31 46 76 86 42 44 32 46 40 54 66 56 50
42 33 80 77 81 46 40 60 63 64 76 56 57 57 70

Use a stem-leaf plot to display the data.

Here,
Maximum value = 86
Minimum value = 31
The stem-leaf plot for the given data is given below:
Stem │ Leaf
3    │ 4 6 1 2 3
4    │ 6 2 4 6 0 2 6 0
5    │ 4 6 0 6 7 7
6    │ 6 0 3 4
7    │ 6 7 6 0
8    │ 6 0 1
By arranging the leaves in ascending order, the final stem-leaf plot will be:
Stem │ Leaf              │ Frequency
3    │ 1 2 3 4 6         │ 5
4    │ 0 0 2 2 4 6 6 6   │ 8
5    │ 0 4 6 6 7 7       │ 6
6    │ 0 3 4 6           │ 4
7    │ 0 6 6 7           │ 4
8    │ 0 1 6             │ 3
Total                      N = 30

Where 3│1 means 31 and class interval = 10 (30-40, 40-50, …)
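The sorted plot for Problem-1 can be reproduced with a few lines of Python. This is a minimal sketch that assumes two-digit whole numbers, so that the tens digit is the stem and the units digit is the leaf; the names used are illustrative.

```python
from collections import defaultdict

marks = [34, 36, 31, 46, 76, 86, 42, 44, 32, 46, 40, 54, 66, 56, 50,
         42, 33, 80, 77, 81, 46, 40, 60, 63, 64, 76, 56, 57, 57, 70]

# Split every value into stem (tens digit) and leaf (units digit)
stems = defaultdict(list)
for m in marks:
    stem, leaf = divmod(m, 10)   # e.g. 31 gives stem 3, leaf 1
    stems[stem].append(leaf)

# Print the stems in order with their leaves sorted ascending
for stem in sorted(stems):
    leaves = sorted(stems[stem])
    print(f"{stem} | {' '.join(map(str, leaves))}  (f = {len(leaves)})")
```

Dividing by a different number in `divmod` (for instance 5) would give stems of another width, as in Problem-2, where the leaf is the offset from the stem.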


Problem-2:
The typing speeds of 24 students were recorded as follows:

13 12 6 8 15 18 17 24 28 23 27 23
21 20 15 18 23 25 23 13 17 18 19 18

Use a stem-leaf plot to display the data.

Here,
Maximum value = 28
Minimum value = 6
In general, the number of stems lies between 5 and 25.
So the stem-leaf plot for the given data will be as follows:

Stem │ Leaf
5    │ 1 3
10   │ 3 2 3
15   │ 0 3 2 0 3 2 3 4 3
20   │ 4 3 3 1 0 3 3
25   │ 3 2 0
By arranging the leaves in ascending order, the final stem-leaf plot will be:
Stem │ Leaf                │ Frequency
5    │ 1 3                 │ 2
10   │ 2 3 3               │ 3
15   │ 0 0 2 2 3 3 3 3 4   │ 9
20   │ 0 1 3 3 3 3 4       │ 7
25   │ 0 2 3               │ 3
Total                        N = 24

Where 5│1 means 5+1 = 6 and class interval = 5 (5-10, 10-15, …)


Problem-3:
The price-earnings ratios of 20 stocks were recorded as follows:

20.8 20.9 22.0 22.3 22.6 21.7 20.4 21.4 23.3 19.8
20.9 21.0 22.6 21.5 22.2 19.4 20.4 21.5 22.7 21.3

Use a stem-leaf plot to display the data, where 19│4 means 19.4.

Here,
Maximum value = 23.3
Minimum value = 19.4
The stem-leaf plot for the given data will be as follows:
Stem │ Leaf
19   │ 8 4
20   │ 8 9 4 9 4
21   │ 7 4 0 5 5 3
22   │ 0 3 6 6 2 7
23   │ 3
By arranging the leaves in ascending order, the final stem-leaf plot will be:
Stem │ Leaf          │ Frequency
19   │ 4 8           │ 2
20   │ 4 4 8 9 9     │ 5
21   │ 0 3 4 5 5 7   │ 6
22   │ 0 2 3 6 6 7   │ 6
23   │ 3             │ 1
Total                  N = 20

Where 19│4 means 19.4 and class interval = 1 (19-20, 20-21, …)


Importance or uses of stem-leaf plot:
• The range can be measured easily
• The nature of the frequency distribution is easily seen
• The identity of each observation is maintained
• Data can be represented easily in a simple and scientific form
• The real mode can be determined
• Extreme values can be identified easily
b) Frequency Curve:
When the frequencies are plotted against the mid values of the
class intervals and the points are joined by a smooth freehand curve,
the curve obtained is called a frequency curve.
Problem:
Construct a frequency curve from the following distribution:

Class      0-10  10-20  20-30  30-40  40-50  50-60  60-70  Total
Frequency  4     6      8      15     12     4      1      50

Drawing Procedure: Plot the mid points of the class intervals along the X-axis and
the corresponding frequencies along the Y-axis. Put a dot at the frequency of each
mid point and then join the dots with a smooth freehand curve. The freehand curve
thus obtained is the required frequency curve.
[Figure: Frequency curve. X-axis: mid value (5, 15, 25, 35, 45, 55, 65); Y-axis: frequency. Scale: along X-axis 1 square = 2 units, along Y-axis 1 square = 1 unit.]
c) Frequency Polygon:
When the frequencies are plotted against the mid values of the
class intervals and the points are joined by straight lines, the polygon
obtained is called a frequency polygon.
Problem:
Construct a frequency polygon from the following distribution:

Class      0-10  10-20  20-30  30-40  40-50  50-60  60-70  Total
Frequency  4     6      8      15     12     4      1      50

Drawing Procedure: Plot the mid points of the class intervals along the X-axis and
the corresponding frequencies along the Y-axis. Put a dot at the frequency of each
mid point and then join the dots by straight lines one by one. Join the first dot
to the mid point of the preceding class and the last dot to the mid point of the
following class, both on the X-axis. The polygon thus
obtained is the required frequency polygon.
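The points of this polygon can be listed with a small Python sketch. It only computes coordinates and does not draw anything; the names are illustrative, and the two closing points assume imaginary classes of the same width with zero frequency before the first class and after the last one.

```python
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
freq = [4, 6, 8, 15, 12, 4, 1]

# Mid value of each class: (lower limit + upper limit) / 2
mid = [(lo + hi) / 2 for lo, hi in classes]
width = classes[0][1] - classes[0][0]

# Close the polygon on the X-axis at the previous and next mid points
points = [(mid[0] - width, 0)] + list(zip(mid, freq)) + [(mid[-1] + width, 0)]

for x, y in points:
    print(x, y)
```

Joining consecutive points with straight segments gives the polygon; joining the seven middle points with a smooth curve instead gives the frequency curve.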
[Figure: Frequency polygon. X-axis: mid value (5, 15, 25, 35, 45, 55, 65, 75); Y-axis: frequency. Scale: along X-axis 1 square = 3 units, along Y-axis 1 square = 1 unit.]
d) Histogram/column diagram:
A histogram is a suitable graph for representing the frequency distribution of
a continuous or grouped series (exclusive). If the class limits are inclusive,
they must be converted to exclusive. The class limits are plotted along the
X-axis and the frequencies along the Y-axis. A rectangle (column) is drawn
on each class, where the height of the rectangle is equal to the
respective frequency and the width is equal to the class interval. All
rectangles should be adjacent to each other.

Uses:
i) A frequency curve and a frequency polygon can be drawn from a histogram by
joining the midpoints of the tops of the rectangles.
ii) The mode can be determined from the histogram of a frequency distribution.
iii) Histograms are used to present social, economic and research data.
Problem:
1) Draw a histogram from the following frequency distribution

Class      20-30  30-40  40-50  50-60  60-70
Frequency  6      8      5      4      3

Drawing procedure:
On graph paper, plot the class limits along the X-axis and the frequencies along the
Y-axis. Draw a rectangle on each class, where the height of the
rectangle is equal to the respective frequency and the width is equal
to the class interval. All rectangles should be adjacent to each other.
The graph thus obtained is the required histogram.
[Figure: Histogram. X-axis: class limits (20 to 70); Y-axis: frequency; bar heights 6, 8, 5, 4, 3. Scale: along X-axis 1 square = 2 units, along Y-axis 2 squares = 1 unit.]
[Figure: Frequency curve from histogram. Same axes, bars and scale as the histogram.]
[Figure: Frequency polygon from histogram. Same axes, bars and scale as the histogram.]
Problem:
2) Draw a histogram from the following frequency distribution

Class      20-29  30-39  40-49  50-59  60-69
Frequency  6      8      5      4      3

Drawing procedure:
First, convert the inclusive class intervals into exclusive class intervals
by adding 0.5 to each upper limit and subtracting 0.5 from each lower limit, where
0.5 = (lower limit of next class – upper limit of previous class)/2 = (30 – 29)/2.

Class      19.5-29.5  29.5-39.5  39.5-49.5  49.5-59.5  59.5-69.5
Frequency  6          8          5          4          3

Now, on graph paper, plot the class limits along the X-axis and the frequencies
along the Y-axis. Draw a rectangle on each class, where the
height of the rectangle is equal to the respective frequency and the
width is equal to the class interval. All rectangles should be
adjacent to each other. The graph thus obtained is the required
histogram.
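The conversion from inclusive to exclusive class limits can be sketched in Python as follows. The names are illustrative, and the sketch assumes that the gap between consecutive classes is the same throughout, as it is in this problem.

```python
inclusive = [(20, 29), (30, 39), (40, 49), (50, 59), (60, 69)]

# Gap between the upper limit of one class and the lower limit of the next
gap = inclusive[1][0] - inclusive[0][1]   # 30 - 29 = 1
half = gap / 2                            # correction of 0.5

# Subtract the correction from each lower limit and add it to each upper limit
exclusive = [(lo - half, hi + half) for lo, hi in inclusive]

print(exclusive)
# [(19.5, 29.5), (29.5, 39.5), (39.5, 49.5), (49.5, 59.5), (59.5, 69.5)]
```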
[Figure: Histogram. X-axis: class limits (19.5, 29.5, 39.5, 49.5, 59.5, 69.5); Y-axis: frequency; bar heights 6, 8, 5, 4, 3. Scale: along X-axis 1 square = 2 units, along Y-axis 2 squares = 1 unit.]
Problem:
3) Draw a histogram from the following frequency distribution

Class      20-25  25-40  40-50  50-60  60-65
Frequency  6      8      5      4      3

Drawing procedure:
Since the class intervals in the given frequency distribution are not equal,
we should use frequency density (frequency divided by class interval) instead
of frequency to get a better graph. The frequency densities are shown below:

Class              20-25  25-40  40-50  50-60  60-65
Frequency density  1.20   0.53   0.50   0.40   0.60

Now, on graph paper, plot the class limits along the X-axis and the frequency
density along the Y-axis. Draw a rectangle on each class, where the
height of the rectangle is equal to the respective
frequency density and the width is equal to the class interval. All
rectangles should be adjacent to each other. The graph thus
obtained is the required histogram.
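The frequency densities in the table above can be checked with a short Python sketch; the names are illustrative and the rounding to two decimals matches the table.

```python
classes = [(20, 25), (25, 40), (40, 50), (50, 60), (60, 65)]
freq = [6, 8, 5, 4, 3]

# Frequency density = frequency / class interval (class width)
density = [round(f / (hi - lo), 2) for f, (lo, hi) in zip(freq, classes)]

print(density)  # [1.2, 0.53, 0.5, 0.4, 0.6]
```

With unequal widths the area of each rectangle (density times width), rather than its height, stays proportional to the class frequency, which is why density is plotted.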
[Figure: Histogram. X-axis: class limits (20, 25, 40, 50, 60, 65); Y-axis: frequency density; bar heights 1.2, 0.53, 0.5, 0.4, 0.6. Scale: along X-axis 1 square = 2 units, along Y-axis 1 square = 0.1 unit.]
e) Ogive curve:
When the cumulative frequencies of a frequency distribution are plotted
against the upper or lower class limits, the curve obtained is
called a cumulative frequency curve or ogive curve.
There are two types of ogive curve:
1) Less than ogive: Here the cumulative frequencies increase from class to class
and are plotted against the upper limits of the classes. It is an upward curve.
2) More than ogive: Here the cumulative frequencies decrease from class to class
and are plotted against the lower limits of the classes. It is a
downward curve.

Uses: i) to determine the median and quantiles.
ii) to compare the run rates of two teams in cricket.
Example:
Over  1  2   3   4   5   6   7   8   9   10  11  12  13  14  15  16   17   18   19   20
BN    6  10  15  25  28  36  42  51  60  65  72  77  85  89  94  100  112  124  135  142
NZ    4  12  20  25  32  39  45  52  55  64  69  74  82  90  92  99   115  122  130  137

[Figure: Less than ogive of the runs of BN and NZ. X-axis: over; Y-axis: run.]
Problem: Draw a less than ogive and a more than ogive
from the data given below:

Class     Frequency f
30 – 40   5
40 – 50   8
50 – 60   6
60 – 70   4
70 – 80   4
80 – 90   3
Total     N = 30
Drawing Procedure: Calculate the less than and more than
cumulative frequencies. For the less than ogive the cumulative frequencies are plotted
against the upper limits of the classes, and for the more than ogive they are plotted
against the lower limits of the classes. Put a dot at the cumulative frequency of each
class limit and then join the dots with a smooth freehand curve. The freehand curve
thus obtained is the ogive curve.

Class     Frequency  Less than/upward  More than/downward
          f          F                 F
30 – 40   5          5                 30
40 – 50   8          13                25
50 – 60   6          19                17
60 – 70   4          23                11
70 – 80   4          27                7
80 – 90   3          30                3
Total     N = 30
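The points of both ogives, and a reading of the median from the less than ogive, can be sketched in Python. Two assumptions are made here that are not in the lecture: the less than ogive is started at the point (30, 0), and the curve is read by straight-line interpolation between neighbouring points instead of from a freehand curve. The function name `read_ogive` is hypothetical.

```python
lower = [30, 40, 50, 60, 70, 80]
upper = [40, 50, 60, 70, 80, 90]
less_than_cf = [5, 13, 19, 23, 27, 30]
more_than_cf = [30, 25, 17, 11, 7, 3]

# Less than ogive: CF against upper limits (assumed to start at (30, 0))
less_points = [(lower[0], 0)] + list(zip(upper, less_than_cf))
# More than ogive: CF against lower limits
more_points = list(zip(lower, more_than_cf))

def read_ogive(points, target):
    """Interpolate linearly to find x where a rising ogive reaches `target`."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1:
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target lies outside the plotted range")

n = less_than_cf[-1]
median = read_ogive(less_points, n / 2)   # value at cumulative frequency N/2 = 15
print(round(median, 2))
```

Reading the curve at N/4 or 3N/4 in the same way would give the quartiles.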
[Figure: Less than ogive curve (in Excel). X-axis: upper limit; Y-axis: cumulative frequency.]
[Figure: Less than ogive curve (manual). X-axis: upper limit (40 to 90); Y-axis: cumulative frequency. Scale: along X-axis 1 square = 2 units, along Y-axis 1 square = 2 units.]
[Figure: More than ogive curve (in Excel). X-axis: lower limit; Y-axis: cumulative frequency.]
[Figure: More than ogive curve (manual). X-axis: lower limit (30 to 80); Y-axis: cumulative frequency. Scale: along X-axis 1 square = 2 units, along Y-axis 1 square = 2 units.]
ii) Qualitative/categorical data presentation:
a) Component or Bar diagram
A bar diagram is a suitable graph for representing the frequency distribution
of categorical, qualitative or time series data. The categories are
plotted along the X-axis and the frequencies along the Y-axis. A rectangle
or bar is drawn on each category, where the height of the bar is equal to the
respective frequency. There must be gaps between the bars. All
bars should be of equal width and equally spaced. Bar diagrams
play an important role in newspapers, journals and campaigns.
Types
1) Simple bar diagram: only one variable
2) Multiple bar diagram: two or more interrelated variables
Problems
1) Draw a suitable diagram for the given data

Game liked by cadets  Football  Cricket  Basketball  Volleyball
No. of cadets         120       80       60          40

Drawing Procedure
Since the variable ‘game liked by cadets’ is a categorical variable, we can
present the above data by a simple bar diagram. The categories (game
names) are plotted along the X-axis and the frequencies (no. of cadets)
along the Y-axis. A rectangle or bar is drawn on each category, where the height
of the bar is equal to the respective frequency. There must be gaps
between the bars. All bars should be of equal width and equally
spaced. The diagram thus obtained is the required bar diagram.
[Figure: Bar diagram (manual). X-axis: game name (Football, Cricket, Basketball, Volleyball); Y-axis: no. of cadets; bar heights 120, 80, 60, 40. Scale: along Y-axis 1 square = 10 units.]
[Figure: Bar diagram (in Excel). X-axis: game name; Y-axis: no. of cadets; bar heights 120, 80, 60, 40.]
2) Draw a suitable diagram for the given data for Bangladesh over the last 4 years

Year               2016  2017  2018  2019
Export (million $) 140   160   150   180
Import (million $) 170   180   160   190

Drawing Procedure
Since these are time series data with two interrelated variables, we can present them
by a multiple bar diagram. The categories (years) are plotted along the X-axis
and the values (export, import) along the Y-axis. Two adjacent rectangles or
bars are drawn on each category, where the height of each bar is equal to the
respective value. There must be gaps between the categories. All pairs of bars
should be of equal width and equally spaced. The diagram thus obtained
is the required multiple bar diagram.
[Figure: Multiple bar diagram (manual). X-axis: year (2016 to 2019); Y-axis: export, import; legend: Export, Import; bar heights as in the table. Scale: along Y-axis 1 square = 10 units.]
[Figure: Multiple bar diagram (in Excel). X-axis: year (2016 to 2019); Y-axis: export, import; legend: Export, Import.]
3) Present the following data by a suitable diagram

Year         2013  2014  2015  2016  2017  2018
Sales (Ton)  85    110   105   110   140   180

Since these are time series data, the suitable diagram is a bar diagram.
Drawing Procedure: The categories (years) are plotted along the
X-axis and the values (sales) along the Y-axis. A rectangle
or bar is drawn on each category, where the height of the bar is equal
to the respective value. There must be gaps between
the bars. All bars should be of equal width and equally
spaced. The diagram thus obtained is the required
bar diagram.
[Figure: Bar diagram. X-axis: year (2013 to 2018); Y-axis: sales; bar heights 85, 110, 105, 110, 140, 180.]
b) Pie-Chart or angular diagram
A pie-chart is a suitable graph for representing how a total
(such as a total budget) is distributed over the categories of categorical or
qualitative data. A circle of suitable radius is drawn to represent the total
frequency, and the circle is divided into sectors in proportion
to the magnitude of each item relative to the magnitude of all items. Pie-charts
play an important role in newspapers, journals and campaigns.
Problem
Draw a pie-chart for the following data

Game liked by cadets  Football  Cricket  Basketball  Volleyball  Total
No. of cadets         120       80       60          40          300
Drawing procedure:
For drawing a pie-chart the data are expressed as segments of 360°,
as shown below:

Game name    No. of cadets   Angle in degrees
Football     120             (120/300) × 360° = 144°
Cricket      80              (80/300) × 360° = 96°
Basketball   60              (60/300) × 360° = 72°
Volleyball   40              (40/300) × 360° = 48°
Total        300             360°
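The angle of each sector is the share of that category in the total, multiplied by 360°. A minimal Python sketch of this calculation for the cadet data follows; the dictionary and variable names are illustrative.

```python
cadets = {"Football": 120, "Cricket": 80, "Basketball": 60, "Volleyball": 40}
total = sum(cadets.values())  # 300

# Sector angle = (frequency / total frequency) x 360 degrees
angles = {game: count * 360 / total for game, count in cadets.items()}

for game, angle in angles.items():
    print(f"{game}: {angle:.0f} degrees")
print("Sum of angles:", sum(angles.values()))
```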
A circle is drawn with a suitable radius. The angles at the center of
the circle are drawn with the help of a protractor, dividing the circle
into sectors of the required proportions. Each sector is labeled individually. The
chart thus obtained is the required pie-chart.
[Figure: Pie-chart. Football: 120 (40%); Cricket: 80 (27%); Basketball: 60 (20%); Volleyball: 40 (13%).]


c) Historigram/simple line graph:
A set of data recorded over time is called a time series, i.e. a record
of the values of a variable taken at successive intervals of time
during a particular period. When the values of the variable are
plotted against time on graph paper and the points so obtained are
joined by straight line segments, the resulting line graph is called a
historigram. A historigram gives a rough idea
about the nature of changes in the time-dependent variable.
Problem
1) Present the following data by a suitable graph.

Year         2013  2014  2015  2016  2017  2018
Sales (Ton)  85    110   105   110   140   180


Since these are time series data, the suitable graph is a simple line graph or historigram.
Drawing Procedure: Plot the years along the X-axis and the sales along the Y-axis.
Put a dot at the sales value of each year and then join the dots by straight
line segments one by one. The line graph thus obtained is the required
historigram.
[Figure: Historigram. X-axis: year (2013 to 2018); Y-axis: sales; plotted values 85, 110, 105, 110, 140, 180.]
2) Draw a suitable graph for the given data for Bangladesh over the last 4 years

Year               2016  2017  2018  2019
Export (million $) 140   190   150   180
Import (million $) 170   180   160   190

Since these are time series data, the suitable graph is a multiple line
graph or historigram.
Drawing Procedure: Plot the years along the X-axis and the exports and imports
along the Y-axis. Put a dot at the export value of each year and
then join the dots by straight line segments one by one. The line graph
thus obtained is the required historigram for exports. Similarly, we
get another line graph for imports.
[Figure: Multiple line graph. X-axis: years (2016 to 2019); Y-axis: export, import; legend: Export (million $), Import (million $).]
