
Group presentation of Data

Syed S. Hossain
Institute of Statistical Research and Training
University of Dhaka, Bangladesh.
shahadat@isrt.ac.bd


Grouping Data: Concept


Consider the lengths of 40 maple leaves:
136 164 150 132 144 125 149 157 146 158
140 147 136 148 152 144 168 126 138 176
163 119 154 165 146 173 142 147 135 153
140 135 161 145 135 142 156 156 145 128
A stem-and-leaf display of the data gives:


Grouping Data: Concept


The stem-and-leaf display may be thought of as a grouping of the data.

Grouping Data
By suitably organizing data, we can often make a large and
complicated set of data more compact and easier to understand.
Grouping involves, as the term implies, putting data into groups
rather than treating each observation individually.

Grouping Data (Cont.)

Frequency Distribution
A tabular arrangement of data by classes together with the
corresponding number of items in each class is called a frequency
distribution or frequency table. The given data can be more
rigorously tabulated as follows:

Class      Class   Frequency   Relative    Cumulative
Interval   Mark                Frequency   Frequency
118-128    123      3          0.075        3
128-138    133      6          0.150        9
138-148    143     14          0.350       23
148-158    153      9          0.225       32
158-168    163      5          0.125       37
168-178    173      3          0.075       40

Grouping Data (Cont.)

Terminologies
Class intervals: A symbol defining a class such as 118-128 in
the above table is called a class interval.
Class limits: The end numbers 118 and 128 are called class
limits, the smaller number 118 is the lower class limit and the
larger number 128 is the upper class limit.
Size of a class interval: The size of a class interval is the
difference between the lower and upper class limits. Class size
is also known as class width or class length.
In the given table, class width = 128 - 118 = 10.


Grouping Data (Cont.)


Terminologies (Cont.)
Class boundaries or cut points: If non-overlapping limits are
used, for example 118-127 followed by 128-137, then the first
class theoretically includes all measurements from 117.5 to
127.5. The exact numbers 117.5 and 127.5 are called class
boundaries or true class limits. Note that in the case of
overlapping class intervals, the class boundaries are the same
as the class limits.
The class mark: The class mark is the midpoint of the class
interval and is obtained by adding the lower and upper class
limits and dividing by 2. The class mark is also called the
class midpoint.
The class mark of the interval 118-128 is $\frac{118 + 128}{2} = 123$.
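The two definitions above can be sketched in Python as follows. This is only an illustration: the function names `class_boundaries` and `class_mark` are hypothetical, and the half-unit adjustment assumes the measurements are recorded to the nearest whole unit (`unit=1`), which the slides imply but do not state.

```python
def class_boundaries(limits, unit=1):
    """Return (lower, upper) boundaries for non-overlapping class limits.

    Assumption: measurements are recorded to the nearest `unit`, so each
    limit is extended by half a unit on either side.
    """
    half = unit / 2
    return [(lower - half, upper + half) for lower, upper in limits]


def class_mark(lower, upper):
    """Midpoint of a class interval: (lower limit + upper limit) / 2."""
    return (lower + upper) / 2


# Non-overlapping limits from the example in the text.
print(class_boundaries([(118, 127), (128, 137)]))
# -> [(117.5, 127.5), (127.5, 137.5)]

# Overlapping interval 118-128 from the frequency table.
print(class_mark(118, 128))  # -> 123.0
```

Note that the upper boundary of one class coincides with the lower boundary of the next, so the boundaries leave no gaps between classes.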

Grouping Data (Cont.)

Terminologies (Cont.)
Frequency: The number of values that fall into a particular
class is known as the frequency.
Relative Frequency: The proportion of values that fall into a
particular class is known as the relative frequency.
Cumulative Frequency: The total frequency of all values less
than the upper class boundary of a given class interval is
called the cumulative frequency up to and including the class
interval.
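As a small check on these definitions, the sketch below recomputes the relative and cumulative frequency columns from the class frequencies given in the frequency table. The variable names are illustrative only.

```python
from itertools import accumulate

# Class frequencies taken from the frequency table in the text.
frequencies = [3, 6, 14, 9, 5, 3]

total = sum(frequencies)                      # number of observations
relative = [f / total for f in frequencies]   # proportion of values per class
cumulative = list(accumulate(frequencies))    # running total up to each class

for f, r, c in zip(frequencies, relative, cumulative):
    print(f"{f:>3}  {r:.3f}  {c:>3}")
```

The relative frequencies sum to 1, and the last cumulative frequency equals the total number of observations (40).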


General rules for forming frequency distribution


The following steps can be considered as the general rules for
constructing frequency distributions:
1 Determine the largest and smallest numbers in the raw data
and thus find the range (difference between largest and
smallest numbers).
2 Determine the number of classes.
3 Calculate the size of the class interval, $i$, using the following formula:
$i = \frac{\text{Highest Value} - \text{Lowest Value}}{\text{Number of classes}}$.
4 Determine the number of observations falling into each class
interval, i.e. find the class frequencies. This is best done by
using a tally or score sheet.
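Steps 1 to 3 can be illustrated with the leaf lengths listed earlier. The sketch below assumes 6 classes (the number used in the frequency table) and rounds the computed size up to a whole number, which is a common convenience rather than something the rules above prescribe.

```python
import math

# Leaf lengths as listed in the text.
lengths = [
    136, 164, 150, 132, 144, 125, 149, 157, 146, 158,
    140, 147, 136, 148, 152, 144, 168, 126, 138, 176,
    163, 119, 154, 165, 146, 173, 142, 147, 135, 153,
    140, 135, 161, 145, 135, 142, 156, 156, 145, 128,
]

num_classes = 6  # assumption: matches the six classes in the table

data_range = max(lengths) - min(lengths)  # step 1: 176 - 119
raw_width = data_range / num_classes      # step 3: the formula for i
width = math.ceil(raw_width)              # assumption: round up to a whole number

print(data_range, raw_width, width)  # -> 57 9.5 10
```

The rounded size of 10 agrees with the class width used in the frequency table.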

Estimating number of classes


One's professional judgment can determine the number of classes.
Too many or too few classes might not reveal the basic shape
of the set of data.
As a general rule, it is best to use no fewer than 5 and no more
than 15 classes in the construction of a frequency distribution.
The number of classes can be estimated based on the number
of observations, n, using one of the rules:
Rules for estimating number of classes
1 The $2^k$ rule: As the number of classes, select the smallest
integer $k$ (whole number) such that $2^k \ge n$.
2 Alternatively, estimate the number of classes using the formula

$\text{Number of classes} = 1 + 3.322 \times \log_{10} n$.
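Both rules are easy to try out for the leaf data, where n = 40. The following is a minimal sketch; the function names are hypothetical, and the formula's result is left unrounded because the slides do not say how it should be rounded.

```python
import math


def classes_power_of_two(n):
    """Smallest integer k such that 2**k >= n (the 2^k rule)."""
    k = 0
    while 2 ** k < n:
        k += 1
    return k


def classes_log_formula(n):
    """Number of classes from 1 + 3.322 * log10(n), left unrounded."""
    return 1 + 3.322 * math.log10(n)


n = 40  # number of leaf lengths in the example data
print(classes_power_of_two(n))           # -> 6, since 2**5 = 32 < 40 <= 64 = 2**6
print(round(classes_log_formula(n), 2))  # -> 6.32
```

For this data set, both rules suggest about 6 classes, which is the number used in the frequency table.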


Graphically presenting Grouped Data: Histogram

A histogram or frequency histogram consists of a set of rectangles having
Bases on a horizontal axis, with centres at the class marks and
lengths equal to the class interval sizes,
Areas proportional to the class frequencies.
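The geometry of the rectangles can be sketched without a plotting library. The code below takes the intervals and frequencies from the frequency table and picks each height as frequency divided by width, so that each area equals the class frequency. That choice of scale is an assumption, since any constant multiple would also satisfy "proportional". The text bars are a rough stand-in for a real plot.

```python
# Class intervals and frequencies from the frequency table in the text.
intervals = [(118, 128), (128, 138), (138, 148),
             (148, 158), (158, 168), (168, 178)]
frequencies = [3, 6, 14, 9, 5, 3]

rectangles = []
for (lower, upper), freq in zip(intervals, frequencies):
    width = upper - lower          # base length = class interval size
    centre = (lower + upper) / 2   # base is centred at the class mark
    height = freq / width          # assumed scale: area equals frequency
    rectangles.append((centre, width, height))

# Crude text rendering: one '#' per 0.1 unit of height.
for centre, width, height in rectangles:
    print(f"{centre:5.1f} {'#' * round(height * 10)}")
```

Because all classes here have the same width, the heights are themselves proportional to the frequencies. With unequal widths, only the areas would be.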


Graphically presenting Grouped Data: Frequency Polygon

A frequency polygon is the graphical presentation of a
frequency distribution with the class frequencies plotted on
the Y axis against the corresponding mid-values (or class
marks) on the X axis.
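The points of the frequency polygon for the leaf data can be listed as follows. The class marks and frequencies come from the frequency table. Closing the polygon on the horizontal axis with one empty class at each end is a common convention that is assumed here, not something the slides state.

```python
# Class marks and frequencies from the frequency table in the text.
class_marks = [123, 133, 143, 153, 163, 173]
frequencies = [3, 6, 14, 9, 5, 3]
width = 10  # class interval size

# One (x, y) point per class: x = class mark, y = class frequency.
points = list(zip(class_marks, frequencies))

# Assumed convention: add a zero-frequency point one class width beyond
# each end so that the polygon closes on the horizontal axis.
closed = [(class_marks[0] - width, 0)] + points + [(class_marks[-1] + width, 0)]

print(closed)
```

Joining consecutive points in `closed` with straight line segments gives the polygon.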


Graphically presenting Grouped Data: Ogive

The graphical presentation with the cumulative frequency
plotted on the Y axis against the upper class boundary of the
given class on the X axis is known as a cumulative frequency
polygon or an ogive.
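For the leaf data, the points of the ogive can be computed as in the sketch below. The class intervals in the table overlap, so the upper class boundaries are the same as the upper class limits. Starting the curve at the lower boundary of the first class with a cumulative frequency of 0 is an assumed convention.

```python
from itertools import accumulate

# Upper class boundaries and frequencies from the frequency table in the text.
upper_boundaries = [128, 138, 148, 158, 168, 178]
frequencies = [3, 6, 14, 9, 5, 3]

cumulative = list(accumulate(frequencies))  # 3, 9, 23, 32, 37, 40

# Assumed convention: start at the lower boundary of the first class,
# below which no observations fall.
ogive_points = [(118, 0)] + list(zip(upper_boundaries, cumulative))

print(ogive_points)
```

The cumulative frequencies never decrease, so the ogive rises from 0 at the left to the total number of observations (40) at the right.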
