Data Distribution-1
Min = 15
Max = 150
Median (2nd quartile, Q2) = 30
1st quartile (Q1) = 19
3rd quartile (Q3) = 50.5
PRESENTATION OF DATA
• In most cases, useful information is not immediately evident from the mass of
unsorted data.
• The data therefore need to be organized and summarized so as to present the
information they contain in a way that will show patterns of variation clearly.
• Ordered array
• Frequency distribution
Line graphs
Bar graph
Histogram
Pie chart
Frequency polygon
Frequency curve
An ordered array is a listing of the values in order of magnitude from the smallest to
the largest.
• This will enable us to know the range over which the items are spread
• An ordered array is an appropriate way of presentation when the data are small in size
• To group a larger set of observations, we select a set of contiguous, non-
overlapping intervals such that each value in the set of observations can be placed
in one, and only one, of the intervals. These intervals are called "class intervals".
• The two end values of a class interval are called class limits
• The smaller value is the lower class limit and the larger value is the upper class limit
• The number of items or values belonging to each class interval is called the class
frequency
• Frequency distribution arranged in a grouped form is referred to as grouped
frequency distribution
The following table gives the hemoglobin level (g/dl) of a sample of 50 men.
17.0, 17.7, 15.9, 15.2, 16.2, 17.1, 15.7, 17.3, 13.5, 16.3,
14.6, 15.8, 15.3, 16.4, 13.7, 16.2, 16.4, 16.1, 17.0, 15.9
15.9, 15.3, 13.9, 16.8, 15.9, 16.3, 17.4, 15.0, 17.5, 16.1,
14.2, 16.1, 15.7, 15.1, 17.4, 16.5, 14.4, 16.3, 17.3, 15.8.
• We wish to summarize these data using the following class intervals:
13.0-13.9, 14.0-14.9, 15.0-15.9, 16.0-16.9, 17.0-17.9, 18.0-18.9
Solution:
• Sample size = n = 50
• Max= 18.3
• Min= 13.5
• Notes:
The grouped frequency distribution for the hemoglobin level of the 50 men is:
Class Interval (Haemoglobin level) Frequency (No. of men)
13.0-13.9 3
14.0-14.9 5
15.0-15.9 15
16.0-16.9 16
17.0-17.9 10
18.0-18.9 1
Total n = 50
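The tallying behind a table like the one above can be sketched in a few lines of Python. This is only an illustration: the class limits are copied from the table, but the `values` list holds just the first ten readings listed earlier, so its counts are not the counts in the table. The function name `grouped_frequencies` is a hypothetical helper, not part of the original notes.

```python
# Class limits taken from the table above.
class_intervals = [(13.0, 13.9), (14.0, 14.9), (15.0, 15.9),
                   (16.0, 16.9), (17.0, 17.9), (18.0, 18.9)]

# Illustration only: the first ten readings listed above, not the full sample.
values = [17.0, 17.7, 15.9, 15.2, 16.2, 17.1, 15.7, 17.3, 13.5, 16.3]


def grouped_frequencies(values, intervals):
    """Count how many values fall in each class interval (limits inclusive)."""
    counts = [0] * len(intervals)
    for v in values:
        for i, (lower, upper) in enumerate(intervals):
            if lower <= v <= upper:
                counts[i] += 1
                break
        else:
            # Every value must belong to one, and only one, interval.
            raise ValueError(f"{v} does not fit any class interval")
    return counts


frequencies = grouped_frequencies(values, class_intervals)
for (lower, upper), f in zip(class_intervals, frequencies):
    print(f"{lower:.1f}-{upper:.1f}  {f}")
```

Because the intervals do not overlap, each value is counted once, and the class frequencies always add up to the number of values supplied.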
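• Cumulative Frequencies: When frequencies of two or more classes are added up, such
total frequencies are called cumulative frequencies.
These frequencies help us to find the total number of items whose values are less than or
greater than some value.
• If the cumulation runs from the lowest size of the variable to the highest size, the
resulting distribution is called a 'less than cumulative frequency
distribution', and if the cumulation is from the highest to the lowest value the
resulting distribution is called a 'more than cumulative frequency distribution'.
The less than form is the most commonly used cumulative frequency.
• Relative Frequencies: Relative frequencies express the frequency of each value or class
as a fraction or percentage of the total frequency.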
• Relative frequency = frequency/n
• Percentage frequency = Relative frequency × 100%
The no. of people whose hemoglobin levels are between 17.0 and 17.9 = 10
The no. of people whose hemoglobin levels are less than or equal to 15.9 = 23
The no. of people whose hemoglobin levels are less than or equal to 17.9 = 49
The percentage of people whose hemoglobin levels are between 17.0 and
17.9 = 20%
• From cumulative percentage frequencies:
The percentage of people whose hemoglobin levels are less than or equal to 14.9 =
16%
The percentage of people whose hemoglobin levels are less than or equal to 16.9 =
78%
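The figures quoted above can be checked with a short Python sketch. It starts from the class frequencies in the grouped table and applies the two formulas (relative frequency = frequency/n, percentage frequency = relative frequency × 100%). The variable names are chosen here for illustration only.

```python
from itertools import accumulate

classes = ["13.0-13.9", "14.0-14.9", "15.0-15.9",
           "16.0-16.9", "17.0-17.9", "18.0-18.9"]
frequencies = [3, 5, 15, 16, 10, 1]      # from the grouped table
n = sum(frequencies)                     # sample size

# "Less than" cumulation: running total from the lowest class upward.
cumulative = list(accumulate(frequencies))
relative = [f / n for f in frequencies]
percentage = [100 * r for r in relative]
cumulative_percentage = [100 * c / n for c in cumulative]

for row in zip(classes, frequencies, cumulative, percentage, cumulative_percentage):
    print("{}  f={:2d}  cum={:2d}  %={:5.1f}  cum%={:5.1f}".format(*row))
```

Reading the rows for 14.0-14.9, 15.9, 16.9 and 17.9 reproduces the counts 23 and 49 and the percentages 20%, 16% and 78% given above.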
• Frequency density of a class = frequency of the class / width of the class.
Class Interval True Class Interval Midpoint Frequency
(Boundary points)
13.0-13.9 12.95-13.95 13.45 3
E.G:
Note:
• There are no gaps between true class intervals. The end-point (true upper
limit) of each true class interval equals the start-point (true lower limit) of
the following true class interval
True limits (or class boundaries)
• They are points that demarcate the true upper limit of one class and the true
lower limit of the next.
• The true limits are what the tabulated limits would correspond with if one
could measure exactly.
• e.g. the class boundary between classes (13.0-13.9) and (14.0-14.9) is 13.95.
• It is the upper boundary for the former and lower boundary for the latter.
Class boundaries may replace class limits during statistical manipulations.
• The width of a class is found from the true class limit by
subtracting the true lower limit from the upper true limit of any particular
class.
e.g.
using the above frequency distribution table, class width for the first class is
1 i.e. (13.95-12.95) = 1
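As a rough sketch, the true class limits, midpoints, class widths and frequency densities for the whole hemoglobin table can be computed as below. It assumes the readings are recorded to the nearest 0.1 g/dl, so each tabulated limit is moved outward by half a unit (0.05); that assumption is consistent with the 12.95-13.95 row shown above but is not stated in the notes.

```python
class_limits = [(13.0, 13.9), (14.0, 14.9), (15.0, 15.9),
                (16.0, 16.9), (17.0, 17.9), (18.0, 18.9)]
frequencies = [3, 5, 15, 16, 10, 1]
half_unit = 0.05   # assumption: half of the 0.1 g/dl recording unit

rows = []
for (lower, upper), f in zip(class_limits, frequencies):
    # Rounding to 2 decimals removes floating-point noise such as 13.950000000000001.
    true_lower = round(lower - half_unit, 2)
    true_upper = round(upper + half_unit, 2)
    midpoint = round((true_lower + true_upper) / 2, 2)
    width = round(true_upper - true_lower, 2)
    density = f / width            # frequency density = frequency / class width
    rows.append((true_lower, true_upper, midpoint, width, density))

for r in rows:
    print("{:.2f}-{:.2f}  mid={:.2f}  width={:.1f}  density={:.1f}".format(*r))
```

All classes here have width 1, so the frequency density equals the frequency; the two differ only when class widths are unequal.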
GRAPHICAL PRESENTATION OF DATA
• Graphs and diagrams present data by means of lines, dots or figures.
• The drawings are meant for non-statistically minded people who want to
study the relative values or frequencies of persons or events.
• For statistically minded persons, they are for quick eye readings.
Importance of Diagrammatic Representation
• They have greater attraction than mere figures. They give delight to the eye
and add a spark of interest.
• They help in deriving the required information in less time and without any
mental strain.
• They may reveal unsuspected patterns in a complex set of data and may
suggest directions in which changes are occurring. This warns us to take
immediate action.
• They have greater memorising value than mere figures. This is so because the
impression left by the diagram is of a lasting nature.
• Limitation: a diagram can give only an approximate idea, so where greater accuracy is
needed diagrams are not suitable.
• The most common ones used depend on whether the data in question are
quantitative continuous or qualitative (or quantitative discrete)
Histogram
Frequency polygon
Frequency curve
• Bar diagram
• There are, however, general rules that are commonly accepted when
constructing graphs.
Titles are usually placed below the graph and should answer the questions: What?
Where? When? How classified?
Each bar is drawn to represent the size of the class interval by its
width and the frequency in each class interval by its height.
Variable characters of the different groups are indicated on the horizontal line
(x-axis) called abscissa while frequency, i.e. number of observations is marked
on the vertical line (y-axis) called ordinate.
The size of the rectangle is determined by the intervals in between the classes.
Age groups in years No. of women
15-19 8
20-24 16
25-29 32
30-34 28
35-39 12
40-44 4
In other words, it is obtained by joining the mid points of the tops of the
rectangles in a histogram by straight lines.
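A minimal sketch of the points of a frequency polygon for the age-group table above. Two assumptions are made here that the notes do not spell out: the midpoint of each class is taken as the average of its tabulated limits, and the "closed" polygon is formed by adding one empty class of the same width at each end so that the line returns to the horizontal axis.

```python
age_groups = [(15, 19), (20, 24), (25, 29), (30, 34), (35, 39), (40, 44)]
women = [8, 16, 32, 28, 12, 4]

# Midpoint of each class (assumption: average of the tabulated limits).
midpoints = [(lo + hi) / 2 for lo, hi in age_groups]

# Open polygon: one point per class, joined by straight lines.
open_polygon = list(zip(midpoints, women))

# Closed polygon: extend to the midpoints of the empty neighbouring classes.
step = midpoints[1] - midpoints[0]
closed_polygon = ([(midpoints[0] - step, 0)]
                  + open_polygon
                  + [(midpoints[-1] + step, 0)])

for x, y in closed_polygon:
    print(f"age {x:4.1f}  women {y}")
```

The resulting (x, y) pairs can be passed to any plotting tool; no histogram needs to be drawn first.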
Frequency polygon (Open)
Frequency Polygon (Closed)
Note: It is not essential to draw a histogram in order to obtain a frequency polygon. It can be drawn without erecting
the rectangles of a histogram as follows:
It is used when sets of data are to be illustrated on the same diagram such as
birth and death rates, birth of diabetics and non diabetics, etc.
bell shaped
rectangular distribution
Skewed positively
Skewed negatively
Cumulative Frequency Curve (Ogive)
• When the cumulative frequencies of a distribution are graphed the resulting
curve is called Ogive Curve or Cumulative frequency curve.
• The cumulative frequencies are then plotted corresponding to the upper limits
of the classes.
• Note: One can find the median, quartiles and percentiles using ogive curve.
Heights of groups (cm) Frequencies Cumulative frequencies
160-162 10 10
162-164 15 25
164-166 17 42
166-168 19 61
168-170 20 81
170-172 26 107
172-174 29 136
174-176 30 166
176-178 22 188
178-180 12 200
Total 200
Cumulative frequency diagram showing height values of the median (Q2),
first or lower quartile (Q1), third or upper quartile (Q3) and tenth
percentile (P10)
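Reading the median, quartiles and percentiles off the ogive can be imitated numerically. The sketch below assumes the ogive is drawn with straight lines between the plotted points (cumulative frequency against the upper limit of each class, starting from 0 at 160 cm) and interpolates linearly along them. The function name `read_ogive` is hypothetical.

```python
upper_limits = [162, 164, 166, 168, 170, 172, 174, 176, 178, 180]
cumulative = [10, 25, 42, 61, 81, 107, 136, 166, 188, 200]
lowest_limit = 160          # the ogive starts here with cumulative frequency 0


def read_ogive(fraction):
    """Height below which the given fraction of the observations lie."""
    target = fraction * cumulative[-1]
    prev_x, prev_c = lowest_limit, 0
    for x, c in zip(upper_limits, cumulative):
        if target <= c:
            # Linear interpolation along the ogive segment that contains the target.
            return prev_x + (target - prev_c) / (c - prev_c) * (x - prev_x)
        prev_x, prev_c = x, c
    return upper_limits[-1]


median = read_ogive(0.50)   # Q2
q1 = read_ogive(0.25)
q3 = read_ogive(0.75)
p10 = read_ogive(0.10)
print(f"Q1={q1:.2f}  median={median:.2f}  Q3={q3:.2f}  P10={p10:.2f}")
```

With the table above this gives roughly Q1 = 166.84 cm, median = 171.46 cm, Q3 = 174.93 cm and P10 = 163.33 cm; values read by eye from a hand-drawn curve will differ slightly.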