MFTMW Chapter 4 - Data Management
DATA MANAGEMENT
Learning competencies:
Introduction
Organizing, or managing, data is important for getting a clear picture of what the data
represent. It is essential in particular when conducting surveys, case studies, and research on
topics such as economic growth, population growth, and many others. Organized data depict the
true picture of the study.
Data management is also used in school administration. The number of enrollees, dropouts,
graduating students, and employees, the salaries of employees, school furniture and equipment,
school supplies, and many others are the important data needed in administering an
educational institution.
Data administration is widely used in every organization, public or private. Organizing
data is procedural, and it is beneficial only when it is done precisely.
Frequency Distribution Table. It is a table that shows organized data representing the
result of a study, for presentation and interpretation. The following are the important
terms needed in constructing a frequency distribution table.
Raw Data. It is an unfinished product: the data gathered through observation that still
need to be organized.
Range. It is the difference between the highest value and the lowest value in a set of
observations.
Frequency. It is the number of observations in each class interval.
Class Limit. It is the highest and the lowest value in a class interval. The highest value
is the upper limit and the lowest value is the lower limit of the class interval.
Class Boundaries. They are obtained by subtracting 0.5 from the lower limit and adding 0.5
to the upper limit of each class interval. The larger value is the upper class boundary
and the smaller value is the lower class boundary of the class interval.
Class Mark. It is the midpoint of each class interval: the sum of the upper and lower
limits divided by 2.
Cumulative Frequency. It is the sum of the frequencies accumulated from the first
class interval to the last class interval.
Relative Frequency. It is obtained by dividing each frequency by the total number of
observations and multiplying by 100.
There are two ways to find the number of classes ($k$) and the class size:
Rule 1. Use the smallest positive integer $k$ such that $2^k \ge n$, where $n$ is the number of
observations.
$$\text{Suggested class interval (class size)} = \frac{\text{Range}}{\text{Number of classes}}$$
Rule 2.
$$\text{Suggested class interval} = \frac{\text{Range}}{1 + 3.322 \log n}$$
where $n$ is the total number of observations.
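The two rules can be sketched in Python as follows. This is a minimal illustration, not part of the original chapter: the function names are hypothetical, and the logarithm in Rule 2 is assumed to be base 10.

```python
import math

def classes_rule1(n):
    """Smallest positive integer k with 2**k >= n (Rule 1)."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k

def class_size_rule1(value_range, n):
    """Range divided by the Rule 1 number of classes."""
    return value_range / classes_rule1(n)

def class_size_rule2(value_range, n):
    """Range divided by 1 + 3.322 log10(n) (Rule 2, base-10 log assumed)."""
    return value_range / (1 + 3.322 * math.log10(n))

# 30 observations with a range of 38
print(classes_rule1(30))          # 5, because 2**5 = 32 >= 30
print(class_size_rule1(38, 30))   # 7.6, rounded up to 8 in practice
print(class_size_rule2(38, 30))   # Rule 2 suggestion for the same data
```

Both rules only suggest a class size; the result is normally rounded to a convenient whole number.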
Example:
The following data are the numbers of enrollees in different colleges of a particular
university in Pangasinan.
Following Rule 2:
$$\text{Suggested class interval} = \frac{R}{1 + 3.322 \log n} = \frac{HV - LV}{1 + 3.322 \log 20} = \frac{75}{5.322021646} = 14.09 \approx 14$$
Histogram. It is a graph in which the class marks are on the horizontal axis (x axis) and the
class frequencies are on the vertical axis (y axis). The height of each bar signifies the
frequency of its class interval, and the bars touch each other. The illustration below is an
example of a histogram.
[Figure: example histogram. Horizontal axis (x axis): class mark (7, 12, 17, 22, 27, 32). Vertical axis (y axis): frequency (0 to 10).]
Frequency Polygon. It is a graph that uses points to represent the data, with the points
connected by line segments.
[Figure: example frequency polygon. Horizontal axis (x axis): class mark (7, 12, 17, 22, 27, 32). Vertical axis (y axis): frequency (0 to 10).]
Pie Graph or Pie Chart. It is a circle apportioned into parts that illustrate the
relative frequencies of the data belonging to the different areas. The data in the pie chart
are expressed in percentage form, and the total of all the data is equal to 100%. The figure
below is an example of a pie graph.
[Figure: example pie graph with slices of 15%, 30%, 10%, 20%, and 25%.]
Bar Graph. It is similar to a histogram. The difference is that in a histogram the
rectangles that represent the frequencies touch each other, while in a bar graph the
rectangles do not touch each other.
[Figure: example bar graph. Horizontal axis (x axis): class mark (7, 12, 17, 22, 27, 32). Vertical axis (y axis): frequency (0 to 10).]
12 5 20 30 25 15 10 27 15 35 10 42 28 11 13
15 34 18 20 23 18 40 35 17 19 20 16 43 22 33
Solution
Range = Highest Value – Lowest Value = 43 – 5 = 38
Number of classes: $2^k = 2^5 = 32$; 32 is greater than the total number of observations,
which is 30, therefore the number of classes is 5.
$$\text{Suggested class interval (class size)} = \frac{R}{k} = \frac{38}{5} = 7.6 \approx 8$$
Frequency Distribution Table
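A minimal Python sketch of how such a table can be built from the 30 observations above. It assumes integer data and assumes that the first class starts at the lowest value (5) with a class size of 8; the function name and the dictionary keys are illustrative only.

```python
data = [12, 5, 20, 30, 25, 15, 10, 27, 15, 35, 10, 42, 28, 11, 13,
        15, 34, 18, 20, 23, 18, 40, 35, 17, 19, 20, 16, 43, 22, 33]

def frequency_table(values, size):
    """Build one row per class; the first class starts at the lowest value (assumption)."""
    rows, cumulative = [], 0
    lower = min(values)
    while lower <= max(values):
        upper = lower + size - 1                      # upper class limit
        f = sum(lower <= v <= upper for v in values)  # class frequency
        cumulative += f
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "mark": (lower + upper) / 2,
            "f": f,
            "cf": cumulative,
            "rf": round(100 * f / len(values), 1),    # relative frequency in %
        })
        lower += size
    return rows

table = frequency_table(data, 8)
for row in table:
    print(row)
```

Under these assumptions the class marks come out as 8.5, 16.5, 24.5, 32.5, and 40.5, with frequencies 5, 12, 5, 5, and 3.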
Histogram
[Figure: histogram of the 30 observations. Horizontal axis: class marks 8.5, 16.5, 24.5, 32.5, 40.5. Vertical axis: frequency (0 to 12).]
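Frequency Polygon
[Figure: frequency polygon of the 30 observations. Horizontal axis: class marks 8.5, 16.5, 24.5, 32.5, 40.5. Vertical axis: frequency (0 to 12); the plotted points lie at frequencies 12, 5, and 3.]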
[Figure: pie graph of the relative frequencies: 16.7%, 40%, 16.7%, 16.6%, and 10%.]
4.2 Measures of Central Tendency
Measures of central tendency are designed to quantify the typical or average value in a
set of data or observations. The concept is essential in daily life, because we routinely
use these measures: calculating the average of a group of data, determining the middle
value in a set of data, and identifying the value that occurs most frequently. For
instance, you may want to find the average kilowatts consumed in one year, determine what
should be placed in the middle of your art, or find the most frequently used shampoo. The
three measures of central tendency are the mean, the median, and the mode.
4.2.1 Mean. It refers to the average and is the most widely used measure of central
tendency. It is obtained by dividing the sum of all the scores in a set of observations by
the total number of observations. When a set of data is composed of 30 or fewer
observations ($n \le 30$), we call it ungrouped data. For ungrouped data we deal with the
sample mean, denoted by the symbol $\bar{X}$, called “x bar”. When a set of data is composed
of more than 30 observations ($n > 30$), we call it grouped data, and we usually deal with
the population mean, denoted by the symbol $\mu$, called “mu”.
Ungrouped Data
When the number of observations is less than or equal to 30, the data are ungrouped.
To find the sample mean for ungrouped data, we use the formula
$$\bar{X} = \frac{\sum x_i}{n},$$
where $\sum$ is the symbol for summation and $n$ is the number of observations in the
set of data.
Example: Find the sample mean of the following scores of eight students in a quiz
composed of 25 items: 22, 23, 8, 10, 15, 9, 13, 25.
$$\bar{X} = \frac{\sum x_i}{n} = \frac{x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8}{n} = \frac{22+23+8+10+15+9+13+25}{8} = \frac{125}{8} = 15.625 \approx 15.63$$
To find the population mean for ungrouped data, we use the formula
$$\mu = \frac{\sum x_i}{N},$$
where $N$ is the number of observations in the population.
Example: Find the population mean of the weights, in pounds, of students enrolled in a
religious institution: 110, 95, 100, 98, 120, 130, 140, 145, 115, 148, 160.
$$\mu = \frac{\sum x_i}{N} = \frac{x_1 + x_2 + \dots + x_{11}}{N} = \frac{110+95+100+98+120+130+140+145+115+148+160}{11} = \frac{1361}{11} = 123.73$$
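The two ungrouped examples above can be checked with a short Python sketch. The helper name `mean` is illustrative; the same formula serves for the sample and the population mean, since only the symbol differs.

```python
def mean(values):
    """Sum of the observations divided by how many there are."""
    return sum(values) / len(values)

quiz_scores = [22, 23, 8, 10, 15, 9, 13, 25]
weights = [110, 95, 100, 98, 120, 130, 140, 145, 115, 148, 160]

print(sum(quiz_scores), mean(quiz_scores))   # 125 15.625
print(sum(weights), round(mean(weights), 2)) # 1361 123.73
```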
Grouped Data
When the number of observations is more than 30, the data are called grouped data.
To find the mean of grouped data we use the formula
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i M_i}{n},$$
where $k$ is the number of class intervals,
$f_i$ is the frequency of each class interval,
$M_i$ is the class mark of each class interval, and
$n$ is the sum of the frequencies of all class intervals.
Example:
The following data are the scores of 45 students in a long quiz in statistics and
probability.
12 5 20 30 25 15 10 27 15 35 10 42 28 11 13
15 34 18 20 23 18 40 35 17 19 20 16 45 22 33
25 18 19 26 40 33 18 15 33 26 41 17 19 23 22
Solution:
Range = Highest Value – Lowest Value = 45 – 5 = 40
Number of classes: $2^k = 2^6 = 64$; 64 is greater than the total number of observations,
which is 45, therefore the number of classes is 6.
$$\text{Suggested class interval} = \frac{R}{k} = \frac{40}{6} = 6.67 \approx 7$$
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i M_i}{n} = \frac{1{,}067}{45} = 23.71$$
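A minimal Python sketch of the grouped-mean computation for the 45 scores. It assumes the first class starts at the lowest score (5) and that the class size is 7, which gives the classes 5-11, 12-18, 19-25, 26-32, 33-39, and 40-46; the variable names are illustrative.

```python
scores = [12, 5, 20, 30, 25, 15, 10, 27, 15, 35, 10, 42, 28, 11, 13,
          15, 34, 18, 20, 23, 18, 40, 35, 17, 19, 20, 16, 45, 22, 33,
          25, 18, 19, 26, 40, 33, 18, 15, 33, 26, 41, 17, 19, 23, 22]

size = 7
lowest = min(scores)

# Tally the frequency of each class, keyed by its class mark.
freq_by_mark = {}
for s in scores:
    lower = lowest + ((s - lowest) // size) * size  # lower limit of the class of s
    mark = lower + (size - 1) / 2                   # class mark (midpoint)
    freq_by_mark[mark] = freq_by_mark.get(mark, 0) + 1

total = sum(f * m for m, f in freq_by_mark.items())  # sum of f_i * M_i
grouped_mean = total / len(scores)

print(sorted(freq_by_mark.items()))
print(total, round(grouped_mean, 2))  # 1067.0 23.71
```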
4.2.2 Median. It is the middle point in an array of data, or the middle value in a set of
data. The median is usually denoted by the symbol “Mdn” for both grouped and ungrouped
data. For ungrouped data, the median is located with the formula $\text{Mdn} = \frac{n+1}{2}$,
which gives the position of the median in the ordered data.
Ungrouped Data
The following are the steps in finding the median of ungrouped data.
1). Arrange the data in ascending order.
2). Compute $\frac{n+1}{2}$. If the result is a decimal number like 4.5, take the sum of the
4th and the 5th values divided by 2; the result is the median, as in the following example.
Example # 1
Determine the median of the following data: 15, 25, 18, 36, 29, 30, 12, 10.
Step 1. Arrange the data in ascending order: 10, 12, 15, 18, 25, 29, 30, 36.
Step 2. $\text{Mdn} = \frac{n+1}{2} = \frac{8+1}{2} = \frac{9}{2} = 4.5$
Taking the sum of the 4th and 5th values divided by 2 gives $\frac{18+25}{2} = \frac{43}{2} = 21.5$.
Example # 2.
If the formula $\text{Mdn} = \frac{n+1}{2}$ gives a whole number like 5, take the 5th value;
that value is the median. For example:
Step 1. Given the following data: 13, 10, 7, 22, 18, 26, 33, 20, 27.
Step 2. Arranging the data in ascending order gives 7, 10, 13, 18, 20, 22, 26, 27, 33.
Then $\text{Mdn} = \frac{n+1}{2} = \frac{9+1}{2} = \frac{10}{2} = 5$; the 5th value is 20, and that is the median.
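The two cases above (a fractional position and a whole-number position) can be sketched in Python as follows. The function name `median` is illustrative, and positions are counted from 1 as in the text.

```python
def median(values):
    """Median of ungrouped data using the (n + 1) / 2 position rule."""
    ordered = sorted(values)
    position = (len(ordered) + 1) / 2        # 1-based position of the median
    if position.is_integer():
        return ordered[int(position) - 1]    # whole number: take that value
    lower = ordered[int(position) - 1]       # e.g. the 4th value for 4.5
    upper = ordered[int(position)]           # e.g. the 5th value for 4.5
    return (lower + upper) / 2

print(median([15, 25, 18, 36, 29, 30, 12, 10]))     # 21.5
print(median([13, 10, 7, 22, 18, 26, 33, 20, 27]))  # 20
```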
Grouped Data
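The formula for finding the median for grouped data is
$$\text{Mdn} = LCB_{mdn} + \left(\frac{\frac{n}{2} - CF_{mdn-1}}{f_{mdn}}\right)c,$$
where
$LCB_{mdn}$ is the lower class boundary of the median class,
$n$ is the total number of observations,
$CF_{mdn-1}$ is the cumulative frequency of the class before the median class,
$f_{mdn}$ is the frequency of the median class, and
$c$ is the class size.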
12 5 20 30 25 15 10 27 15 35 10 42 28 11 13
15 34 18 20 23 18 40 35 17 19 20 16 45 22 33
25 18 19 26 40 33 18 15 33 26 41 17 19 23 22
Solution
Range = Highest Value – Lowest Value = 45 – 5 = 40
Number of classes: $2^k = 2^6 = 64$; 64 is greater than the total number of observations,
which is 45, therefore the number of classes is 6.
$$\text{Suggested class interval} = \frac{R}{k} = \frac{40}{6} = 6.67 \approx 7$$
$$\text{Mdn} = LCB_{mdn} + \left(\frac{\frac{n}{2} - CF_{mdn-1}}{f_{mdn}}\right)c = 18.5 + \left(\frac{22.5 - 17}{12}\right)7 = 18.5 + 3.21 = 21.71$$
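A minimal Python sketch of the grouped-median formula. The lower class limits and frequencies below are an assumption: they were tallied from the 45 scores using a class size of 7 starting at 5. The function name is illustrative.

```python
lower_limits = [5, 12, 19, 26, 33, 40]   # assumed classes 5-11, 12-18, ...
frequencies = [4, 13, 12, 5, 6, 5]       # tallied from the 45 scores
size = 7

def grouped_median(lower_limits, frequencies, size):
    """LCB + ((n/2 - CF before the median class) / f of the median class) * c."""
    n = sum(frequencies)
    cumulative = 0                        # cumulative frequency before the current class
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= n / 2:       # first class whose CF reaches n/2
            lcb = lower - 0.5             # lower class boundary
            return lcb + (n / 2 - cumulative) / f * size
        cumulative += f

print(round(grouped_median(lower_limits, frequencies, size), 2))  # 21.71
```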
4.2.3 Mode. It is simply the value that occurs most frequently in a set of data. When no
value is repeated, there is no mode. If one value occurs most often, the data set is
unimodal; when two values tie for most often, it is bimodal; when three values tie, it is
trimodal; and when more than three values tie, it is polymodal or multimodal. This is true
for ungrouped data.
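For grouped data, the mode is computed with the formula
$$\text{Mo} = LCB_{mo} + \left(\frac{f_{mo} - f_1}{2 f_{mo} - f_1 - f_2}\right)c,$$
where $LCB_{mo}$ is the lower class boundary of the modal class, $f_{mo}$ is the frequency of
the modal class, $f_1$ is the frequency of the class before the modal class, $f_2$ is the
frequency of the class after the modal class, and $c$ is the class size.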
Ungrouped Data
Example # 1. Given the following data: 23, 25, 30, 40, 45. There is no repeated value,
so there is no mode.
Example # 2. Given the following data: 23, 25, 30, 40, 45, 40, 33, 25, 40. The value that
occurs most often is 40, so the data set is unimodal.
Example # 3. Given the following data: 23, 25, 30, 40, 45, 40, 33, 25, 23, 30. The most
repeated values are 23, 25, 30, and 40, each occurring twice, so the data set is polymodal
(multimodal).
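A short Python sketch of finding the mode or modes of ungrouped data with `collections.Counter` from the standard library. The function name `modes` is illustrative, and returning an empty list when no value repeats is a convention chosen here to match the "no mode" case.

```python
from collections import Counter

def modes(values):
    """Values that occur most often; an empty list when nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:                 # no repeated value, so no mode
        return []
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([23, 25, 30, 40, 45]))                  # []
print(modes([23, 25, 30, 40, 45, 40, 33, 25, 40]))  # [40]
```

The length of the returned list tells whether the data set is unimodal, bimodal, trimodal, or multimodal.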
Grouped Data
Example:
The following data are the scores of 45 students in a long quiz in statistics and
probability.
12 5 20 30 25 15 10 27 15 35 10 42 28 11 13
15 34 18 20 23 18 40 35 17 19 20 16 45 22 33
25 18 19 26 40 33 18 15 33 26 41 17 19 23 22
Solution
Range = Highest Value – Lowest Value = 45 – 5 = 40
Number of classes: $2^k = 2^6 = 64$; 64 is greater than the total number of observations,
which is 45, therefore the number of classes is 6.
$$\text{Suggested class interval} = \frac{R}{k} = \frac{40}{6} = 6.67 \approx 7$$
$$\text{Mo} = LCB_{mo} + \left(\frac{f_{mo} - f_1}{2 f_{mo} - f_1 - f_2}\right)c = 11.5 + \left(\frac{13 - 4}{2(13) - 4 - 12}\right)7 = 11.5 + 6.3 = 17.8$$
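A minimal Python sketch of the grouped-mode formula. The class limits and frequencies are an assumption, tallied from the 45 scores with a class size of 7 starting at 5; a missing neighbouring class is treated as having frequency 0, which is also an assumption of this sketch.

```python
lower_limits = [5, 12, 19, 26, 33, 40]   # assumed classes 5-11, 12-18, ...
frequencies = [4, 13, 12, 5, 6, 5]       # tallied from the 45 scores
size = 7

def grouped_mode(lower_limits, frequencies, size):
    """LCB + ((f_mo - f1) / (2 f_mo - f1 - f2)) * c for the modal class."""
    i = frequencies.index(max(frequencies))           # modal class (first if tied)
    f_mo = frequencies[i]
    f1 = frequencies[i - 1] if i > 0 else 0           # class before the modal class
    f2 = frequencies[i + 1] if i + 1 < len(frequencies) else 0  # class after it
    lcb = lower_limits[i] - 0.5                       # lower class boundary
    return lcb + (f_mo - f1) / (2 * f_mo - f1 - f2) * size

print(round(grouped_mode(lower_limits, frequencies, size), 2))  # 17.8
```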
4.3.1 Range. It is the difference between the highest value and the lowest value in a
set of data. It is the simplest measure of dispersion to determine, because there are only
two values to consider: the highest and the lowest.
Ungrouped Data
Example;
The following data are the numbers of dengue patients confined in a hospital from
January to December 2018.
Jan Feb March April May June July Aug Sep Oct Nov Dec
15 18 20 14 22 34 40 45 48 42 26 21
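As a quick check, the range of the monthly counts above can be computed with a few lines of Python. The dictionary name is illustrative.

```python
patients = {"Jan": 15, "Feb": 18, "March": 20, "April": 14, "May": 22, "June": 34,
            "July": 40, "Aug": 45, "Sep": 48, "Oct": 42, "Nov": 26, "Dec": 21}

highest = max(patients.values())
lowest = min(patients.values())
data_range = highest - lowest    # highest value minus lowest value

print(highest, lowest, data_range)  # 48 14 34
```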
Grouped Data
Example:
The following data are the scores of 45 students in a long quiz in statistics and
probability.
12 5 20 30 25 15 10 27 15 35 10 42 28 11 13
15 34 18 20 23 18 40 35 17 19 20 16 45 22 33
25 18 19 26 40 33 18 15 33 26 41 17 19 23 22
Solution
Range = Highest Value – Lowest Value = 45 – 5 = 40
Number of classes: $2^k = 2^6 = 64$; 64 is greater than the total number of observations,
which is 45, therefore the number of classes is 6.
$$\text{Suggested class interval} = \frac{R}{k} = \frac{40}{6} = 6.67 \approx 7$$
Standard Deviation
Standard deviation is the most widely used measure of dispersion. All data are
included in the computation. It is obtained by taking the square root of the variance.
Formula for finding the sample standard deviation:
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n-1}}$$
Formula for finding the population standard deviation:
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
The following data are the salaries, in pesos, of 10 employees of Maharlika Broadcasting
Company.
20,000 18,000 30,000 25,000 22,000 40,000 33,000 35,000 28,000 27,000
The following are the steps in finding the sample variance and the sample standard
deviation for ungrouped data.
Step 1. Find the mean: $\bar{x} = \frac{278{,}000}{10} = 27{,}800$.
Step 2. Get the total of the squared differences between each data value and the mean:
$\sum (x - \bar{x})^2 = 431{,}600{,}000$.
Step 3. Divide that total by the total number of observations minus 1 to get the sample
variance, then take the square root to get the sample standard deviation.
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n-1}} = \sqrt{\frac{431{,}600{,}000}{9}} = \sqrt{47{,}955{,}555.56} \approx 6{,}925.00$$
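The three steps can be sketched in Python for the ten salaries. The variable names are illustrative; `statistics.stdev` from the standard library is used only as a cross-check of the hand-written formula.

```python
import math
import statistics

salaries = [20000, 18000, 30000, 25000, 22000, 40000, 33000, 35000, 28000, 27000]

n = len(salaries)
mean = sum(salaries) / n                              # Step 1: the mean
sum_sq = sum((x - mean) ** 2 for x in salaries)       # Step 2: total squared deviation
variance = sum_sq / (n - 1)                           # Step 3: sample variance
std_dev = math.sqrt(variance)                         # sample standard deviation

print(mean, sum_sq)        # 27800.0 431600000.0
print(variance, std_dev)

# Cross-check against the standard library's sample standard deviation.
assert math.isclose(std_dev, statistics.stdev(salaries))
```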
Example:
The following data are the scores of 45 students in a long quiz in statistics and
probability.
12 5 20 30 25 15 10 27 15 35 10 42 28 11 13
15 34 18 20 23 18 40 35 17 19 20 16 45 22 33
25 18 19 26 40 33 18 15 33 26 41 17 19 23 22
Solution
Range = Highest Value – Lowest Value = 45 – 5 = 40
Number of classes: $2^k = 2^6 = 64$; 64 is greater than the total number of observations,
which is 45, therefore the number of classes is 6.
$$\text{Suggested class interval} = \frac{R}{k} = \frac{40}{6} = 6.67 \approx 7$$
Step 1. Find the mean.
$$\mu = \frac{\sum_{i=1}^{k} f_i M_i}{N} = \frac{1{,}067}{45} = 23.71$$
Step 2. Get the squared difference between each class mark and the mean, multiply it by
the frequency of the class, and take the total.
$f_i (M_i - \mu)^2$
987.20
986.18
35.04
139.90
906.24
1,860.50
$\sum f_i (M_i - \mu)^2 = 4{,}915.06$
Step 3. Divide that total by the total frequency to get the population variance.
$$\sigma^2 = \frac{\sum f_i (M_i - \mu)^2}{N} = \frac{4{,}915.06}{45} = 109.22$$
Step 4. Take the square root of the population variance to get the population standard
deviation.
$$\sigma = \sqrt{\frac{\sum f_i (M_i - \mu)^2}{N}} = \sqrt{\frac{4{,}915.06}{45}} = \sqrt{109.22} = 10.45$$
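A minimal Python sketch of the grouped computation. The class marks and frequencies are an assumption, tallied from the 45 scores with a class size of 7 starting at 5. Because the sketch keeps the mean unrounded, intermediate values can differ from the hand computation in the last decimal place.

```python
import math

class_marks = [8, 15, 22, 29, 36, 43]   # midpoints of the assumed classes 5-11, 12-18, ...
frequencies = [4, 13, 12, 5, 6, 5]      # tallied from the 45 scores

N = sum(frequencies)
mu = sum(f * m for f, m in zip(frequencies, class_marks)) / N        # grouped mean
sum_sq = sum(f * (m - mu) ** 2 for f, m in zip(frequencies, class_marks))
variance = sum_sq / N                    # population variance
sigma = math.sqrt(variance)              # population standard deviation

print(N, round(mu, 2))      # 45 23.71
print(round(sigma, 2))      # 10.45
```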
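4.4.1 Quartiles. They divide the given set of data into 4 parts: first quartile, second
quartile, third quartile, and fourth quartile. Q1 is equivalent to 25%, Q2 to 50%, Q3 to
75%, and Q4 to 100%. For ungrouped data, a quartile is located with the formula
$$Q_k = \frac{k(n+1)}{4},$$
where
$k$ = the quartile number
$n$ = number of observations
$Q_k$ = quartile location
and the formula for grouped data is
$$Q_k = LCB_{Q_k} + \left(\frac{\frac{nk}{4} - CF_{Q_k - 1}}{f_{Q_k}}\right)c,$$
where $LCB_{Q_k}$ is the lower class boundary of the quartile class, $CF_{Q_k - 1}$ is the
cumulative frequency of the class before the quartile class, $f_{Q_k}$ is the frequency of
the quartile class, and $c$ is the class size.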
4.4.2 Deciles. They divide the given set of data into 10 parts, first decile up to tenth
decile. A decile is located with the formula $D_k = \frac{k(n+1)}{10}$.
4.4.3 Percentiles. They divide the given set of data into 100 parts, first percentile up to
one hundredth percentile. A percentile is located with the formula $P_k = \frac{k(n+1)}{100}$.
Ungrouped data
The following are the numbers of out-of-school youth in 8 barangays in one of the towns
in Pangasinan.
$$D_7 = \frac{k(n+1)}{10} = \frac{7(8+1)}{10} = \frac{63}{10} = 6.3$$
Rounded off to the nearest whole number, this is the 6th value in the ordered data. The 7th
decile is therefore the 6th value, which is 20.
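The three location formulas differ only in the divisor (4, 10, or 100), so they can be sketched as one Python helper. The function name and the `parts` parameter are illustrative.

```python
def quantile_position(k, n, parts):
    """Location k(n + 1) / parts of the k-th quartile, decile, or percentile."""
    return k * (n + 1) / parts

position = quantile_position(7, 8, 10)   # 7th decile of 8 observations
print(position)         # 6.3
print(round(position))  # 6, so the 6th value of the ordered data is used
```

Note that Python's `round` sends exact halves such as 4.5 to the nearest even integer, so a position ending in .5 needs an explicit rule.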
Grouped data
Example:
The following data are the scores of 45 students in a long quiz in statistics and
probability.
Compute for
a). 3rd quartile
b). 7th decile
c). 50th percentile.
Step 3.
$$Q_k = LCB_{Q_k} + \left(\frac{\frac{nk}{4} - CF_{Q_k - 1}}{f_{Q_k}}\right)c = 25.5 + \left(\frac{34 - 29}{5}\right)7 = 25.5 + 7 = 32.5$$
Step 2. The cumulative frequency, which is 32, belongs to the 4th class interval; the lower
class boundary is 25.5.
Step 3.
$$D_k = LCB_{D_k} + \left(\frac{\frac{nk}{10} - CF_{D_k - 1}}{f_{D_k}}\right)c = 25.5 + \left(\frac{32 - 29}{5}\right)7 = 25.5 + 4.2 = 29.7$$
Step 2. The cumulative frequency, which is 22, belongs to the 3rd class interval; the lower
class boundary is 18.5.
Step 3.
$$P_k = LCB_{P_k} + \left(\frac{\frac{nk}{100} - CF_{P_k - 1}}{f_{P_k}}\right)c = 18.5 + \left(\frac{22 - 17}{12}\right)7 = 18.5 + 2.92 = 21.42$$
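The grouped quartile, decile, and percentile formulas share one shape, so they can be sketched as a single Python function that takes the target cumulative position. The class limits and frequencies are an assumption, tallied from the 45 scores with a class size of 7 starting at 5, and the targets 34, 32, and 22 are the rounded positions used in the computations above.

```python
lower_limits = [5, 12, 19, 26, 33, 40]   # assumed classes 5-11, 12-18, ...
frequencies = [4, 13, 12, 5, 6, 5]       # tallied from the 45 scores
size = 7

def grouped_quantile(lower_limits, frequencies, size, target):
    """LCB + ((target - CF before the class) / f of the class) * c."""
    cumulative = 0                         # cumulative frequency before the current class
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= target:       # first class whose CF reaches the target
            return (lower - 0.5) + (target - cumulative) / f * size
        cumulative += f
    raise ValueError("target exceeds the total frequency")

print(round(grouped_quantile(lower_limits, frequencies, size, 34), 2))  # Q3: 32.5
print(round(grouped_quantile(lower_limits, frequencies, size, 32), 2))  # D7: 29.7
print(round(grouped_quantile(lower_limits, frequencies, size, 22), 2))  # P50: 21.42
```

Passing the unrounded positions (for example 33.75 for the 3rd quartile) gives slightly different results than the rounded targets.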
Name: ____________________________ Date: _________ Score : ______
Activity # 4.1
The following data are the scores of 40 BS Biology students who took the NMAT given in
2019. Construct a frequency distribution table.
78 55 65 40 28 33 55 40 90 87
80 74 60 56 25 18 93 83 65 54
87 43 62 71 92 88 79 68 55 45
43 27 30 79 48 54 67 76 77 82
Name: ____________________________ Date: _________ Score : ______
Activity # 4.2
The following data are the scores of 40 BS Biology students who took the NMAT given in
2019. Construct a histogram, a frequency polygon, and a pie graph.
78 55 65 40 28 33 55 40 90 87
80 74 60 56 25 18 93 83 65 54
87 43 62 71 92 88 79 68 55 45
43 27 30 79 48 54 67 76 77 82
Name: ____________________________ Date: _________ Score : ______
Activity # 4.3
Multiple Choice: Encircle the letter of the correct answer to each problem.
1. Find the mean given the following data, 25, 33, 47, 55, 65, 74, and 78.
a). 53.85 b). 53.86 c). 53.87 d). 53.88 e). 53.89
2). Determine the mode given the following data, 22, 33, 55, 16, 18, 19, 22, and 17
a). 33 b). 18 c). 22 d). 55 e). 19
3). Given the following data, 36, 44, 18, 10, 25, 30, 46, find the median.
a). 36 b). 25 c). 30 d). 44 e). 18
4. Find the mean given the following data, 0, 42, 25, 96, 74, 22, 0, 10.
a). 33.62 b). 33.63 c). 44.83 d). 44.84 e). 33.64
5. The following are the scores of 5 basketball players in two games, 5, 0, 10, 8, 7, 12, 20,
18, 15, 7. Find the median.
a). 8 b). 10 c). 9 d). 12 e). 7.5
7. The following are the heights in inches of 12 PMA students: 66, 70, 65, 68, 66, 70, 65,
67, 69, 70, 68, 66. Find the mode.
a). 65 b). 66 c). 70 d). 65,66 e). 66,70
8.The following are the salaries of 8 employees in a commercial bank, P15,000, P25,100,
P30,200, P18,400 P30,300, P32,700. P22,500, P19,500. Determine the mean.
a). P24,212.50 b). P24,212 c). P24,213 d). P24,000 e). P24,200
9. Given the following data, 27.5, 32.5, 27.25, 25.5, 27.55, 18.55, 17.25. Find the median.
a). 27.5 b). 27.55 c). 18.55 d). 27.25 e). 25.5
10. The following are the number of students in 6 sections who participated in the
seminar, 5, 8, 7, 4, 6, 3. Find the mean
a). 5 b). 6 c). 5.5 d). 7 e). 4
Name: ____________________________ Date: _________ Score : ______
Activity # 4.4
The following data are the scores of 70 students who took the college entrance test given
by Adelphi University.
Activity # 4.5
For the first semester of school year 2019 – 2020, the following data are the numbers of
enrollees in Laureano University. Compute the range, variance, and standard
deviation.
Activity # 4.6
The following data are the commissions of 50 real estate brokers of Filart Real Estate
Corporation for the month of October 2021.
Activity # 4.7
For the twelve months of the year 2019, the following data are the numbers of tourists
who visited Subic Bay. Find P65, D8, and Q3.
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
95 105 55 65 70 60 30 20 25 58 85 63
Name: ____________________________ Date: _________ Score : ______
Activity # 4.8
The following data are the salaries of 40 faculty members of a private educational
institution in Pangasinan. Find P50, D9, and Q2.