GE 4-MMW-Week 4-5
Metalanguage
In this section, the essential terms relevant to the study of data management
and to the demonstration of ULO-a are operationally defined to establish a common
frame of reference for how the texts work. You will encounter these terms as we go
through the study of data management. Please refer to these definitions if you
have difficulty understanding some concepts.
1. Statistics provides the tools through which data are collected,
analyzed, and presented to arrive at rich and interesting information.
These tools, which are derived from mathematics, are useful in processing
and managing numerical data to describe a phenomenon and predict
values.
College of Arts and Sciences Education
General Education - Mathematics
2nd Floor, DPT Building, Matina Campus, Davao City
Phone No.: (082)300-5456/305-0647 Local 134
Essential Knowledge
To achieve the unit learning outcomes (the big picture) for the fourth and
fifth weeks of the course, you need to fully understand the essential
knowledge laid out in the succeeding pages. Please note that you are
not limited to these resources. You are expected to use
other books, research articles, and other resources that are available in the
university’s library, e.g., ebrary, search.proquest.com, etc.
1. Data are the values of a variable collected from each of the subjects
that belong to the sample. The term refers to a collection of descriptors of
natural phenomena, such as results from experiences, observations, or experiments, or
a set of premises. Data may consist of numbers, words, or images. A collection
of data values forms a data set. Each value in the data set is called a data
value or a datum.
Data can be classified according to the type of variable from which they were
drawn. There are two general types of data according to how the data vary
across cases:
1.1 Quantitative data – these are data that are usually expressed in
numerical values or obtained by counting or measuring. They can be classified as
discrete data and continuous data.
Discrete data are count data, or data obtained from counting.
Examples are the number of children in a family, the number of bicycles sold,
the number of sentences in a paragraph, and the number of crimes recorded in a
police station.
Continuous data are also called measurement data because they are
obtained through direct or indirect measuring. Examples are the blood pressure
of a person, total land area, weight of an object, and scores in an intelligence
test. Note that not all data that are numeric by nature are quantitative. Some numbers
are mere labels or names, for example, ID numbers and SSS numbers. These are
numeric but are considered qualitative data.
1.2 Qualitative data – also called categorical data or classificatory data.
These are not expressed in numerical values but rather are classified
according to the kind or characteristic by which they differ. These data are merely
labeled and classified into categories for statistical analysis. Examples are
gender, nationality, religious affiliation, occupation, and program.
2.2 Ordinal level – is higher than the nominal level. The numbers are not
only used to classify items but also reflect some rank or order of the individuals,
items or objects. It indicates that objects in one category are not only different
from those in the other categories of the variable but they may also be ranked
as either higher or lower, bigger or smaller, better or worse than those in the
other categories. Examples are ranks given to the winners in a singing contest,
hotel classifications, and military ranks.
2.3 Interval level – is the next to the highest level of data measurement. The
measurements have all the properties of ordinal data, but in addition the
distances between consecutive numbers have meaning. The measurement
units are equal to allow us to determine how far apart the two persons or things
are.
In addition, the zero point value on this level is arbitrary. That is, zero is
just another point on the scale and does not mean the absence of the
phenomenon. Examples are temperature reading in Celsius scale, scores in
intelligence tests, and scholastic grade of a student.
2.4 Ratio level – is the highest level of data measurement. It has the same
properties as interval level but the zero-point value of this level is absolute; that
is, the zero value represents the absence of the characteristic being studied.
Examples are height, weight, time, and volume.
Nominal data are the most limited data in terms of the types of statistical
analysis that can be used with them. Ordinal data allow the researcher to
perform any analysis that can be done with nominal data and some additional
analyses. With ratio data, a statistician can make ratio comparisons and
appropriately do any analysis that can be performed on nominal, ordinal, or
interval data. Some statistical techniques require ratio data and cannot be
used to analyze other levels of data.
3.3 Experiment method is a more expensive but better way to produce data.
It is used to gather data when the objective of the investigator is to determine
the cause-and-effect relationship of certain phenomena or variables under
controlled conditions.
3.4 Registration method is also called secondary data. In this method, the
respondents give information in compliance with or as enforced by certain
laws, policies, rules, regulations, decrees, or standard practices. The data are
kept systematized and made available to all because of the requirements of
the law.
5. Measures of Center
One type of measure used to describe a set of data is the
measure of central tendency, which yields information about the center, or
majority, of a group of numbers. It is a single value that stands for or represents
a group of values in the data set. The most common measures are the mean,
median, and mode.
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
where $x_i$ = individual value
n = total number of values
Example.
The following are the scores in a quiz by ten students in Algebra. Find the
mean score of the data set.
5 12 20 16 15 23 10 18 7 11
Solution.
From the given data set, n = 10.
Solve for the mean.
$$\bar{x} = \frac{5 + 12 + 20 + 16 + 15 + 23 + 10 + 18 + 7 + 11}{10} = \frac{137}{10}$$
$$\bar{x} = 13.7$$
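The computation above can be checked with a short Python sketch. The function name `mean` is only an illustrative choice; the scores are the ten quiz scores from the example.

```python
# Quiz scores from the example above.
scores = [5, 12, 20, 16, 15, 23, 10, 18, 7, 11]


def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


print(mean(scores))  # 13.7
```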
Weighted Mean
Sometimes, in the computation of the mean of a data set, each value in
the data set is associated with a certain weight or degree of importance. In
such cases, the weighted mean is computed.
The weighted mean of a set of values is computed by multiplying
each value by its corresponding weight, taking the sum of the products,
and then dividing by the sum of the weights. Mathematically, it is written as
$$\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$
Example.
The final grades of a student in six courses were taken and are shown below.
Compute the student’s weighted mean grade.
Solution.
Solve for the weighted grade of each course.
Course         No. of Units (w)    Final Grade (x)    wx
Math 112       3                   2.5                7.50
English 101    6                   2.0                12.00
PS 25          3                   1.5                4.50
Fil 1          3                   1.4                4.20
Chem 1         5                   2.4                12.00
PE 1           2                   1.1                2.20
Total          Σw = 22                                Σ(wx) = 42.40
$$\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} = \frac{42.40}{22}$$
$$\bar{x}_w = 1.93$$
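As a rough check of the weighted mean, the following Python sketch repeats the same steps: multiply each grade by its units, sum the products, and divide by the total units. The names `units`, `grades`, and `weighted_mean` are illustrative only.

```python
# Units (weights) and final grades from the table above, in the same course order.
units = [3, 6, 3, 3, 5, 2]
grades = [2.5, 2.0, 1.5, 1.4, 2.4, 1.1]


def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    weighted_sum = sum(w * x for w, x in zip(weights, values))
    return weighted_sum / sum(weights)


print(round(weighted_mean(grades, units), 2))  # 1.93
```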
5.2 Median (denoted by $\tilde{x}$) is the middlemost value in the data set when the
values are arranged in order. It divides the given distribution into two equal parts.
Example.
Find the median of the following set of measurements.
25 41 56 34 28 67 49 37 52
Solution.
Arrange the data in ascending order.
25 28 34 37 41 49 52 56 67
The middle (fifth) value is 41, so the median is 41.
Example.
Find the median of the given data set.
4.5 2.8 5.6 9.2 3.5 6.7 3.9 8.4
Solution.
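Arrange the data in ascending order.
2.8 3.5 3.9 4.5 5.6 6.7 8.4 9.2
Since the number of values is even, the median is the mean of the two middle
values: (4.5 + 5.6) / 2 = 5.05.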
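The median procedure (sort, then take the middle value, or the mean of the two middle values when the count is even) can be sketched in Python as follows. The function name `median` and the variable names are illustrative; the data are the two sets from the median examples.

```python
# Data sets from the two median examples.
measurements = [25, 41, 56, 34, 28, 67, 49, 37, 52]
data_set = [4.5, 2.8, 5.6, 9.2, 3.5, 6.7, 3.9, 8.4]


def median(values):
    """Middle value of the sorted data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median(measurements))        # 41
print(round(median(data_set), 2))  # 5.05
```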
Example.
Find the mode of the following data set.
a. 12 15 13 12 14 17 16 12 13 19
b. 3.4 2.2 3.5 3.4 2.2 2.6 2.1 3.9 2.2 3.4
c. 105 200 159 110 225 170 115 250 285 190
Solution.
a. In the first data set, 12 has the highest frequency in the distribution;
therefore, the mode is
$$\hat{x} = 12$$
b. In the second data set, two values share the highest frequency; therefore,
there are two modes and the distribution is called bimodal. The modes
are
$$\hat{x}_1 = 3.4 \text{ and } \hat{x}_2 = 2.2$$
c. In the third data set, no value occurs more often than the others; therefore,
there is NO mode in the distribution.
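A minimal Python sketch of the mode follows. It assumes the convention used in this example: when every value occurs exactly once, the data set has no mode, which the sketch reports as an empty list. The function name `modes` is illustrative.

```python
from collections import Counter

# The three data sets from the example above.
set_a = [12, 15, 13, 12, 14, 17, 16, 12, 13, 19]
set_b = [3.4, 2.2, 3.5, 3.4, 2.2, 2.6, 2.1, 3.9, 2.2, 3.4]
set_c = [105, 200, 159, 110, 225, 170, 115, 250, 285, 190]


def modes(values):
    """Return every value that has the highest frequency, or [] if all occur once."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:  # assumption: all values unique means "no mode"
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes(set_a))  # [12]
print(modes(set_b))  # [2.2, 3.4]
print(modes(set_c))  # []
```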
Example.
Compare the mean, the median, and the mode for the salaries of 5 employees
of a small grocery store. Which average best represents the salaries of
the employees?
Solution.
Computing the mean, median, and mode of the salaries of the employees, we get
Mean = P9,200
Median = P5,000
Mode = P3,000
The median of P5,000 represents the salaries better than either the mean or
the mode.
6. Measures of Dispersion
The measures of central tendency give information about the center of a
data set. Such descriptions, however, do not adequately describe the
characteristics of the distribution. To do this, we need to compute the degree
of dispersion of the values from the average. These measures are called the
measures of dispersion or variability. They describe how spread out the individual
values are from the average. Among these measures are the range, variance, and
standard deviation.
6.1 Range is the simplest and easiest to compute among the measures
of dispersion, but it is also the most unstable and unreliable measure
because it is easily affected by extreme values. It is the difference
between the highest value (HV) and the lowest value (LV) in the distribution.
R = HV – LV
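The population variance is
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$
where x = individual value
μ = population mean
N = population size
The population standard deviation is the square root of the variance:
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$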
Example.
A sample of six street vendors along San Pedro St. was surveyed, and their
average daily incomes were obtained as follows.
Solution.
Arrange the data in columns.
Income (x)    x − x̄    (x − x̄)²
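$$\bar{x} = \frac{\sum x}{n} = \frac{2660}{6} = 443.33$$
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{\sum (x - \bar{x})^2}{6 - 1}$$
$$s^2 = 26{,}586.67$$
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{26{,}586.67} = 163.05$$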
Therefore, the sample variance is ₱26,586.67 and the sample standard
deviation is ₱163.05.
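The same steps (find the mean, square each deviation from it, divide the sum by n − 1, then take the square root) can be sketched in Python. As an illustration only, the sketch is applied to the ten quiz scores from the mean example rather than to the vendor incomes; the function names are assumptions made for this sketch.

```python
import math

# Quiz scores from the mean example (used here only for illustration).
scores = [5, 12, 20, 16, 15, 23, 10, 18, 7, 11]


def value_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_std(values):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(values))


print(value_range(scores))                # 18
print(round(sample_variance(scores), 2))  # 32.9
print(round(sample_std(scores), 2))       # 5.74
```

Dividing by n − 1 instead of n is what distinguishes the sample formulas from the population formulas.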
7.1 Percentiles (denoted by Pk) are measures of relative position that divide
the distribution into 100 parts. The kth percentile is the value such that at least
k percent of the data are below that value and (100 – k) percent are above that
value.
Percentiles are also used to compare an individual’s test score with
some norm. For example, tests such as the National Secondary Achievement
Test (NSAT) are taken by high school students. A student’s scores are
compared with those of other students locally and nationally using percentile
ranks.
Percentiles are not the same as percentages. If a student gets 75
correct answers out of 100 items in an examination in his class, then he obtains
a percentage score of 75. But this will not tell his position with respect to the
rest of his class. His score could be the lowest, the highest, or somewhere in
between. But if his score of 75 corresponds to the 70th percentile, then he did
better than 70% of the students in his class.
To approximate the percentile rank of a value x in the distribution, we
have
$$\text{Percentile} = \frac{(\text{number of values below } x) + 0.5}{\text{total number of values}} \times 100$$
Example.
A 30-point quiz was given to 10 students and the scores are shown below.
What is the percentile rank of 24?
23 25 19 21 28 15 20 24 22 27
Solution.
Arrange the data in ascending order.
15 19 20 21 22 23 24 25 27 28
There are 6 values below 24.
Determine the percentile using the formula.
$$\text{Percentile} = \frac{6 + 0.5}{10} \times 100$$
$$\text{Percentile} = 65\text{th percentile}$$
This means that a student with a score of 24 did better than 65% of the class.
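The percentile rank formula can be sketched in Python as shown below. The function name `percentile_rank` is illustrative; the scores are those of the 30-point quiz in the example.

```python
# Scores of the 10 students in the 30-point quiz.
quiz_scores = [23, 25, 19, 21, 28, 15, 20, 24, 22, 27]


def percentile_rank(values, x):
    """(number of values below x + 0.5) / (total number of values) * 100."""
    below = sum(1 for value in values if value < x)
    return (below + 0.5) / len(values) * 100


print(round(percentile_rank(quiz_scores, 24)))  # 65
```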
7.2 Quartiles (denoted by Qq) are positional measures that divide the
distribution into four equal parts: the first quartile (Q1), second quartile (Q2), and
third quartile (Q3). The first quartile separates the first one-fourth of the
distribution from the upper three-fourths and is equal to the 25th percentile; the
second quartile separates the first half of the distribution from the upper half
and is equal to 50th percentile and also equal to the median of the distribution;
the third quartile separates the lower three-fourths of the distribution from the
upper one-fourth and is equal to the 75th percentile.
Quartiles can be obtained by first arranging the data set in ascending
order. Next, determine the median of the distribution and that median is the
value of Q2. Then determine the median of the values of the 1st half of the
distribution to get Q1. And finally, determine the median of the values of the 2nd
half of the distribution for Q3.
Example.
Find the value of Q1, Q2, and Q3 of the following scores of students in a class.
20 15 10 29 30 19 12 26 24 18
Solution.
Arrange the data in ascending order.
10 12 15 18 19 20 24 26 29 30
$$Q_2 = \frac{19 + 20}{2}$$
$$Q_2 = 19.5$$
This means that 50% of the students in the class got a score of 19.5 or less.
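The median-of-halves procedure described above can be sketched in Python. The sketch assumes that, when the number of values is odd, the middle value is left out of both halves; the text does not state this case, so treat it as one possible convention. The function names are illustrative.

```python
# Scores of the students in the quartile example.
class_scores = [20, 15, 10, 29, 30, 19, 12, 26, 24, 18]


def median_of_sorted(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median of each half of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: for odd n, the middle value belongs to neither half.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median_of_sorted(lower), median_of_sorted(ordered), median_of_sorted(upper)


print(quartiles(class_scores))  # (15, 19.5, 26)
```

For this data set, the lower half is 10 12 15 18 19 and the upper half is 20 24 26 29 30, so Q1 = 15 and Q3 = 26.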
8. Normal Distribution
A normal distribution is a very important statistical data distribution
pattern occurring in many natural phenomena, such as height, blood pressure,
and lengths of objects produced by machines. Certain data, when graphed as
a histogram (data values on the horizontal axis, frequency on the vertical axis),
create a bell-shaped curve known as a normal curve, or normal distribution.
NOTE!
You can also have normal distributions with the same means but different
standard deviations.
You can also have normal distributions with the same standard deviation
but with different means.
You can also have normal distributions with different means and different
standard deviations.
Empirical Rule
Example.
The daily water usage per person in Davao City is normally distributed with a
mean of 20 gallons and a standard deviation of 5 gallons. Find and interpret
the intervals representing one, two, and three standard deviations of the mean.
Solution.
For one standard deviation of the mean, approximately 68% of the people in
Davao City consumed water between 15 and 25 gallons daily. For two
standard deviations of the mean, approximately 95% of the people consumed
water between 10 and 30 gallons daily. And for three standard deviations of
the mean, nearly all of the people (99.74%) consumed water between 5 and
35 gallons daily.
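The three intervals in this solution are simply the mean plus or minus one, two, and three standard deviations. A small Python sketch, with an illustrative function name, makes the arithmetic explicit:

```python
def empirical_rule_intervals(mean, std_dev):
    """Intervals within 1, 2, and 3 standard deviations of the mean."""
    return {k: (mean - k * std_dev, mean + k * std_dev) for k in (1, 2, 3)}


# Daily water usage: mean of 20 gallons, standard deviation of 5 gallons.
for k, (low, high) in empirical_rule_intervals(20, 5).items():
    print(k, low, high)
# 1 15 25
# 2 10 30
# 3 5 35
```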
$$z = \frac{X - \mu}{\sigma}$$
where
X - the selected value
µ - the population mean
σ - the population standard deviation
$$z = \frac{X - \mu}{\sigma} = \frac{22{,}000 - 20{,}000}{2000} = 1$$
b) For X = Php 17,500 with µ = Php 20,000 and σ = Php 2000, solving for z,
we have
$$z = \frac{X - \mu}{\sigma} = \frac{17{,}500 - 20{,}000}{2000} = -1.25$$
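Both z-values can be reproduced with a one-line Python function. The name `z_score` is illustrative; the inputs are the values used in the two computations above.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


print(z_score(22_000, 20_000, 2_000))  # 1.0
print(z_score(17_500, 20_000, 2_000))  # -1.25
```

A negative z-value means the selected value lies below the mean.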
distribution. Then we can use the table for Areas Under the Normal Curve.
You can visit this site to have a copy of a table:
https://www.westgard.com/normalareas.htm
Example. The daily water usage per person in Davao City is normally distributed
with a mean of 20 gallons and a standard deviation of 5 gallons. Let X be the
daily water usage. What percent of people use less than 24 gallons?
Solution.
We graph the problem in a normal distribution graph and see that the shaded
region we are looking for is the area before X = 24.
$$z = \frac{X - \mu}{\sigma} = \frac{24 - 20}{5} = 0.8$$
To find the probability for the z value, we refer to the normal distribution
table, which is also commonly called the z-table. To locate the probability for z = 0.8,
we find the ones and tenths digits (0.8) in the first column of the z-table and
intersect that row with the column corresponding to the hundredths digit of the
computed z value.
Thus, P(X < 24) = P(z < 0.8) = 0.2881 + 0.5 = 0.7881 or 78.81%. This means
that the probability that a person uses less than 24 gallons of water daily is
78.81%.
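As a cross-check on the z-table lookup, the area to the left of a z-value can also be computed from the error function in Python's standard `math` module, using the identity Φ(z) = 0.5 · (1 + erf(z / √2)). This is a sketch for verification, not a replacement for the table method described above; the function name is illustrative.

```python
import math


def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Water usage example: mean 20 gallons, standard deviation 5 gallons, X = 24.
z = (24 - 20) / 5
print(round(standard_normal_cdf(z), 4))  # 0.7881
```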
To help you understand more of this concept, please see the following videos:
https://www.youtube.com/watch?v=mtbJbDwqWLE
https://www.youtube.com/watch?v=2tuBREK_mgE
Ondaro et al. (2018). Mathematics in the modern world, e-book. Mutya Publishing
House, Inc.
Chapter 2 – Introduction
http://124.105.95.237/index.php/s/AY5PS7tCmWCET24
Activity 1. Now that you know the most essential concepts in the study of the
nature of Mathematics, let us check your understanding of
these concepts. You are directed to answer at least three (3)
exercises from
Activity 1. Based on the most essential concepts in data management and the
learning exercises that you have done, please feel free to write your
arguments or lessons learned below.
1.
2.
3.
1.
2.
3.
4.
5.
Metalanguage
In this section, the essential terms relevant to the study of correlation and
regression analysis and to the demonstration of ULO-b are operationally defined to
establish a common frame of reference for how the texts work. You will encounter
these terms as we go through this topic. Please refer to these definitions if you
have difficulty understanding some concepts.
3. Scatter Plot is a graph in which the values of two variables are plotted along
two axes, with the pattern of the resulting points revealing any correlation present.
7. Line of Best Fit is a straight line that is the best approximation of the given
set of data. It is used to study the nature of the relation between two variables.
how strong the linear relationship is between two variables, and is heavily
relied on by researchers when conducting trend analysis.
Essential Knowledge
To achieve the unit learning outcomes (the big picture) for the fourth
and fifth weeks of the course, you need to fully understand the essential
knowledge laid out in the succeeding pages. Please note that you are
not limited to these resources. You are expected to use
other books, research articles, and other resources that are available in the
university’s library, e.g., ebrary, search.proquest.com, etc.
1. Scatterplot. The scatter plot is a visual way to describe the nature of the
relationship between the variables. It is a graph of the ordered pairs (x, y) of
numbers consisting of the independent variable x and the dependent variable
y.
The independent variable is scaled along the x-axis and the
dependent variable is scaled along the y-axis. Graphing the data on a scatter
plot gives preliminary information about the shape and spread of the data.
Example.
Construct the scatter plot of the data shown for the advertising cost (in thousands)
and sales (in thousands) from several companies and determine whether there
seems to be a linear relationship between the two variables.
Advertising Cost 12 8 10 5 12 14 11 8 6
Sales 20 12 15 10 18 20 18 10 11
Solution.
Step 1. Draw and label the x and y axes.
Step 2. Plot each point on the graph as shown.
–1 ≤ r ≤ +1.
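$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$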
0. When the null hypothesis is not rejected, it means that the value of r is not
significantly different from 0 (zero) and is probably due to chance only.
Example.
The average normal daily temperature (in degrees Celsius) and the
corresponding average monthly precipitation (in inches) for the month of June are
shown here for seven randomly selected cities. Determine if there is a relationship
between the two variables.
Temperature (x) 30 27 28 32 27 23 18
Precipitation (y) 3.4 1.8 3.5 3.6 3.7 1.5 0.2
Solution.
Arrange the data in a table as shown.
City    x     y      xy        x²         y²
A       30    3.4    102.00    900.00     11.56
B       27    1.8    48.60     729.00     3.24
C       28    3.5    98.00     784.00     12.25
D       32    3.6    115.20    1024.00    12.96
E       27    3.7    99.90     729.00     13.69
F       23    1.5    34.50     529.00     2.25
G       18    0.2    3.60      324.00     0.04
Σx = 185    Σy = 17.70    Σxy = 501.80    Σx² = 5019.00    Σy² = 55.99
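$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
$$r = \frac{7(501.80) - (185)(17.70)}{\sqrt{\left[7(5019.00) - (185)^2\right]\left[7(55.99) - (17.70)^2\right]}}$$
$$r = 0.891$$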
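The correlation coefficient for this example can be verified with a short Python sketch that builds the same sums as the table (Σx, Σy, Σxy, Σx², Σy²) and substitutes them into the formula for r. The function and variable names are illustrative.

```python
import math

# Temperature (x) and precipitation (y) for the seven cities.
temperature = [30, 27, 28, 32, 27, 23, 18]
precipitation = [3.4, 1.8, 3.5, 3.6, 3.7, 1.5, 0.2]


def pearson_r(x, y):
    """Pearson correlation coefficient from the raw-score (computational) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(temperature, precipitation)
print(round(r, 3))  # 0.891
```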
4. Computation:
$$t_c = r\sqrt{\frac{n - 2}{1 - r^2}}$$
$$t_c = 0.891\sqrt{\frac{7 - 2}{1 - (0.891)^2}}$$
$$t_c = 4.388$$
5. Decision.
Reject H0 because | tc | > | tt | and conclude that there is a significant
relationship between the average normal daily temperature and the
corresponding average monthly precipitation.
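The computed t value in step 4 follows from the formula t = r·√((n − 2)/(1 − r²)). A minimal Python sketch, with an illustrative function name, evaluates it for r = 0.891 and n = 7. The tabular value it is compared against still has to be read from a t-table at n − 2 degrees of freedom; the sketch does not compute it.

```python
import math


def t_statistic(r, n):
    """Test statistic for the significance of r, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


t_c = t_statistic(0.891, 7)
print(round(t_c, 3))  # 4.388
```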
$$a = \frac{\sum y}{n} - b\,\frac{\sum x}{n} = \bar{y} - b\bar{x}$$
Example.
A law enforcement officer obtained data on the performance rating of police
offices and the crime solution efficiency in their respective areas of responsibility for
the last 6 months. Use the equation of the regression line to predict the crime solution
efficiency of a city whose police office has a performance rating of 82.
Month    x     y     xy      x²
1        85    89    7565    7225
2        89    90    8010    7921
3        91    92    8372    8281
4        93    92    8556    8649
5        84    88    7392    7056
6        89    90    8010    7921
n = 6    Σx = 531    Σy = 541    Σxy = 47,905    Σx² = 47,053
$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$
$$b = \frac{6(47{,}905) - (531)(541)}{6(47{,}053) - (531)^2}$$
$$b = 0.4454$$
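The remaining steps, finding the intercept a and predicting y at x = 82, can be sketched in Python. The sketch assumes the regression line is written as y = a + bx, with a equal to the mean of y minus b times the mean of x; the function and variable names are illustrative.

```python
# Performance rating (x) and crime solution efficiency (y) for the six months.
ratings = [85, 89, 91, 93, 84, 89]
efficiency = [89, 90, 92, 92, 88, 90]


def fit_line(x, y):
    """Least-squares intercept a and slope b of the line y = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(p * q for p, q in zip(x, y))
    sum_x2 = sum(p * p for p in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


a, b = fit_line(ratings, efficiency)
predicted = a + b * 82  # predicted efficiency for a performance rating of 82

print(round(b, 4))          # 0.4454
print(round(a, 2))          # 50.75
print(round(predicted, 2))  # 87.27
```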
Using the data in the example above, determine how much of the
variation in the crime solution efficiency is due to the variation in the
performance rating of the police office. Upon computing the correlation
coefficient, we get r = 0.959.
Solve for the coefficient of determination.
$$r^2 = (0.959)^2 = 0.9197 = 91.97\%$$
This result means that 91.97% of the variation in the crime solution
efficiency is accounted for by the variations in the performance rating of the
police office in the area. The rest of the variation, 0.0803 or 8.03%, is
unexplained and is called the coefficient of alienation.
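The two percentages can be reproduced in a few lines of Python, taking r = 0.959 as given in the text. The variable names are illustrative.

```python
r = 0.959                    # correlation coefficient given in the text
r_squared = r ** 2           # share of the variation in y explained by x
unexplained = 1 - r_squared  # share of the variation left unexplained

print(f"{r_squared:.2%}")    # 91.97%
print(f"{unexplained:.2%}")  # 8.03%
```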
Ondaro et al. (2018). Mathematics in the modern world, e-book. Mutya Publishing
House, Inc.
Chapter 2 – Introduction
http://124.105.95.237/index.php/s/AY5PS7tCmWCET24
Activity 1. Now that you know the most essential concepts in the study of
data management, let us check your understanding of these
concepts. You are directed to answer exercises number 1 and 2 from
Activity 1. Based on the most essential concepts in data management and the
learning exercises that you have done, please feel free to write your
arguments or lessons learned below.
1.
2.
3.
1.
2.
3.
4.
5.