2-StatProb11 Q4 Mod2 Correlation-Analysis Version3


STATISTICS & PROBABILITY
Quarter 4 – Module 2:
Correlation Analysis

Department of Education ● Republic of the Philippines


Statistics & Probability – Grade 11
Alternative Delivery Mode
Quarter 4 – Module 2: Correlation Analysis
First Edition, 2020

Republic Act 8293, Section 176 states that: “No copyright shall subsist in any work
of the Government of the Philippines. However, prior approval of the government agency or
office wherein the work is created shall be necessary for exploitation of such work for profit.
Such agency or office may, among other things, impose as a condition the payment of
royalties.”

Borrowed materials included in this module are owned by their respective copyright
holders. Effort has been exerted to locate and seek permission to use these materials from
the respective copyright owners. The publisher and author do not represent nor claim
ownership over them.

Published by the Department of Education


Secretary:
Undersecretary:
Assistant Secretary:
Development Team of the Module
Authors: Monina C. Raagas

Editor: Glenn C. Arandilla


Reviewers:
Illustrator:
Layout Artist:
Management Team: Nelson B. Absin

Printed in the Philippines by: _____________________________


Department of Education – Bureau of Learning Resources (DepEd – BLR)
Office Address: ______________________________________
Telefax: ______________________________________
E-mail Address:______________________________________
STATISTICS &
PROBABILITY
Quarter 4- Module 2:
Correlation Analysis

This instructional material was collaboratively developed and reviewed by
educators from public and private schools, colleges, and/or universities. We
encourage teachers and other education stakeholders to email their
feedback, comments, and recommendations to the Department of Education
at action@deped.gov.ph.
Department of Education • Republic of the Philippines
TABLE OF CONTENTS
Cover Page
Copyright Page
Title Page
Table of Contents
Introduction
Lesson 1 Correlation Analysis
What I Need To Know
What I Know
What’s In
What’s New Activity 1
What Is It
What’s More Creating Scatterplot in Spreadsheet or Excel
What I Have Learned
What I Can Do
Assessment
Lesson 2 Pearson Product-Moment Correlation
What I Need To Know
What I Know
What’s In
What’s New Activity 1
What Is It
What’s More Correlation Coefficient Software
What I Have Learned
What I Can Do
Assessment
Answer Key
References
INTRODUCTION

This module, developed as part of the Alternative Delivery Mode learning resources,
is made for you as a student taking Statistics and Probability. It focuses on topics
under Correlation Analysis, which include constructing a scatterplot, computing the
Pearson product-moment correlation coefficient, and solving problems involving
correlation analysis. Activities are suited to your own pace and capacity. You are
also advised to use applications like Excel on your computer in accomplishing some
objectives, so that you can compare manual computation with the use of formulas in a
computer application.

The module starts with a pre-test to assess how much you already know about the
lessons. At the end, an Assessment checks that you have gained the understanding and
skills set in the objectives.

For the facilitator, teacher, or parent, this module serves as a guide in achieving the
most essential learning competencies set by the Department of Education’s curriculum
guide. You need not limit yourself to the resources available in this module; it is
hoped that you will supplement it with materials and strategies that can help the
student better.


Lesson 1
Correlation Analysis

Quarter: Fourth Week: 7th

No. of Days: 4 No. of hours:

What I Need to Know

At the end of this lesson, you are expected to:

• illustrate the nature of bivariate data;
• construct a scatter plot;
• describe shape (form), trend (direction), and variation (strength)
based on a scatter plot.

To achieve the objectives of this module, follow the instructions below:

Take time to read the lessons and study.
Follow the directions and perform the activities required in the lessons.
Answer the questions in the pre-test and assessment.
Internalize the knowledge learned and practice applying it to real
situations as provided in the module.

REMINDER: DO NOT WRITE ANYTHING IN THE MODULE. ANSWER IN A SEPARATE NOTEBOOK OR PAPER.

What I Know

Directions: Encircle the letter that corresponds to the best answer.

1. Which scatterplot most likely shows a positive correlation?

a. A only c. both A and C
b. B only d. both B and D
2. In terms of strength of association, how do you compare scatterplot I
with II?

Scatterplot I Scatterplot II

a. The strength of association in Scatterplot I is greater.
b. The strength of association in Scatterplot II is greater.
c. The strength of association in both scatterplots is the same.
d. The strength of association in the scatterplots cannot be
compared.
3. Which of these most likely describes the correlation between grades
in Math and Physics?
a. Strong, positive c. Weak, positive
b. Strong, negative d. Weak, negative

4. This scatterplot shows the relationship between which two variables?

a. Speed of an airplane (x) vs. distance traveled in one hour (y)
b. Outside air temperature (x) vs. air conditioning costs (y)
c. Age of an adult (x) vs. height of an adult (y)
d. Distance traveled (x) vs. gas remaining in the tank (y)

5. Which scatterplot below best describes the table of values for the
number of hours studied and the test scores?

a. c.

b. d.

What’s In
In your previous lessons, you were asked to plot ordered pairs in the
rectangular coordinate system. Let us see if you can still do it.
Plot the following points in the rectangular coordinate system.
1. (-3, 2)
2. (3, 3)
3. (1, -5)
4. (4, -4)
5. (-3, -5)
6. (3, 5)
7. (-2, 4)
8. (1, -3)
9. (-5, 0)
10. (0, 5)

What’s New

Bivariate Data
Data in statistics are sometimes classified according to how many
variables are in a particular study. When you conduct a study that looks at a
single variable, that study involves univariate data. For example, you study
a group of students to find out their average grade.
Bivariate data arise when you study two variables. These variables
are compared to find the relationship between them. For example, age
might be one variable and weight might be another. Another example is
temperature and ice cream sales.
Using correlation analysis, we can find out the relationship between the
variables in bivariate data. Many business, marketing, and social science
questions and problems can be solved using bivariate data sets. For
instance, is there a link between child obesity and family income? This is
where correlation analysis is helpful.
Correlation analysis is a method of statistical evaluation used to study
the strength of a relationship between two numerically measured,
continuous variables (e.g., height and weight). This particular type of
analysis is useful when a researcher wants to establish if there are possible
connections between variables.

Activity 1: Arm Span and Height of a Person

Steps:
1. Using a meterstick or ruler, measure the length of the arm span and
height of 10 of your classmates in centimeters. Tabulate the results.

Student   Length of the Arm Span (cm)   Height (cm)
1
2
3
4
5
6
7
8
9
10

2. Graph the points corresponding to the bivariate data. Put labels on the
x- and y-axes.

3. Present your data. As you present them, identify the variables and
describe how the points are scattered.

The graph you have constructed is called a scatterplot. By examining the
points, can you say that there is a relationship between the length of the
arm span and the height of a person?

Activity 2: Number of Times Late and Grade of a Student

Steps:
1. Ask 10 of your classmates for their average grade in the first-period
subject and the number of times they were late in coming to school.
Tabulate the results.

Student   Number of Times Late   Grade in First Period (%)
1
2
3
4
5
6
7
8
9
10

2. Graph the points corresponding to the bivariate data. Put labels on the
x- and y-axes.

3. Present your data. As you present them, identify the variables and
describe how the points are scattered.

Is there a relationship between the number of times late in coming to
school and the grade of a student in the first period?

Activity 3: Weight of a Person and Number of Facebook Friends

Steps:
1. Ask 10 of your classmates for their weights and the number of friends in
their Facebook accounts. Tabulate the results.

Student   Weight (kg)   Number of Facebook Friends
1
2
3
4
5
6
7
8
9
10

2. Graph the points corresponding to the bivariate data. Put labels on the
x- and y-axes.

3. Present your data. As you present them, identify the variables and
describe how the points are scattered.

Is there a relationship between the weight of a person and the number
of Facebook friends?

What is It

A scatterplot, or scatter diagram, is a type of mathematical diagram that uses
Cartesian coordinates to display values for two variables in a set of data.
The independent variable is plotted along the horizontal axis (x) and the
dependent variable is plotted along the vertical axis (y). A scatterplot provides
a visual representation of the correlation, or relationship, between the two
variables. It shows the direction and strength of the relationship between the
variables.
All correlations have two properties: direction and strength.
• Positive correlation: Both variables move in the same direction. In
other words, as one variable increases, the other variable also
increases. As one variable decreases, the other variable also
decreases. An upward trend in points indicates a positive
correlation. Examples: IQ vs. academic performance;
salary vs. job satisfaction
• Negative correlation: The variables move in opposite directions. As
one variable increases, the other variable decreases. As one variable
decreases, the other variable increases. A downward trend in points
indicates a negative correlation. Examples:
academic performance vs. no. of hours watching TV;
stress vs. job performance
• Zero or no correlation: There is no apparent
relationship between the two variables.
Examples: shoe size vs. salary;
socio-economic status vs. grades
The strength of a correlation is determined by its numerical value. It
may be perfect, very high, moderately high, moderately low, very low, or
zero.

The diagram above shows some examples of scatter plots and correlations.

What’s More

Creating Scatterplot in Spreadsheet or Excel


You can also create a scatterplot from your data using Excel. Here are the
steps:
• Select the worksheet range that contains the data.
• On the Insert tab, click the XY (Scatter) chart command button.
• Select the chart subtype that doesn't include any lines.
• Confirm the chart data organization.
• Annotate the chart, if appropriate. Add those little flourishes to your
chart that will make it more attractive and readable. For example, you
can use the Chart Title and Axis Titles buttons to annotate the chart
with a title and with descriptions of the axes used in the chart.
• If you want to add a trendline, click the Add Chart Element menu's
Trendline command button.
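
Where Excel is not available, a rough scatterplot can also be drawn as plain text. The following is a minimal sketch in Python and is not part of the original module: `text_scatter` is a hypothetical helper written only for this illustration, it assumes whole-number data, and it uses the six-child age and weight data that appear in Lesson 2.

```python
def text_scatter(xs, ys):
    """Return text rows that draw one '*' for every (x, y) pair."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same number of values")
    points = set(zip(xs, ys))
    x_values = range(min(xs), max(xs) + 1)
    rows = []
    # Start from the largest y so the rows read top-down like a graph.
    for y in range(max(ys), min(ys) - 1, -1):
        cells = ["*" if (x, y) in points else "." for x in x_values]
        rows.append(f"{y:>3} | " + " ".join(cells))
    # Horizontal axis: the last digit of each x value sits under its column.
    rows.append("    +" + "-" * (2 * len(x_values)))
    rows.append("      " + " ".join(str(x % 10) for x in x_values))
    return rows


ages = [7, 6, 8, 5, 6, 9]            # x: age in years
weights = [12, 8, 12, 10, 11, 13]    # y: weight in kilograms
plot_rows = text_scatter(ages, weights)
print("\n".join(plot_rows))
```

In the printed grid the stars generally drift upward from left to right, which is the upward trend that suggests a positive correlation.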

What I Have Learned

Based on this lesson, answer the following questions:

1. What are bivariate data? Give an example.
2. What is a scatterplot? What is the importance of a scatterplot?
3. Describe a positive correlation. A negative correlation.
4. In the analysis of a scatterplot, what two elements should be
considered?
5. How is the strength of correlation determined?

• Bivariate data involve the study of two variables. An example is
the IQ and age of students in a population.
• A scatterplot is a mathematical diagram using Cartesian
coordinates to display values for two variables in a set of data. It
provides a visual representation of the correlation, or relationship,
between the two variables.
• In a positive correlation, both variables move in the same direction.
In other words, as one variable increases, the other variable also
increases. In a negative correlation, the variables move in opposite
directions. As one variable increases, the other variable decreases.
• The two elements that should be considered in the analysis of a
scatterplot are the direction and strength of the correlation.
• The strength of a correlation is determined by its numerical value.
It may be perfect, very high, moderately high, moderately low, very
low, or zero.
What I Can Do

Having studied the lesson, we want to know if we can apply
scatterplots in real life. Suppose the number of people of different ages
who died of COVID-19 in the month of April in our region is
recorded. Construct a scatterplot of the number of deaths against age.
Show your output using Excel.

Assessment

A. For each of the following cases, tell whether the relationship shows a
positive correlation, a negative correlation, or no correlation.
1. The more students enroll in a school, the more teachers are needed.
2. The wealthier a person is, the more friends he has.
3. A student who has many absences has a decrease in grades.
4. As one increases in age, often one's agility decreases.
5. The longer your hair grows, the more shampoo you will need.
B. Determine whether the following bivariate data are correlated or not. If
they are correlated, tell the direction of the association. Evaluate
whether correlation is most likely strong or weak.
1. time spent in a supermarket and money spent
2. income and value of car driven
3. number of children and time spent cleaning the house by the mother
4. amount spent on gas and distance traveled by car each week
5. age and reaction time of persons over 18 years of age
C. Write the letter of the description that best matches each scatterplot.

1. 2.

3. 4.
A. Strong negative correlation
B. Strong positive correlation
C. Moderate positive correlation
D. Low negative correlation
E. Zero correlation
D. Construct a scatterplot for the following data and use it to comment on
the form, direction, and strength of the relationship between the variables.

1.
Age of a person, years   11  12  13  14  15  16  17  18  19  20
Weight, kg               40  42  38  35  45  51  48  48  50  47

2.
Age of a car, years      0.5  1   1.5  2   3   4   4.5  5   6   7
Mileage, km/L            16   15  10   12  10  12  11   10  11  8

Lesson 2
Pearson Product-Moment Correlation

Quarter: Fourth Week: 8th

No. of Days: 4 No. of hours:

4
What I Need to Know

At the end of this lesson, you are expected to:

• calculate Pearson’s sample correlation coefficient;
• solve problems involving correlation analysis.

What I Know

Directions: The table shows the correlations for the four graphs below.
Match each graph to the correlation coefficient.

A. B.

C. Compute and interpret r for the following data.
1.
x 20 30 40 50 60

y 100 90 85 60 50
2.
x 6 15 30 12 20

y 3 6 15 5 15

What’s In

Check your readiness for this lesson by answering the following
exercises.

A. Sketch the scatterplot of the following that shows:


1. Strong positive correlation
2. Weak positive correlation
3. Perfect negative correlation
4. No correlation
B. Determine whether the correlation between the given bivariate data
is most likely positive, negative, or zero.
1. hours spent sleeping and hours spent awake
2. years of education and yearly salary
3. shoe size and salary
4. temperature and ice cream sales
5. car speed and travel time


What’s New

Activity 1: Age and Weight of Children

A sample of 6 children was selected; data about their age in years and
weight in kilograms were recorded as shown in the following table. It is
required to find if there is a relationship between age and weight. Then,
interpret the result.

Child   Age, X   Weight, Y
1       7        12
2       6        8
3       8        12
4       5        10
5       6        11
6       9        13

Steps:
1. Construct a table like the one shown below. Complete the entries in each
column. Get the sum of all entries below the columns.

Child   X      Y      X²      Y²      XY
1       7      12
2       6      8
3       8      12
4       5      10
5       6      11
6       9      13
        ΣX =   ΣY =   ΣX² =   ΣY² =   ΣXY =

2. Substitute the values obtained in the formula,

$$r = \frac{n\sum XY - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[n\sum X^{2} - \left(\sum X\right)^{2}\right]\left[n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}$$

where n is the number of (X, Y) pairs.

The value r is called the Pearson correlation coefficient. It indicates the
degree of relationship between two variables. What do you think is the
degree of relationship between age and weight?
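
The table method above can be checked with a few lines of code. The following is a minimal sketch in Python, not part of the original module: it assumes the usual computational formula for Pearson's r built from the column sums ΣX, ΣY, ΣX², ΣY², and ΣXY, and the function name `pearson_r` is a hypothetical choice made for this illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r computed from the column sums of the worksheet table."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 pairs")
    sum_x = sum(xs)                                # ΣX
    sum_y = sum(ys)                                # ΣY
    sum_x2 = sum(x * x for x in xs)                # ΣX²
    sum_y2 = sum(y * y for y in ys)                # ΣY²
    sum_xy = sum(x * y for x, y in zip(xs, ys))    # ΣXY
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when one variable does not vary")
    return numerator / denominator


ages = [7, 6, 8, 5, 6, 9]            # X: age in years
weights = [12, 8, 12, 10, 11, 13]    # Y: weight in kilograms
r = pearson_r(ages, weights)
print(round(r, 2))  # 0.76
```

For these six children the sums are ΣX = 41, ΣY = 66, ΣX² = 291, ΣY² = 742, and ΣXY = 461, so the numerator is 6(461) − (41)(66) = 60 and the denominator is √(65 × 96) ≈ 78.99, giving r ≈ 0.76.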
Activity 2: Mathematics and Physics Scores

Steps:
1. Below are the data of Mathematics and Physics scores of 5 students at
Mabuhay High School. Compute the value of r by completing the second
table.

Student   Math   Physics
1         55     66
2         93     89
3         89     94
4         60     52
5         90     84

Student   X      Y      X²      Y²      XY
1
2
3
4
5
          ΣX =   ΣY =   ΣX² =   ΣY² =   ΣXY =

2. Substitute the values obtained in the formula,

$$r = \frac{n\sum XY - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[n\sum X^{2} - \left(\sum X\right)^{2}\right]\left[n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}$$

Can you state the correlation coefficient for the relationship between
Math and Physics scores?
What is It

Pearson Correlation Coefficient

The most common coefficient of correlation is known as the Pearson
product-moment correlation coefficient, or Pearson’s r. It is a measure of the
linear correlation (dependence) between two variables X and Y, giving a
value between −1 and +1, inclusive. It was developed by Karl Pearson from a
related idea introduced by Francis Galton in the 1880s.

When studying the relationship between two variables, it is a good
idea to compute the Pearson correlation coefficient to determine just
how strong that relationship is. If the coefficient is negative, the
variables are negatively correlated: as one value increases, the other
decreases. If the coefficient is positive, the variables are positively
correlated: both values increase or decrease together.

To determine the strength of the computed r, compare its absolute value |r|
with the following scale:

If r = 0                  no association or correlation
If 0 < |r| < 0.25         very low correlation
If 0.25 < |r| < 0.50      moderately low correlation
If 0.50 < |r| < 0.75      moderately high correlation
If 0.75 < |r| < 1         very high or strong correlation
If r = ±1                 perfect correlation
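
The scale above can be expressed as a short function. The following is a minimal sketch in Python, not part of the original module. The name `describe_r` is hypothetical, and one assumption is made that the scale itself does not settle: a value falling exactly on a boundary (0.25, 0.50, or 0.75) is placed in the higher band.

```python
def describe_r(r):
    """Describe the strength and direction of a correlation coefficient."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    size = abs(r)  # strength depends only on the absolute value
    if size == 0:
        return "no correlation"
    if size < 0.25:
        strength = "very low"
    elif size < 0.50:       # assumption: exactly 0.25 counts as moderately low
        strength = "moderately low"
    elif size < 0.75:       # assumption: exactly 0.50 counts as moderately high
        strength = "moderately high"
    elif size < 1:          # assumption: exactly 0.75 counts as very high
        strength = "very high"
    else:
        strength = "perfect"
    direction = "positive" if r > 0 else "negative"  # direction is the sign of r
    return f"{strength} {direction} correlation"


print(describe_r(0.76))
print(describe_r(-0.16))
```

A value computed by hand or by software may come out as 0.9999999 instead of exactly 1 because of rounding, so it is reasonable to round r to two decimal places before interpreting it.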
What’s More

Correlation Coefficient Software

Most spreadsheet editors, such as Excel, Google Sheets, and OpenOffice,
can compute correlations for you. The illustration below shows an example.
In Excel, click on an empty cell where you want the correlation
coefficient to appear. Then enter the following formula.

=PEARSON(array1, array2)

Simply replace ‘array1’ with the range of cells containing the first
variable and replace ‘array2’ with the range of cells containing the second
variable.

For the example above, the Pearson correlation coefficient (r) is 0.76.

What I Have Learned

Based on this lesson, answer the following questions:

1. What is the Pearson correlation coefficient?
2. What is the formula for computing r?
3. What are the indicators for determining the strength and direction of
correlation?

• The Pearson product-moment correlation coefficient, or Pearson’s r, is a
measure of the linear correlation (dependence) between two
variables X and Y, giving a value between −1 and +1, inclusive.
• The formula for computing r is

$$r = \frac{n\sum XY - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[n\sum X^{2} - \left(\sum X\right)^{2}\right]\left[n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}$$

• The direction of correlation is indicated by the sign of r, while its
strength is indicated by the absolute value of r.
What I Can Do

Having studied the lesson, suppose we want to determine the strength
of the relationship between the number of years of schooling and the
salary received by 10 persons in your community. Compute
the Pearson coefficient r using Excel. What conclusion can you derive
from the computation?

Assessment

A. Encircle the letter of the correct answer.


1. Which of the following values cannot represent a correlation
coefficient?
a. r = 1.08 b. r = 0.95 c. r = 0 d. r = - 1.0
2. What could be the approximate value of the correlation coefficient
for a weak negative correlation?
a. −0.85 b. −0.16 c. 0.21 d. 0.90
3. Which value of a correlation coefficient represents the strongest
relationship between the two variables ?
a. -0.94 b. 0 c. 0.5 d. 0.91
4. Which value of r represents data with a strong negative linear
correlation between two variables?
a. −1.07 b. −0.89 c. −0.14 d. 0.92
5. A study compared the number of years of education a person
received and that person's average yearly salary. It was determined
that the relationship between these two quantities was linear and the
correlation coefficient was 0.91. Which conclusion can be
made based on the findings of this study?
a. There was a weak relationship.
b. There was a strong relationship.
c. There was no relationship.
d. There was an unpredictable relationship.

B. Write the letter of the interpretation that best matches each scatter
plot below.
A. strong negative correlation
B. moderate negative correlation
C. strong positive correlation
D. zero correlation
E. moderate positive correlation

1. 2.

3. 4.

C. Compute and interpret r for the following data.
1.
x 1 3 6 10 12

y 5 13 25 41 49

2.

x 1 3 5 7 9

y 44 34 24 14 4
3.
x 1 3 6 9 11

y 12 28 37 28 12

D. Find the value of the Pearson coefficient r. Give your conclusion about
the variables of each study.
1. The diameters of the longest lichens growing on gravestones were
measured. The data gathered are as follows:

Age of gravestone, X (years)   9   18  20  31  44  52  53  61  63  63
Diameter of lichen             2   3   4   20  22  41  35  22  28  32

2. In a biology experiment, a number of cultures were grown in the
laboratory. The numbers of bacteria, in millions, and their
ages, in days, are given below.

Age, X (days)                  1   2    3    4    5    6    7    8
No. of bacteria, Y (mil)       34  106  135  181  192  231  268  300

Answer Key

Lesson 1
What I Know
1. a
2. a
3. a
4. d
5. c
Assessment
A. 1. Positive
2. No correlation
3. Negative
4. Negative
5. Positive
B. 1. Strong positive correlation
2. Strong positive correlation
3. Weak negative correlation
4. Strong positive correlation
5. Strong negative correlation
C. 1. B
2. C
3. E
4. A
D. 1. 2.

Lesson 2
What I Know
A. Graph A = 1
Graph B = -1
Graph C = 0
Graph D = -0.72
B. Graph A = 0.96
Graph B = -0.90
Graph C = 0.72
Graph D = -0.42
C. 1. r = -0.97; strong negative correlation
2. r = 0.90; strong positive correlation
Assessment
A. 1. a
2. b
3. a
4. b
5. b
B. 1. D
2. C
3. B
4. A
C. 1. r = 1; perfect positive correlation
2. r = -1; perfect negative correlation
3. r = 0; no correlation
D. 1. r = 0.86; There is a strong positive correlation between the age of the
gravestone and the diameter of the lichen.
2. r = 0.99; There is a strong positive correlation between the number of days
and the number of bacteria.

References
Belecina, Rene R. et al. Statistics and Probability. P. Florentino St., Sta.
Mesa Heights, Quezon City: Rex Printing Company, Inc., 2016

Websites
https://www.onlinemathlearning.com/scatter-plots.html
https://courses.lumenlearning.com/boundless-statistics/chapter/correlation/
https://www.dummies.com/software/microsoft-office/excel/how-to-create-a-scatter-plot-in-excel/