
Republic of the Philippines

Department of Education
Region I
SCHOOLS DIVISION OF ILOCOS NORTE

Statistics and
Probability
Quarter 4 – Module 19:
Regression Analysis

SDOIN_CORE_Q4_Stat_and_Prob_Module19
Statistics and Probability
Crafting-Resources-for-Accessible-and-Flexible-Teaching (CRAFT)
Quarter 4 – Module 19: Regression Analysis
First Edition, 2023
Republic Act 8293, section 176 states that: No copyright shall subsist in
any work of the Government of the Philippines. However, prior approval of the
government agency or office wherein the work is created shall be necessary for
exploitation of such work for profit. Such agency or office may, among other things,
impose as a condition the payment of royalties.
Borrowed materials (i.e., songs, stories, poems, pictures, photos, brand
names, trademarks, etc.) included in this book are owned by their respective
copyright holders. Every effort has been exerted to locate and seek permission to
use these materials from their respective copyright owners. The publisher and
authors do not represent nor claim ownership over them.
Published by the Department of Education
Secretary: Sara Z. Duterte
Undersecretary: Gina O. Gonong

Printed in the Philippines by ______________________________


Schools Division of Ilocos Norte
Office Address: Brgy. 7B, Giron Street, Laoag City, Ilocos Norte
Telefax: (077) 771-0960
Telephone No.: (077) 770-5963, (077) 600-2605
E-mail Address: ilocos.norte@deped.gov.ph

MELCs:
1. Identifies the independent and dependent variable.
(M11/12SP – IVi-1)
2. Calculates the slope and y-intercept
of the regression line. (M11/12SP – IVi-3)
3. Interprets the calculated slope and y-intercept of
the regression line. (M11/12SP – IVi-4)

Prepared by:

DEO MARCO D. DELLOSA


SHS Teacher III
Pagsanahan National High School
Introductory Message
This Contextualized Learning Module (CLM) is prepared so that you, our
dear learners, can continue your studies and learn while at home. Activities,
questions, directions, exercises, and discussions are carefully stated for you to
understand each lesson with ease.
This CLM is composed of different parts. Each part shall guide you
step-by-step as you discover and understand the lesson prepared for you.
Pre-test is provided to measure your prior knowledge on the lesson. This will
show you if you need to proceed in completing this module or if you need to ask
your facilitator or your teacher’s assistance for better understanding of the lesson.
At the end of this module, you need to answer the post-test to self-check your
learning. Answer keys are provided for all activities and tests. We trust that you
will be honest in using them.
Please use this module with care. Do not put unnecessary marks on any
part of this CLM. Use a separate sheet of paper in answering the exercises and
tests. Likewise, read the instructions carefully before performing each task.
If you have any questions about using this CLM or any difficulty in answering the
tasks in this module, do not hesitate to consult your teacher or facilitator.
Thank you.
What I Need to Know
This module was specifically developed and designed to provide you with a fun and
meaningful learning experience at your own time and pace.
The module is divided into two lessons, namely:
● Lesson 1 – Identifying Dependent and Independent Variables
● Lesson 2 – Calculating and Interpreting the Computed Slope and
Y-Intercept of the Regression Line
After going through this module, you are expected to:
1. define dependent and independent variables;
2. identify the dependent and independent variables in a sentence or
problem;
3. calculate the slope and y-intercept of the regression line; and
4. interpret the calculated slope and y-intercept of the regression line.

What I Know
Directions. Read the following questions carefully and choose the letter of your
answer. You may use a separate sheet of paper.
1. In the equation 𝑦' = 3 + 4𝑥, what is the slope?
A. 3 B. 4 C. 𝑦' D. 4𝑥

2. In the equation 𝑦' = 10 − 2𝑥, what is the value of the y-intercept?
A. −10 B. −2 C. 10 D. 𝑦'
3. Which of the following scenarios could give you a meaningful regression
analysis?
A. The value of 𝑟 is not significant.
B. Correlation will be done after the regression analysis.
C. There is no linear relationship between the variables.
D. There is a strong negative linear relationship between the variables.

4. In a regression line, what do you call the magnitude of the change in one
variable when the other variable changes by one unit?
A. Marginal change C. Unit change
B. Regression change D. Variable change
5. If the equation of the regression line is 𝑦' = 5 + 0.123𝑥, how can it be interpreted?
A. The slope of the line is 0.123.
B. For every unit change in the value of 𝑥, the value of 𝑦 changes by 5
units on average.
C. For every unit change in the value of 𝑦, the value of 𝑥 changes by 5
units on average.
D. For every unit change in the value of 𝑥, the value of 𝑦 changes by 0.123
unit on average.
6. Which of the following statements is FALSE about bivariate data?
A. It involves one variable.
B. It involves two variables.
C. It deals with causes or relationships.
D. Its two variables are dependent and independent.
7. Which type of variable depends on other factors, is measured, and is
presumed to be the “effect”?
A. bivariate
B. constant
C. dependent
D. independent
8. Which of the following variables describes something that is stable and
unaffected by other variables you are trying to measure and is presumed the
“cause”?
A. bivariate
B. constant
C. dependent
D. independent
9. What variable is something that is influenced and affected?
A. bivariate
B. constant
C. dependent
D. independent
10. Consider the following data:
x 1 2 3 4 5 6 7
y 4 3 8 6 12 10 8
Calculate the y-intercept and the slope of the regression line, in that order.
A. −3 and −1.071
B. 1 and 1.071
C. 2 and 1.071
D. 3 and 1.071

Lesson 1 – Identifying Dependent and Independent Variables
It is important to understand variables because they are being studied and
interpreted based on the given data. A study may take into consideration several
variables. These variables may be of many types and levels but in this lesson, you
will be identifying independent and dependent variables in given situations. Check
your readiness for this lesson by answering the following exercises.

What’s In
Activity 1
Directions: Encircle the variables in each situation below and determine whether
the situation involves univariate or bivariate data.
Statements (write Univariate or Bivariate after each)
1. A researcher investigated the salary/income and civil status of
government employees. __________
2. The veterinarian listed the weight of the newborn puppies. __________
3. The school nurse recorded the age and the blood pressure of the
teachers. __________
4. A vendor keeps track of how much ice candy they sell every day
versus the daily temperature. __________
5. A STEM student surveyed Grade 10 students on the number of
hours spent using a cell phone and their previous grade. __________

Guide Questions:
1. How were you able to differentiate univariate from bivariate data?

2. Which among the statements above may deal with relationship between
variables?
When we are examining bivariate data, the two variables could depend on
each other, and one variable could influence another. In this case, try to answer
the next activity.

What is New
Activity 2
Directions: Match the pictures in Column A to Column B. Then, answer the
questions that follow.
Column A (pictures 1 to 5)          Column B (pictures A to E)
[Pictures not reproduced]

Guide Questions:
1. How did you match each picture in Column A to Column B? What are the
things you considered?
2. In general, what do the pictures in Column A and Column B represent?
3. Suppose you didn’t eat breakfast, what could possibly happen to you?
4. If you didn’t pass an examination, what could possibly be the reason?
5. Which situation do you think should happen first before another situation
happens? Explain your answer.

In the activity, you were able to identify situations that involve relationships
between two variables, a skill that will help you understand our next
lesson. You can also use your language skills in dealing with cause-and-effect
relationships for better understanding. Let’s find out as you go through the lesson.

What is It
In the previous activity, you became familiar with bivariate data. Bivariate
data always involve two variables. One of these variables is the dependent variable
and the other one is the independent variable.

A dependent variable depends on other variables or factors. It is something
that is influenced and affected. It is also associated with the words effect or outcome.
An independent variable affects the dependent variable. It is usually something you
have control over, one which you can choose and manipulate. However, in some
cases, you may not be able to manipulate the independent variable. It is commonly
known as the cause or the reason behind changes.
For example, a researcher wants to determine the effects of the use of social
media on the academic performance of students in Mathematics. The bivariate data
in the study are the use of social media and academic performance. Academic
performance depends on the use of social media; that is, academic
performance is affected by the use of social media. Therefore, the independent variable here is
the use of social media and the dependent variable is academic performance.

What’s More
Activity 3. Put Me in the Box!

Directions: Identify the independent and dependent variables in each question
stated below.
Questions          Independent Variable          Dependent Variable
1. How does logical thinking develop
critical thinking?
2. What are the effects of Koreanovelas on
the Filipino value system?
3. In what way does collaborative learning
increase communicative competence?
4. To what extent does texting decrease
students’ grammatical competence?
5. What corrupt practices trigger one’s
resignation?

What I have Learned


Directions: Complete the following statements. In nos. 3-4, choose the
expression in the parentheses that best completes the sentence.
1. __________ data always involve two variables.
2. Bivariate data have a __________ variable and an __________ variable.
3. A dependent variable (depends on, affects) the other variable.
4. An independent variable (depends on, affects) the other
variable.
5. The __________ variable is related to the words “outcome” or “effect”.
6. The __________ variable is linked to the “cause” or the “reason” behind the changes.

Lesson 2 – Calculating and Interpreting the Computed Slope and Y-Intercept
of the Regression Line
In the previous lesson, we learned that the statistic commonly used to
measure correlation is the Pearson correlation coefficient, or simply r. We also
learned how to compute r using a formula. Further, we interpreted the computed r
in terms of its direction and strength.
In this lesson, we will take a deeper look at the trend of the line. We will
analyze it more accurately by finding its mathematical equation and seeing how it is
used in prediction. The field of statistics that deals with this is regression analysis.

What’s In
The data below show the ages 𝑥 of students in a certain school and the
corresponding number 𝑦 of them who have smartphones. Find the equation of the
regression line and predict the number of students with smartphones at age
20. Assume that the variables are correlated and that the correlation is significant.

Age (𝑥)     No. of Students with Smartphones (𝑦)
13          19
14          32
16          37
17          45
19          49

What is New
The regression line is also called the line of best fit. It is significant because
it enables us to interpret data trends and helps us make predictions based on
the data; prediction is discussed further in the next lesson.
Again, please take note that in doing regression, you first need to consider
the following assumptions:
a. There exists a relationship between the variables; and
b. The relationship is tested to be significant.
These conditions must be met first; otherwise, doing a
regression analysis would be pointless.
A scatterplot is one way of illustrating a line of best fit. In a scatterplot
of data on two variables, several lines can be
drawn on the graph near the points, but only one of them is the line
of best fit. Best fit means that the sum of the squares of the vertical distances from
each point to the line is at a minimum.
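
The "best fit" idea can be checked numerically. The short Python sketch below is only an illustration: it uses the study-hours data and the regression line from Activity 4 of this module, while the function name `sum_squared_errors` and the comparison line 𝑦' = 80 + 1.0𝑥 are hypothetical choices made for this example.

```python
# Study hours (x) and final grades (y) from Activity 4 of this module.
x = [2, 3, 5, 9, 11, 15]
y = [79, 83, 85, 88, 89, 93]


def sum_squared_errors(a, b, xs, ys):
    """Sum of squared vertical distances from each point to the line y' = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(xs, ys))


# Regression line reported in Activity 4 (coefficients rounded to three decimals).
sse_best = sum_squared_errors(79.078, 0.945, x, y)

# A hypothetical alternative line that also passes near the points.
sse_other = sum_squared_errors(80.0, 1.0, x, y)

print(round(sse_best, 3), round(sse_other, 3))
print("Regression line has the smaller sum:", sse_best < sse_other)
```

The regression line gives the smaller sum of squared distances, which is what "line of best fit" means.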

The Equation of a Regression Line


Recalling our algebra concepts, the equation of a line is given by
𝑦 = 𝑚𝑥 + 𝑏, where 𝑚 stands for the slope and 𝑏 for the y-intercept. Similarly, the
equation of a regression line is given by 𝑦' = 𝑎 + 𝑏𝑥, where 𝑏 is the slope and 𝑎 is
the y-intercept. Note that 𝑏 plays a different role in each form.
Furthermore, the corresponding formulas for the y-intercept 𝑎 and the slope
𝑏 are as follows:

$$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2}$$

$$b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$$

where 𝑛 is the number of data pairs.


The rounding rule for both 𝑎 and 𝑏 is up to three decimal places.
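
The two formulas can also be written as a short Python function. This is a minimal sketch, not part of the module itself: the function name `regression_coefficients` and the three-point data set are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
def regression_coefficients(xs, ys):
    """Return (a, b) for the line y' = a + b*x using the summation formulas."""
    n = len(xs)                                      # number of data pairs
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(xi * yi for xi, yi in zip(xs, ys))
    sum_x2 = sum(xi ** 2 for xi in xs)

    # Both formulas share the same denominator: n(sum of x^2) - (sum of x)^2.
    denominator = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator

    # Rounding rule of this module: three decimal places.
    return round(a, 3), round(b, 3)


# Hypothetical data lying exactly on the line y = 2x.
a, b = regression_coefficients([1, 2, 3], [2, 4, 6])
print(a, b)  # 0.0 2.0
```

Here ∑𝑥 = 6, ∑𝑦 = 12, ∑𝑥𝑦 = 28, and ∑𝑥² = 14, so the denominator is 3(14) − 6² = 6, giving 𝑎 = 0 and 𝑏 = 2.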

What is It
Activity 4. Find Me!
Given the data below, find the equation of the regression line and provide an
interpretation of the results.
Student     No. of Study Hours (𝑥)     Final Grade in Math (𝑦)
A           2                          79
B           3                          83
C           5                          85
D           9                          88
E           11                         89
F           15                         93
Solution
Before we can successfully proceed to solving for the equation of the
regression line, we need to solve first for the necessary summations. As such, a
completed table like the one shown below would be of great help.

Student     No. of Study Hours (𝑥)     Final Grade in Math (𝑦)     𝑥𝑦       𝑥²
A           2                          79                          158      4
B           3                          83                          249      9
C           5                          85                          425      25
D           9                          88                          792      81
E           11                         89                          979      121
F           15                         93                          1395     225
Total       ∑𝑥 = 45                    ∑𝑦 = 517                    ∑𝑥𝑦 = 3998     ∑𝑥² = 465

The values needed for solving the equation are as follows:
𝑛 = 6, since there are six pairs of data.
∑𝑥 = 45
∑𝑦 = 517
∑𝑥𝑦 = 3998
∑𝑥² = 465
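
These summations can be double-checked with a few lines of Python. The sketch below simply repeats the table's arithmetic; the variable names are illustrative and not part of the module.

```python
hours = [2, 3, 5, 9, 11, 15]        # x: number of study hours
grades = [79, 83, 85, 88, 89, 93]   # y: final grade in Math

# The two extra columns of the completed table.
xy = [x * y for x, y in zip(hours, grades)]
x_squared = [x ** 2 for x in hours]

n = len(hours)
sum_x, sum_y = sum(hours), sum(grades)
sum_xy, sum_x2 = sum(xy), sum(x_squared)

print(n, sum_x, sum_y, sum_xy, sum_x2)  # 6 45 517 3998 465
```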

Solving for the y-intercept 𝑎, we get

$$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2}
= \frac{(517)(465) - (45)(3998)}{6(465) - 45^2}
= \frac{240405 - 179910}{2790 - 2025}
= \frac{60495}{765}
\approx 79.078$$

Solving for the slope 𝑏, we also get

$$b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}
= \frac{6(3998) - (45)(517)}{6(465) - 45^2}
= \frac{23988 - 23265}{2790 - 2025}
= \frac{723}{765}
\approx 0.945$$

Hence, the equation of the regression line 𝑦' = 𝑎 + 𝑏𝑥 is 𝑦' = 79.078 + 0.945𝑥,
where the slope is 0.945 and the y-intercept is 79.078. The y-intercept is the value
of 𝑦' when 𝑥 = 0; that is, it is the value at the point where the line intersects
the y-axis.
Interpretation
Marginal change is the magnitude of the change in one variable when the
other variable changes by exactly one unit. In the problem, the value of the slope 𝑏,
which is 0.945, is the marginal change. This means that for every one-unit change in the
value of 𝑥, which is the number of study hours, the value of 𝑦, which is the grade,
changes by 0.945 unit on average. Similarly, the value of the y-intercept 𝑎 is
79.078. This means that the predicted grade of a student would be 79.078 if he/she has zero
hours of study.
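
The interpretation can be illustrated with a small Python sketch. The function name `predict_grade` and the choice of 5 and 6 study hours are hypothetical; the coefficients are the rounded values from Activity 4.

```python
A = 79.078  # y-intercept from Activity 4
B = 0.945   # slope (marginal change) from Activity 4


def predict_grade(study_hours):
    """Predicted final grade y' = a + b*x for a given number of study hours."""
    return A + B * study_hours


# With zero study hours, the predicted grade is the y-intercept.
print(predict_grade(0))  # 79.078

# One additional study hour changes the predicted grade by the slope.
difference = predict_grade(6) - predict_grade(5)
print(round(difference, 3))  # 0.945
```

Any two inputs that differ by one hour give the same difference, because the slope of a line is constant.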
TRY this! Solve the following problems. Write your solutions in your activity
notebook.
Listed below are heights in centimeters and weights in kilograms of 6
teachers. Determine the regression equation.
Teacher A B C D E F
Height (in cm) 160 162 167 158 167 170
Weight (in kg) 50 59 63 52 65 68

What’s More
Directions: Consider the data below. Calculate the slope and the
y-intercept and then interpret the equation of the regression line.

Student     No. of Days Absent (𝑥)     Score in 50-point Math Quiz (𝑦)
A           1                          47
B           2                          40
C           3                          35
D           4                          27
E           5                          15

What I have Learned
Directions: Answer the following questions.
1. What are the two things you should do before you start finding the equation
of the regression line?
2. What are the assumptions in conducting a regression?
3. If the value of the Pearson coefficient 𝑟 is found to be insignificant, what would be the
expected result of the regression analysis?
4. What is the function of the slope 𝑏 in a regression line?

What I Can Do
Directions: Follow the instructions below.
1. Think of any pair of variables (𝑥 and 𝑦) that may appeal to you (e.g., age and number
of sleep hours).
2. Conduct an interview in your household with at least five
(5) persons and record their responses for your chosen variables.
3. Present the results in tabular form and find the corresponding equation of the
regression line.
4. Provide an interpretation of the results.

Assessment
Directions: Read and understand each item carefully. Write the letter that
corresponds to the correct answer on your activity notebook.
1. In the equation 𝑦' = 5 + 6𝑥 , what is the slope?
A. 6𝑥 B. 𝑦' C. 5 D. 6
2. In the equation 𝑦' = 12 − 6𝑥 , what is the value of the y-intercept?
A. − 12 B. − 6 C. 12 D. 𝑦'
3. Which of the following scenarios could give you a meaningful regression
analysis?
A. The value of 𝑟 is not significant.
B. Correlation will be done after the regression analysis.
C. There is no linear relationship between the variables.
D. There is a strong negative linear relationship between the variables.
4. In a regression line, what do you call the magnitude of the change in one
variable when the other variable changes by one unit?
A. Marginal change C. Unit change
B. Regression change D. Variable change
5. If the equation of the regression line is 𝑦' = 6 + 0.234𝑥, how can it be interpreted?
A. The slope of the line is 234.
B. For every unit change in the value of 𝑦, the value of 𝑥 changes by 6
units on average.
C. For every unit change in the value of 𝑥, the value of 𝑦 changes by 0.234
unit on average.
D. For every unit change in the value of 𝑥, the value of 𝑦 changes by 0.234
unit on average.
6. What is the value of 𝑦 in the equation 𝑦 = 4 + 5𝑥 if the value of 𝑥 is twice the slope?
A. 7 B. 11 C. 19 D. 54
7. If the linear regression equation is 𝑦' = 103 − 1.7𝑥, what would be the value of
𝑦' when 𝑥 = 20?
A. 86 B. 93 C. 99.6 D. 101.3
8. Which of the following scenarios could not possibly give us an acceptable
prediction using the equation of regression?
A. Predicting the age of a student based on his/her grade.
B. Predicting the future company revenue based on past sales.
C. Predicting the crop yield of a farm depending on rainfall days.
D. Predicting the number of hospital patients based on the season.
9. In a study involving the number of days students were tardy and their corresponding quiz
scores, the resulting regression equation is 𝑦' = 34 + 0.756𝑥. What would be the
predicted score of a student who was never tardy?
A. 34 B. 41 C. 43 D. 44
10. In a study involving the number of assists in a basketball game (independent)
and the total points (dependent), find the total points of a game when the
number of assists is 30. Use 𝑦' = 2.693 + 1.962𝑥.
A. 61 B. 62 C. 63 D. 64

Additional Activities
Directions: Solve each problem carefully.
1. The following are the midterm and the final grades of 10 college students.
Determine the regression equation and interpret the results.
Student          1    2    3    4    5    6    7    8    9    10
Midterm Grade    82   84   78   76   86   90   83   77   81   85
Final Grade      83   81   80   75   88   90   85   75   80   82

2. An ice cream vendor decided to record the temperature for one week and the
number of gallons of ice cream he was able to sell. The data he recorded are
shown in the table below. Determine the regression equation and
interpret the result.
Day                         1    2    3    4    5    6    7
Temperature (in Celsius)    31   26   30   29   28   30   25
Number of Gallons           10   6    8    6    4    10   4

Answer Key

