STAT Q4 Week 9 Enhanced.v1

Quarter 2 - Week 9

Let’s Learn

This module was designed and written to help you understand statistics and
probability, specifically how to determine a regression line. As part of the
pre-assessment, the lesson reviews correlation and the identification of variables.
It presents the material in different ways, including visuals and symbols, allows
flexible use of time, and introduces lines of regression.

This module contains


➢ Lesson 1 – Regression Analysis

After going through this module, you are expected to:


1. identify the independent and dependent variables
2. calculate the slope and y-intercept of the regression line
3. interpret the calculated slope and y-intercept of the regression line

Let’s Try

Directions: Choose the letter of the best answer. Write the chosen letter on a
separate sheet of paper.

1. Which of the following determines if there exists a linear relationship between
two variables?
A. correlation coefficient
B. coefficient of variation
C. regression
D. none of these

2. Which correlation coefficient represents the strongest linear relationship between
two variables?
A. 0 B. 0.5 C. -0.89 D. 0.95

3. Which of the following correlation coefficients indicates a perfect linear
relationship?
A. -1 B. 1 C. 0 D. both A and B

4. What does the value r = 0.95 represent about the relationship between two
variables?
A. negative linear relationship
B. no linear relationship
C. positive linear relationship
D. none of the above

5. Determine the equation of the regression line.
A. Y′ = 1.12X + 11.83
B. Y′ = 11.83 + 1.12X
C. Y′ = −11.83X + 1.12
D. Y′ = 1.12X − 11.83

For numbers 6-10, identify the dependent and independent variables.

6. A study is done to determine if elderly drivers are involved in more motor vehicle
fatalities than other drivers. The number of fatalities per 100,000 drivers is
compared to the age of drivers.

7. A study is done to determine if the weekly grocery bill changes based on the
number of family members.

8. Insurance companies base life insurance premiums partially on the age of the
applicant.

9. Utility bills vary according to power consumption.

10. A study is done to determine if a higher education reduces the crime rate in a
population.

Lesson 1
Regression Analysis

Let’s Recall

WARM-UP ACTIVITY!

A. Transform each linear equation into the form y = mx + b.

1. 4x + 3y = 10
2. 2y + 3x = 7
3. 2x − y = 8
4. x − 4y = 6
5. x − y = 5

B. Use graph paper to graph the following linear equations.

6. y = 2x + 5
7. y = (1/2)x + 3
8. y = −2x + 3
9. y = 6 − 3x
10. y = −4x − 2

[Figure: blank coordinate grid with the x-axis and y-axis marked from −6 to 6]
For the equations above, it is very helpful to use technology or apps like
Desmos for graphing. They make your work easier, faster, and more accurate.
You may download such apps on your gadgets.
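
The rearrangement asked for in part A can also be checked with a short program. The following is a minimal Python sketch and not part of the original module; the function name `to_slope_intercept` and the sample equation 3x + 2y = 12 are illustrative assumptions. It rewrites Ax + By = C as y = mx + b using exact fractions.

```python
from fractions import Fraction


def to_slope_intercept(a, b, c):
    """Rewrite a*x + b*y = c as y = m*x + k and return (m, k).

    Hypothetical helper for illustration only; it requires b != 0.
    """
    if b == 0:
        raise ValueError("b must be nonzero: a vertical line has no slope-intercept form")
    m = Fraction(-a, b)  # slope: move a*x to the right side, then divide by b
    k = Fraction(c, b)   # y-intercept: the constant term divided by b
    return m, k


# Sample equation (not one of the warm-up items): 3x + 2y = 12
m, k = to_slope_intercept(3, 2, 12)
print(f"y = {m}x + {k}")  # y = -3/2x + 6
```

Working with `Fraction` instead of decimals keeps slopes such as -3/2 exact, which makes it easier to compare the output with an answer worked out by hand.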

Let’s Elaborate

Regression analysis is a statistical technique used for determining the
functional form of the relationship between two or more variables, where one
variable is called the dependent variable (response variable Y) and the other
is called the independent variable (concomitant variable X). An objective of
regression analysis is to predict or estimate the value of the response
variable given the value of the independent variable. In other words, regression
analysis is concerned with the problem of estimation and forecasting.

In algebra, we learned that relationships are described by using graphs. If we
are given the value of X, we can compute and predict the value of Y. But in
statistics, relationships are not that easy to predict. For instance, the height of a
person has an influence on his or her weight. However, there are other factors that
affect the weight of a person such as age, sex, body structure, environmental
factors, and genetic issues.

Simple linear regression describes the relationship between the dependent
variable Y and the independent variable X using a linear equation known as the
simple linear regression equation.

How do we distinguish independent variables from dependent variables?

An independent variable is a variable that represents a quantity that is being
manipulated in an experiment.
x is often the variable used to represent the independent variable in an
equation.
A dependent variable is a variable that represents a quantity whose value depends
on how the independent variable is manipulated.
y is often the variable used to represent the dependent variable in an
equation.

Let us try this. Identify the independent variable and dependent variable.

Example 1
You are doing chores to earn your allowance. For each chore you do, you
earn Php150.

Solution

Independent Variable (x) – the number of chores you do, because this is the
variable you have control over.
Dependent Variable (y) – the amount of money you earn, because the amount
of money you earn depends on how many chores you do.
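
The chores example can be written as a small function whose input is the independent variable and whose output is the dependent variable. This is only an illustrative Python sketch; the names `RATE_PER_CHORE` and `allowance` are assumptions introduced here, not part of the module.

```python
RATE_PER_CHORE = 150  # Php earned per chore, taken from the example


def allowance(chores):
    """Return the amount earned (dependent variable, y)
    for a given number of chores (independent variable, x)."""
    return RATE_PER_CHORE * chores


# You choose the number of chores; the amount earned follows from it.
for chores in range(4):
    print(chores, allowance(chores))
```

You decide the argument `chores`, while the returned amount is determined by it, which mirrors the roles of x and y.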

Tip: If you have trouble figuring out which of your variables is the independent one
and which is the dependent one, try inserting the variables into the following
sentence:

“(Independent Variable) causes a change in (Dependent Variable), and it is not
possible that (Dependent Variable) could cause a change in (Independent Variable).”

Taking the example above:

“The amount of money you earn depends on the number of chores you do.”

How do we find a Linear Regression Equation?

One way of describing a relationship is by using scatterplots. When a
correlation coefficient shows that the data are likely to be able to predict future
outcomes and a scatterplot of the data appears to form a straight line, you can use
simple linear regression to find a predictive function. If you recall from elementary
algebra, the equation for a line is y = mx + b. In simple linear regression, the
equation is Y′ = bX + a, where b is the slope and a is the y-intercept.

• The slope of a line is the change in Y over the change in X. For example, a
slope of 5/2 means that as the x-value increases (moves right) by 2 units, the
y-value moves up by 5 units on average.

• The y-intercept is the value on the y-axis where the line crosses. For
example, in the equation y = 5x − 6, the line crosses the y-axis at the value
b = −6. The coordinates of this point are (0, −6); when a line crosses the
y-axis, the x-value is always 0.
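
The two ideas above can be checked numerically. The following Python sketch is an illustration added here, with the helper names `slope` and `line` as assumptions. It uses the line y = 5x − 6 from the text and the rise-over-run reading of a slope of 5/2.

```python
from fractions import Fraction


def slope(p1, p2):
    """Change in y divided by change in x between two points."""
    (x1, y1), (x2, y2) = p1, p2
    return Fraction(y2 - y1, x2 - x1)


def line(x):
    """The example line from the text: y = 5x - 6."""
    return 5 * x - 6


# The y-intercept is the value of y when x = 0.
print(line(0))                             # -6

# Any two points on the line give the same slope.
print(slope((0, line(0)), (2, line(2))))   # 5

# Moving right by 2 units and up by 5 units is a slope of 5/2.
print(slope((0, 0), (2, 5)))               # 5/2
```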

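Always calculate the slope before the y-intercept when you use the form
a = Ȳ − bX̄ (the mean of Y minus the slope times the mean of X), because that
formula contains the slope. The summation formula for a used in the example
below does not contain the slope, so there the order does not matter.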

Note: If you are taking Advanced Placement Statistics, you may see the
equation written as Y′ = b0 + b1X, which is the same thing (you are just using the
symbols b0 and b1 instead of a and b).

Example 1

Given the data below, find the slope and y-intercept.

Student    Number of Missed Quizzes    Number of Absences
1          0                           1
2          1                           1
3          1                           1
4          5                           6
5          2                           3
6          3                           3
7          4                           5
8          3                           4
9          4                           4
10         1                           1

Steps and Solution

1. Identify the dependent and independent variables.
   In this case, X is the number of absences and Y is the number of quizzes
   that the student missed.

2. Find the values of a and b.

   X     Y     X²    Y²    XY
   1     0     1     0     0
   1     1     1     1     1
   1     1     1     1     1
   6     5     36    25    30
   3     2     9     4     6
   3     3     9     9     9
   5     4     25    16    20
   4     3     16    9     12
   4     4     16    16    16
   1     1     1     1     1

   ∑X = 29    ∑Y = 24    ∑X² = 115    ∑Y² = 82    ∑XY = 96

   \[ a = \frac{(\sum Y)(\sum X^2) - (\sum X)(\sum XY)}{n(\sum X^2) - (\sum X)^2} \]

   \[ a = \frac{(24)(115) - (29)(96)}{10(115) - 29^2} \]

   \[ a = -0.08 \]

   \[ b = \frac{n(\sum XY) - (\sum X)(\sum Y)}{n(\sum X^2) - (\sum X)^2} \]

   \[ b = \frac{10(96) - (29)(24)}{10(115) - 29^2} \]

   \[ b = 0.85 \]

3. Form the regression equation.
   Y′ = 0.85X − 0.08
   Slope = 0.85 and y-intercept = −0.08

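4. Interpret the calculated slope and y-intercept.
   The slope of 0.85 is positive, so a student who has more absences tends to
   miss more quizzes: each additional absence raises the predicted number of
   missed quizzes by about 0.85. The y-intercept of −0.08 is the predicted
   number of missed quizzes for a student with no absences, which is
   practically zero.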
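
The summation formulas for a and b used in this example can be sketched in Python as a check on the hand computation. The function name `regression_line` and the variable names are assumptions made for illustration; the data are the ten (absences, missed quizzes) pairs from the table.

```python
def regression_line(xs, ys):
    """Return (a, b) for Y' = bX + a using the summation formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all X values are equal, so the slope is undefined")

    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator  # y-intercept
    b = (n * sum_xy - sum_x * sum_y) / denominator       # slope
    return a, b


absences = [1, 1, 1, 6, 3, 3, 5, 4, 4, 1]        # X
missed_quizzes = [0, 1, 1, 5, 2, 3, 4, 3, 4, 1]  # Y

a, b = regression_line(absences, missed_quizzes)
print(f"Y' = {b:.2f}X + ({a:.2f})")  # Y' = 0.85X + (-0.08)
```

Both formulas share the same denominator, so it is computed once. Before rounding, a = −24/309 and b = 264/309, which agree with the values −0.08 and 0.85 obtained by hand.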

Note: To make it easier, you may use Microsoft Excel or a scientific calculator
to find the regression equation, or you can view videos to understand better by
using the link below.
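
The Y² column in the worked example is not needed for a or b. One place where ∑Y² is used is the correlation coefficient r reviewed in the pre-assessment. The module does not state a formula for r, so the sketch below assumes the standard Pearson formula; the function name `correlation` is also an assumption.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient r from the column sums
    (standard formula, assumed here; it is not given in the module)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


absences = [1, 1, 1, 6, 3, 3, 5, 4, 4, 1]        # X
missed_quizzes = [0, 1, 1, 5, 2, 3, 4, 3, 4, 1]  # Y

r = correlation(absences, missed_quizzes)
print(round(r, 2))  # 0.96
```

A value of r close to 1 indicates a strong positive linear relationship, which supports fitting a regression line to these data.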

Let’s Dig In

Given each statement, identify the independent and dependent variables.

1. A study finds that reading levels are affected by whether a person is born in a
foreign country.

2. You are studying how tutoring affects NAT score.

Let’s Remember

The following are the key points of the topic:


➢ The independent variable is what you change, while the dependent variable is
what changes as a result. You can also think of the independent variable as the
cause and the dependent variable as the effect.
➢ A regression line is the line that best fits the trend of a given set of data.
➢ We write the regression line in the form Y′ = bX + a.
➢ The regression line is used to describe the relationship of a dependent
variable (Y) with one or more independent variables (X).
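
Once a regression line is known, it can be used to estimate Y for a given X. The sketch below is illustrative only: it reuses the rounded equation Y′ = 0.85X − 0.08 from the absences example, and the names `A`, `B`, and `predict_missed_quizzes` are assumptions.

```python
A = -0.08  # y-intercept from the worked example (rounded)
B = 0.85   # slope from the worked example (rounded)


def predict_missed_quizzes(absences):
    """Estimate Y' = bX + a for a given number of absences."""
    return B * absences + A


for x in (0, 2, 5):
    print(x, round(predict_missed_quizzes(x), 2))
```

The estimates are averages suggested by the fitted line, not exact counts, so a prediction such as 1.62 missed quizzes for 2 absences should be read as "between one and two".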

Let’s Apply

Find the slope and y-intercept of the given data, and then interpret the
calculated slope and y-intercept.

Expenditures (X)   5    6    7    7.5   6.5   5.5   4.8   6.3   7.8   8
Sales (Y)          10   11   12   14    13    11    15    16    15    10

Complete the table.

Steps and Solution

1. Identify the dependent and independent variables.

2. Find the slope and the y-intercept.

   X     Y     X²    Y²    XY
   5     10
   6     11
   7     12
   7.5   14
   6.5   13
   5.5   11
   4.8   15
   6.3   16
   7.8   15
   8     10

   ∑X =      ∑Y =      ∑X² =      ∑Y² =      ∑XY =

   \[ a = \frac{(\sum Y)(\sum X^2) - (\sum X)(\sum XY)}{n(\sum X^2) - (\sum X)^2} \]

   \[ b = \frac{n(\sum XY) - (\sum X)(\sum Y)}{n(\sum X^2) - (\sum X)^2} \]

3. Form the regression equation.
   Y′ = bX + a

4. Interpret the calculated slope and y-intercept.

Let’s Evaluate

Directions: Choose the letter of the best answer. Write the chosen letter on a
separate sheet of paper.
1. Which of the following indicates a fairly strong relationship between two
variables?
A. Correlation coefficient equals to 0.9
B. The p-value for the null hypothesis Beta coefficient = 0 is 0.0001
C. The t-statistic for the null hypothesis Beta coefficient = 0 is 30
D. None of these

2. To test the linear relationship between continuous variables y (dependent) and
x (independent), which of the following plots is appropriate?
A. Scatter plot
B. Bar chart
C. Histogram
D. Line graph

3. Which of the following methods is used for predicting a continuous dependent
variable?

1. Linear Regression 2. Logistic Regression

A. 1 and 2
B. 1 only
C. 2 only
D. None of these

4. How many coefficients do you need to estimate in a simple linear regression
model (one independent variable)?

A. 1 B. 2 C. 3 D. none of these

5. If two variables are correlated, is it necessary that they have a linear
relationship?

A. Yes B. No C. Maybe D. None of these


7. Which ordered pair of variables corresponds to the ordered pair (dependent,
independent)?

I. (job performance, academic performance)
II. (academic performance, intelligence)
A. I only B. II only C. both I and II D. neither

8. You are buying boxes of pencils at National Book Store. Each box of pencils costs
Php45. Which of the following statements are true?
A. The dependent variable is the number of boxes of pencils you buy.
B. The independent variable is the number of boxes of pencils you buy.
C. The dependent variable is the amount of money you spend on the
pencils.
D. The independent variable is the amount of money you spend on the
pencils.

For numbers 9-10, identify the dependent variable.

9. You are conducting an experiment to see if exposure to more sunlight increases
happiness levels for workers who typically spend the entire day in windowless
offices.
10. An experiment in a climate-controlled greenhouse concludes that water level,
fertilizer, and nutrient level in soil affect how tall plants grow. Plants grew an
average of 12″ taller if treated with optimal resources.
For numbers 11-15, identify the variables, calculate the slope and y-intercept,
and then interpret the result.

Lisa and her friends are going on vacation to Boracay, which is 430 kilometers
away from Manila. To pass the time, she recorded the hours spent driving and
the distance traveled.

Hours Traveled    Distance Traveled
1                 85
2                 100
3                 125
4                 150
5                 200
6                 230
7                 280

11. Identify the independent and dependent variables.

12. What is the slope?

13. What is y-intercept?


14. What is the regression equation?

15. Interpret the result.

Development Team of the Module

Writers: DIANA JANE A. ALFARO

Editors:
Content: LAMBERT QUESADA
Language: AILEEN GENOSO

Reviewers: MRS. MIRASOL I. RONGAVILLA
MR. ARMANDO V. EROLIN

Illustrators:
Layout Artist:
Management Team: DR. MARGARITO B. MATERUM, SDS
DR. GEORGE P. TIZON, SGOD-Chief
DR. ELLERY G. QUINTIA, CID Chief
MRS. MIRASOL I. RONGAVILLA, EPS - MATH
DR. DAISY L. MATAAC, EPS – LRMS/ ALS

For inquiries, please write or call:

Schools Division of Taguig City and Pateros, Upper Bicutan, Taguig City

Telefax: 8384251

Email Address: sdo.tapat@deped.gov.ph
