Module 2 Part 1 - Types of Forecasting Models and Simple Linear Regression


TYPES OF FORECASTING MODELS AND SIMPLE LINEAR REGRESSION ANALYSIS
Learning Objectives:
At the end of the lesson, the students should be able to:
• Distinguish the various types of forecasting models;
• Differentiate between correlation analysis and regression analysis;
• Interpret a regression equation and use it to make predictions;
• Interpret the meaning of the regression coefficients b0 and b1;
• Explain the least squares method and interpret R2;
• Interpret the regression results in Excel; and
• Evaluate the assumptions of regression analysis and know what to do
if the assumptions are violated.
Types of Forecasting Models (Render et al., 2016)

1. Qualitative Models
- based on judgmental or subjective factors (used when historical
data are lacking; rely on expert opinion and individual experience)

- example: introducing a new product (forecasting demand
is difficult due to the lack of any historical sales data)
Four different qualitative forecasting methods

▪ Delphi method – this iterative group process allows
experts in different places to make forecasts; it involves
three types of participants (decision makers – usually
a group of 5 to 10 experts; staff personnel – who assist the
decision makers; and respondents – a group of people
whose judgments are valued and being sought).
▪ Jury of executive opinion – a method which takes the
opinions of a small group of high-level managers, often in
combination with statistical models, and results in a group
estimate of demand.

▪ Sales force composite – an approach where each
salesperson estimates what sales will be in his or her
region; these forecasts are reviewed and combined at the
district and national levels to reach an overall forecast.
▪ Consumer market survey – a method that solicits input
from customers or potential customers about their future
purchasing plans; helps not only in preparing a forecast but
also in improving product design and planning for new
products.

2. Causal Models
- are quantitative forecasting models wherein the variable to
be forecast is influenced by or correlated with other
variables included in the model.

- include regression models and other more complex models

- example: daily sales of bottled water might depend on the
average temperature, the average humidity, and so on.

3. Time-Series Models
- are also quantitative forecasting models/techniques that
attempt to predict the future values of a variable by using
only historical data on that one variable.

- these models are extrapolations of past values of that series

- example: using the past weekly sales for lawn mowers in
making the forecast for future sales
Forecasting Models (Render et al., 2016)

Forecasting Techniques

Qualitative Models          Time-Series Methods      Causal Methods
Delphi Method               Moving Averages          Regression Analysis
Jury of Executive Opinion   Exponential Smoothing    Multiple Regression
Sales Force Composite       Trend Projections
Consumer Market Survey      Decomposition

Regression Analysis (Render et al., 2016)
▪ A forecasting technique with generally two purposes:
1) to understand the relationship between two variables;
2) to predict the value of one based on the other.

▪ Its applicability is virtually limitless (e.g., level of education and
income, price of a house and the square footage, advertising
expenditures and sales, etc.).

▪ The variable predicted is called the dependent variable or
response variable. The value of this response variable is said
to be dependent upon the value of an independent variable
(explanatory variable or predictor variable).
Correlation Analysis
▪ The analysis of bivariate data typically begins with a scatter plot
that displays each observed pair of data (x, y) as a dot on the x-y
plane.
▪ Correlation analysis is used to measure the strength of the linear
relationship between two variables.
▪ Correlation is only concerned with strength of the relationship.
▪ No causal effect is implied with correlation.
▪ A sample correlation coefficient takes values in the interval
[-1, 1]. Pearson's r measures the degree of linearity in the
relationship between two random variables X and Y; Spearman's rho
measures the strength of a monotonic relationship based on ranks.
Statistical Analysis with Software Applications, Mc Graw Hill
[Figure: scatter plots showing various correlation coefficient values]

In Excel, use these functions to get the value of the
correlation coefficient.
1. =CORREL(array1, array2)
2. =PEARSON(array1, array2)
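
For readers working outside Excel, the same coefficient can be computed directly from its definition. The sketch below is only illustrative: the function name `pearson_r` and the small data set are assumptions made for this example, not part of the module.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squared deviations about the means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical illustrative data (not taken from the module)
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

r = pearson_r(x_values, y_values)
print(round(r, 4))  # 0.7746
```

A value near +1 or -1 indicates a strong linear relationship; a value near 0 indicates a weak one.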




Regression Analysis
▪ The hypothesized relationship may be linear, quadratic or some other
form.
▪ The next slide presents some of the possible patterns.
▪ The module will focus on the simple linear model commonly referred
to as a simple regression equation.



Regression Analysis: Types of relationships

Source: Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Simple Linear Regression Model

▪ Only one independent variable, X
▪ The relationship between X and Y is described by a
linear function.
▪ The changes in Y are related to changes in X.


The population regression model



Simple Linear Regression Equation



Simple Linear Regression Equation (Render et al., 2016)

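The following formulas can be used to compute the
slope (b1) and the intercept (b0):

\[ \bar{X} = \frac{\sum X}{n} \quad \text{(average, or mean, of X values)} \]

\[ \bar{Y} = \frac{\sum Y}{n} \quad \text{(average, or mean, of Y values)} \]

\[ b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} \]

\[ b_0 = \bar{Y} - b_1 \bar{X} \]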
Interpreting an Estimated Regression Equation
The slope tells us how much, and in what direction, the dependent or response
variable will change for each one unit increase in the predictor variable. On the
other hand, the intercept is meaningful only if the predictor variable would
reasonably have a value equal to zero.
Equation: Sales = 268 + 7.37 Ads (both in P millions)

Interpretation:
Each extra P1 million of advertising will generate P7.37 million of sales on average.
The firm would average P268 million of sales with zero advertising. However, the
intercept may not be meaningful because Ads = 0 may be outside the range of
observed data.


Interpreting an Estimated Regression Equation
Other examples:



Prediction Using Regression
One of the main uses of regression is to make predictions. Once we have a fitted
regression equation that shows the estimated relationship between X and Y, we can
plug in any value of X (within the range of our sample x values) to obtain the
prediction for Y.


Example 1: Triple A Construction Company renovates old homes in
Albany. Over time, the company has found that its dollar
volume of renovation work is dependent on the Albany
area payroll. The figures for Triple A’s revenues and the
amount of money earned by wage earners in Albany for
the past 6 years are presented in the table below.

LOCAL PAYROLL ($100,000,000s)    TRIPLE A’S SALES ($100,000s)
3                                6
4                                8
6                                9
4                                5
2                                4.5
5                                9.5

Solution: To investigate the relationship between variables, it
is helpful to look at the scatter diagram or scatter
plot below.

Note: The graph indicates that higher values for the local payroll
tend to go with higher sales for the company. It is not a
perfect relationship because not all the points lie on a
straight line, but there is a relationship.
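Regression Calculations for Triple A Construction:

X      Y      \( (X - \bar{X})^2 \)    \( (X - \bar{X})(Y - \bar{Y}) \)
3      6      1                        1
4      8      0                        0
6      9      4                        4
4      5      0                        0
2      4.5    4                        5
5      9.5    1                        2.5

\[ \sum X = 24 \qquad \sum Y = 42 \qquad \sum (X - \bar{X})^2 = 10 \qquad \sum (X - \bar{X})(Y - \bar{Y}) = 12.5 \]

\[ \bar{X} = \frac{24}{6} = 4 \qquad \bar{Y} = \frac{42}{6} = 7 \]
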
Computing the slope and the intercept of the
regression equation, we have:

\[ \bar{X} = \frac{\sum X}{n} = \frac{24}{6} = 4 \]

\[ \bar{Y} = \frac{\sum Y}{n} = \frac{42}{6} = 7 \]

\[ b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{12.5}{10} = 1.25 \]

\[ b_0 = \bar{Y} - b_1 \bar{X} = 7 - (1.25)(4) = 2 \]

Note: Each time the payroll (represented by X) increases by $100 million,
we expect the sales to increase by $125,000, since b1 = 1.25 ($100,000s).

The estimated regression equation therefore is:
\( \hat{Y} = 2 + 1.25X \) or Sales = 2 + 1.25 (Payroll)
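
The hand calculation above can be checked with a few lines of Python. This is a minimal sketch using the Example 1 data; the helper names `fit_simple_linear` and `predict` are assumptions introduced for illustration, and the range check reflects the advice to predict only within the sample x values.

```python
def fit_simple_linear(x, y):
    """Return (b0, b1) from the least squares formulas for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx            # slope
    b0 = y_bar - b1 * x_bar     # intercept
    return b0, b1

payroll = [3, 4, 6, 4, 2, 5]       # X, in $100,000,000s
sales = [6, 8, 9, 5, 4.5, 9.5]     # Y, in $100,000s

b0, b1 = fit_simple_linear(payroll, sales)
print(b0, b1)  # 2.0 1.25

def predict(x_new):
    """Predict sales for a payroll value inside the observed range."""
    if not min(payroll) <= x_new <= max(payroll):
        raise ValueError("x_new is outside the range of the sample x values")
    return b0 + b1 * x_new

print(predict(6))  # 9.5, that is, $950,000 in sales
```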
Assumptions of Regression (L.I.N.E)
▪ Linearity – the relationship between X and Y is linear.
▪ Independence of errors – the error values (differences
between observed and estimated values) are statistically
independent.
▪ Normality of errors – the error values are normally
distributed for any given value of X.
▪ Equal variance or homoskedasticity – the probability
distribution of the errors has constant variance.


Example 1 Excel output
For the scatterplot:
1. Highlight X array and Y array.
2. Choose Insert.
3. Choose Scatter among the chart types available.
4. Edit the axis labels.
Example 1 Excel output
For regression:
1. Go to Data, choose Data Analysis.
2. Choose Regression among the Data Analysis Tools.
3. Fill in the necessary fields.
4. Click OK.
Example 1 Excel output

The estimated regression equation is:
Sales = 2 + 1.25 * (Payroll)

Intercept = 2
Assessing Fit: Coefficient of determination, R2



Example 1 Coefficient of determination

\[ r^2 = \frac{SSR}{SST} = \frac{15.625}{22.5} = 0.6944 \]

69.44% of the variation in sales is explained by the variation in
payroll.
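
The sums of squares behind this ratio can be reproduced from the Example 1 data. The sketch below assumes the fitted coefficients b0 = 2 and b1 = 1.25 and uses the standard decomposition SST = SSR + SSE; the variable names are chosen only for illustration.

```python
payroll = [3, 4, 6, 4, 2, 5]       # X, in $100,000,000s
sales = [6, 8, 9, 5, 4.5, 9.5]     # Y, in $100,000s
b0, b1 = 2.0, 1.25                 # fitted coefficients from Example 1

y_bar = sum(sales) / len(sales)
predicted = [b0 + b1 * x for x in payroll]

# Total, regression (explained), and error (unexplained) sums of squares
sst = sum((y - y_bar) ** 2 for y in sales)
ssr = sum((y_hat - y_bar) ** 2 for y_hat in predicted)
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(sales, predicted))

r_squared = ssr / sst
print(sst, ssr, sse)          # 22.5 15.625 6.875
print(round(r_squared, 4))    # 0.6944
```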
Standard Error of Estimate
▪ The standard deviation of the variation of observations
around the regression line is estimated by:

\[ S_{YX} = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - 2}} \]


Example 1 Standard error of estimate

\( S_{YX} = 1.311 \)
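
As a check on this value, the standard error of estimate can be computed as the square root of the error sum of squares divided by n - 2. The sketch assumes the Example 1 data and fitted coefficients; the variable names are illustrative.

```python
import math

payroll = [3, 4, 6, 4, 2, 5]       # X, in $100,000,000s
sales = [6, 8, 9, 5, 4.5, 9.5]     # Y, in $100,000s
b0, b1 = 2.0, 1.25                 # fitted coefficients from Example 1

n = len(payroll)
# Error sum of squares: squared gaps between observed and fitted sales
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(payroll, sales))
s_yx = math.sqrt(sse / (n - 2))    # n - 2 degrees of freedom

print(round(s_yx, 3))  # 1.311
```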
Comparing Standard Errors



Inferences about the slope using the t-test
▪ The t-test for a population slope is used to determine if there
is a linear relationship between X and Y.
▪ Null and alternative hypotheses
H0: β1 = 0 (no linear relationship)
H1: β1 ≠ 0 (linear relationship does exist)



Example 1 Inferences about the slope

The estimated regression equation is:
Sales = 2 + 1.25 * (Payroll)

The slope of this model is 1.25. Is there a relationship between the
payroll and the sales?
Example 1 Excel output

\[ t = \frac{b_1 - 0}{S_{b_1}} = \frac{1.25 - 0}{0.41458} = 3.0151 \]

df = n – 2 = 6 – 2 = 4

T.DIST.2T(3.0151,4) = .039
Example 1 Inference about slope
T.DIST.2T(3.0151,4) = .039 = p-value

• H0: β1 = 0
• H1: β1 ≠ 0

Reject the null hypothesis since p < α (.039 < .05).

There is sufficient evidence to conclude that there is a statistically
significant relationship between the payroll and the sales.
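
The t statistic and its two-tailed p-value can also be reproduced without Excel. The sketch below uses only the Python standard library and integrates the Student's t density numerically with Simpson's rule, which is an approach chosen for this illustration and not necessarily how T.DIST.2T works internally. The function names are assumptions.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(t_stat, df, steps=10_000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3       # P(0 <= T <= |t|)
    return 1 - 2 * central_area        # area in both tails

b1 = 1.25          # estimated slope from Example 1
s_b1 = 0.41458     # standard error of the slope from the Excel output
df = 6 - 2         # n - 2

t_stat = (b1 - 0) / s_b1
p_value = two_tailed_p(t_stat, df)
print(round(t_stat, 4), round(p_value, 3))  # 3.0151 0.039
```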
Checking the assumptions by examining the residuals

Residual Analysis for Linearity:
Plot X against residuals

Aside from visually examining the scatter plot of the IV and DV to assess linearity, the
scatter plot of the IV versus the residuals may also be examined. The plots at the left show
curved patterns, which indicate that the relationship in the data is not linear. Another model
should be used.
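
The points for such a residual plot are simply each X value paired with its residual (observed Y minus fitted Y). A minimal sketch for the Example 1 data, assuming the fitted equation Sales = 2 + 1.25 (Payroll), is shown below; the printed pairs are what a charting tool would plot.

```python
payroll = [3, 4, 6, 4, 2, 5]       # X, in $100,000,000s
sales = [6, 8, 9, 5, 4.5, 9.5]     # Y, in $100,000s
b0, b1 = 2.0, 1.25                 # fitted coefficients from Example 1

# Residual = observed value minus value estimated by the regression line
residuals = [y - (b0 + b1 * x) for x, y in zip(payroll, sales)]

# Print (X, residual) pairs in increasing X, as they would appear on the plot
for x, e in sorted(zip(payroll, residuals)):
    print(x, e)

# Least squares residuals sum to zero when the model has an intercept
print(sum(residuals))  # 0.0
```

Points scattered randomly around zero with no curve support the linearity assumption.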
Checking the assumptions by examining the residuals

Residual Analysis for Equal Variance:
Plot X against residuals


Checking the assumptions by examining the residuals

Residual Analysis for Equal Variance:
Plot predicted values against residuals


Checking the assumptions by examining the residuals
Residual Analysis for Normality:
1. Examine the Stem-and-Leaf Display of the Residuals
2. Examine the Box-and-Whisker Plot of the Residuals
3. Examine the Histogram of the Residuals
4. Construct a normal probability plot.
5. Construct a Q-Q plot.



Checking the assumptions by examining the residuals

If residuals are normal, the probability plot
and the Q-Q plot should be approximately
linear.


Checking the assumptions by examining the residuals

What can we do when residuals are not normal?

1. Consider trimming outliers – but only if they clearly are
mistakes.
2. Can you increase the sample size? If so, it will help assure
asymptotic normality of the estimates.
3. You could try a logarithmic transformation of the variables.
However, this is a new model specification with a different
interpretation of coefficients.
4. You could do nothing and simply be aware of the problem.
Checking the assumptions by examining the residuals

Residual Analysis for Independence of Errors:
Plot the time-series X against the residuals

Independence of errors means that the
distribution of errors is random and is not
influenced by or correlated with the errors
in prior observations.


Checking the assumptions by examining the residuals

Residual Analysis for Independence of Errors:
Plot the time-series X against the residuals

Independence can be checked
when we know the order in which the
observations were made. The opposite of
independence is autocorrelation.


Measuring Autocorrelation
▪ Another way of checking for independence of errors is by
testing the significance of the Durbin-Watson statistic.
▪ The Durbin-Watson statistic is used when data are collected
over time to detect the presence of autocorrelation.
▪ Autocorrelation exists if residuals in one time period are
related to residuals in another period.
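
The slides do not give the formula, so the sketch below uses the standard definition of the Durbin-Watson statistic: the sum of squared differences between successive residuals divided by the sum of squared residuals. It is applied to the Example 1 residuals under the assumption that the six rows are listed in chronological order, which the source does not state.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Example 1 residuals, ASSUMING the table rows are in time order
residuals = [0.25, 1.0, -0.5, -2.0, 0.0, 1.25]

dw = durbin_watson(residuals)
print(round(dw, 3))  # 1.545
```

The statistic ranges from 0 to 4. Values near 2 suggest no first-order autocorrelation, values well below 2 suggest positive autocorrelation, and values well above 2 suggest negative autocorrelation. A formal test compares the statistic with tabulated critical bounds.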


Measuring Autocorrelation
▪ The presence of autocorrelation of errors (or residuals)
violates the regression assumption that residuals are
statistically independent.



Example 1 Excel output for assessing assumptions
[Figure: Residual Plot, residuals vs. Payroll (X) in $100,000,000s]
The residual plot shows that the assumptions of
linearity and constant variance are satisfied.

[Figure: Normal Probability Plot, Sales (Y) in $100,000s vs. Sample Percentile]
The assumption of normality of residuals
is satisfied since the points follow a straight line.


Strategies when performing regression analysis
▪ Start with a scatter plot of Y against X to observe a possible
relationship.
▪ Perform residual analysis to check the assumptions.
▪ Plot the residuals vs X to check for violations of
assumptions such as equal variance.
▪ Use a histogram, stem and leaf display, box and whisker
plot or normal probability plot of the residuals to uncover
possible non-normality.


Strategies when performing regression analysis
▪ If there is any violation of any assumption, use alternative
methods or models.
▪ If there is no evidence of assumption violation, then test for
the significance of the regression coefficients.
▪ Avoid making predictions or forecasts outside the relevant
range.



Example 2.
What is the relationship between the number of hours a student
studies and his or her exam score? Shown in the table are the
data for 10 students.
Student    Hours, X    Score, Y
1          1           53
2          5           74
3          7           59
4          8           43
5          10          56
6          11          84
7          14          96
8          15          69
9          15          84
10         19          83


Example 2 Excel output
For the scatterplot:
1. Highlight X array and Y array.
2. Choose Insert.
3. Choose Scatter among the chart types available.
4. Edit the axis labels.


Example 2 Excel output
For regression:
1. Go to Data, choose Data Analysis.
2. Choose Regression among the Data Analysis Tools.
3. Fill in the necessary fields.
4. Click OK.


Example 2 Excel output



Example 2 Excel output

Intercept = 49.477



Example 2 Interpretation of coefficients



Example 2 Coefficient of determination

39.41% of the variation in scores
is explained by the variation in
study hours.


Example 2 Standard error of estimate



Example 2 Inferences about the slope



Example 2 Excel output

\[ t = \frac{b_1 - 0}{S_{b_1}} = \frac{1.9641 - 0}{0.8610} = 2.281221 \]

T.DIST.2T (2.281221,8) = 0.052


Example 2 Inference about slope
T.DIST.2T (2.281221,8) = 0.052 = p-value

• H0: β1 = 0
• H1: β1 ≠ 0

Do not reject the null hypothesis since p > α (0.052 > 0.05).

There is insufficient evidence to conclude that there is a linear
relationship between study hours and exam scores.
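
The whole Example 2 analysis can be retraced from the raw data. The sketch below recomputes the coefficients, r2, and the t statistic, then reaches the decision with the critical-value approach instead of the p-value. The significance level of 0.05 and the table value 2.306 (two-tailed, df = 8) are assumptions stated here because the slides do not quote them.

```python
import math

hours = [1, 5, 7, 8, 10, 11, 14, 15, 15, 19]       # X
scores = [53, 74, 59, 43, 56, 84, 96, 69, 84, 83]  # Y

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(scores) / n
s_xx = sum((x - x_bar) ** 2 for x in hours)
s_yy = sum((y - y_bar) ** 2 for y in scores)       # equals SST
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))

b1 = s_xy / s_xx                    # slope
b0 = y_bar - b1 * x_bar             # intercept
sse = s_yy - b1 * s_xy              # error sum of squares
r_squared = 1 - sse / s_yy
s_yx = math.sqrt(sse / (n - 2))     # standard error of estimate
s_b1 = s_yx / math.sqrt(s_xx)       # standard error of the slope
t_stat = b1 / s_b1

T_CRITICAL = 2.306  # assumed: two-tailed t table value, alpha = 0.05, df = 8
decision = "reject H0" if abs(t_stat) > T_CRITICAL else "do not reject H0"

print(round(b0, 3), round(b1, 4), round(r_squared, 4), round(t_stat, 4))
# 49.477 1.9641 0.3941 2.2812
print(decision)  # do not reject H0
```

Both approaches agree: the t statistic falls just short of the critical value, which matches a p-value slightly above 0.05.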


Example 2 Excel output for assessing assumptions
The residual plot shows
that the assumptions of
linearity and constant
variance are satisfied.

The assumption of
normality of residuals is
satisfied since the points
follow a straight line.



Exercise:

Source: Render, B., & Stair Jr, R. M. (2016). Quantitative Analysis for Management, 12e. Pearson Education India
References

Render, B., & Stair Jr, R. M. (2016). Quantitative Analysis for
Management, 12e. Pearson Education India.

Statistical Analysis with Software Applications, Mc Graw Hill.
