
MANSCI – MODULE 2

Professor: Prof. Mary Jane A. Castilla


Transcribed by: Tyrone Villena

TYPES OF FORECASTING MODELS AND SIMPLE LINEAR REGRESSION ANALYSIS

1. Qualitative Models
- Based on judgmental or subjective factors (they lack historical data and rely on expert opinion and individual experience)

e.g., introducing a new product (forecasting demand is difficult due to the lack of any historical sales data)

Four different qualitative forecasting methods:

- Delphi method – this iterative group process allows experts in different places to make forecasts; it involves three different types of participants (decision makers – usually consist of 5 to 10 experts; staff personnel – assist the decision makers; and respondents – a group of people whose judgment is valued and being sought).

- Jury of executive opinion – a method which takes the opinions of a small group of high-level managers, often in combination with statistical models, and results in a group estimate of demand.

- Sales force composite – an approach where each salesperson estimates what sales will be in his or her region; these forecasts are reviewed and combined at the district and national levels to reach an overall forecast.

- Consumer market survey – a method that solicits input from customers or potential customers about their future purchasing plans; it helps not only in preparing a forecast but also in improving product design and planning for new products.

2. Causal Models
- Are quantitative forecasting models wherein the variable to be forecasted is influenced by or correlated with other variables included in the model.
- Include regression models and other more complex models.

e.g., daily sales of bottled water might depend on the average temperature, the average humidity, and so on.

3. Time-Series Models
- Are also quantitative forecasting models/techniques that attempt to predict the future values of a variable by using only historical data on that one variable.
- These models are extrapolations of past values of that series (a small moving-average sketch follows the summary list below).

e.g., using the past weekly sales for lawn mowers in making the forecast for future sales

Forecasting Techniques

1. Qualitative Models
- Delphi Method
- Jury of Executive Opinion
- Sales Force Composite

2. Time-Series Methods
- Moving Averages
- Exponential Smoothing
- Trend Projections
- Decomposition

3. Causal Methods
- Regression Analysis
- Multiple Regression
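
To make the time-series idea above concrete, here is a minimal Python sketch of a simple moving-average forecast (not part of the original lecture; the weekly sales figures are invented for illustration):

# Simple 3-period moving average: the forecast for the next period
# is the mean of the three most recent observations.
weekly_sales = [12, 15, 14, 16, 18, 17, 19]   # hypothetical lawn mower sales

def moving_average_forecast(history, periods=3):
    # Forecast the next value as the average of the last `periods` values.
    recent = history[-periods:]
    return sum(recent) / len(recent)

print(moving_average_forecast(weekly_sales))  # (18 + 17 + 19) / 3 = 18.0

Exponential smoothing and trend projection follow the same pattern of extrapolating only the series' own history.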

Regression Analysis (Render et al., 2016)

- A forecasting technique with generally two purposes:
1. To understand the relationship between two variables;
2. To predict the value of one based on the other.

- Its applicability is virtually limitless (e.g., level of education and income, price of a house and its square footage, advertising expenditures and sales, etc.).
- The variable predicted is called the dependent variable or response variable. The value of this response variable is said to be dependent upon the value of an independent variable (also called the explanatory variable or predictor variable).

Correlation Analysis

- The analysis of bivariate data typically begins with a scatter plot that displays each observed pair of data (x, y) as a dot on the x-y plane.
- Correlation analysis is used to measure the strength of the linear relationship between two variables.
o Correlation is only concerned with the strength of the relationship.
o No causal effect is implied with correlation.
- The sample correlation coefficient (like Pearson's r and Spearman's rho) measures the degree of linearity in the relationship between two random variables X and Y, with values in the interval [-1, 1].
- Scatter plots showing various correlation coefficient values: [figure]
- In Excel, use these functions to get the value of the correlation coefficient:
1. =CORREL(array1, array2)
2. =PEARSON(array1, array2)
- Test for significant correlation using Student's t:
o The sample correlation coefficient r is an estimate of the population correlation coefficient ρ (Greek letter rho).
o There is no flat rule for a "high" correlation because sample size must be taken into consideration.
o To test the hypothesis H0: ρ = 0, the test statistic is

t_computed = r √[(n − 2) / (1 − r²)]

o After calculating this value, we can find its p-value by using Excel's function =T.DIST.2T(t, deg_freedom).
- The hypothesized relationship may be linear, quadratic, or some other form.
- The simple linear model is commonly referred to as the simple regression equation.
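
The Excel steps above can be mirrored in Python. The sketch below is only an illustration (the x and y values are hypothetical): it computes r (as =CORREL/=PEARSON would), the t statistic from the formula above, and the two-tailed p-value (as =T.DIST.2T would).

import math
from scipy import stats

# Hypothetical paired observations (e.g., average temperature vs. bottled water sales)
x = [30, 32, 34, 35, 37, 39, 40, 42]
y = [55, 58, 60, 66, 65, 70, 74, 78]
n = len(x)

# Sample correlation coefficient (same value Excel's =CORREL / =PEARSON returns)
r, _ = stats.pearsonr(x, y)

# Test H0: rho = 0 with t = r * sqrt((n - 2) / (1 - r^2)) on n - 2 degrees of freedom
t_computed = r * math.sqrt((n - 2) / (1 - r ** 2))

# Two-tailed p-value, equivalent to Excel's =T.DIST.2T(t, n - 2)
p_value = 2 * stats.t.sf(abs(t_computed), df=n - 2)

print(f"r = {r:.4f}, t = {t_computed:.4f}, p = {p_value:.4f}")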

Simple Linear Regression Model

- Only one independent variable, X
- The relationship between X and Y is described by a linear function
- The changes in Y are related to changes in X.

The population regression model:

Y = β0 + β1X + ε

Simple Linear Regression Equation

- The simple linear regression equation provides an estimate of the population regression line:

Ŷ = b0 + b1X

- The following formulas can be used to compute the slope (b1) and the intercept (b0):

X̄ = ΣX / n = average (mean) of X values

Ȳ = ΣY / n = average (mean) of Y values

b1 = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)²

b0 = Ȳ − b1X̄

Interpreting an Estimated Regression Equation

- The slope tells us how much, and in what direction, the dependent or response variable will change for each one-unit increase in the predictor variable. On the other hand, the intercept is meaningful only if the predictor variable could reasonably have a value equal to zero.

e.g., Sales = 268 + 7.37 Ads

Each extra P1 million of advertising will generate P7.37 million of sales on average. The firm would average P268 million of sales with zero advertising. However, the intercept may not be meaningful because Ads = 0 may be outside the range of observed data.
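
As a minimal sketch of the slope and intercept formulas above (the advertising and sales figures here are made up and are not the textbook example), the calculation and a within-range prediction look like this in Python:

# Least-squares slope (b1) and intercept (b0) from the formulas above
ads = [1, 2, 3, 4, 5, 6]                  # hypothetical advertising spend (P millions)
sales = [270, 285, 295, 300, 310, 320]    # hypothetical sales (P millions)

n = len(ads)
x_bar = sum(ads) / n        # mean of X
y_bar = sum(sales) / n      # mean of Y

b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(ads, sales)) / \
     sum((x - x_bar) ** 2 for x in ads)   # slope
b0 = y_bar - b1 * x_bar                   # intercept

# Prediction for an X value inside the observed range
x_new = 3.5
y_hat = b0 + b1 * x_new
print(f"Sales = {b0:.2f} + {b1:.2f} Ads; predicted sales at Ads = {x_new}: {y_hat:.2f}")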

Prediction Using Regression

- One of the main uses of regression is to make predictions. Once we have a fitted regression equation that shows the estimated relationship between X and Y, we can plug in any value of X (within the range of our sample X values) to obtain the prediction for Y.

Assumptions of Regression (L.I.N.E.)

- Linearity – the relationship between X and Y is linear.
- Independence of errors – the error values (differences between observed and estimated values) are statistically independent.
- Normality of error – the error values are normally distributed for any given value of X.
- Equal variance or homoskedasticity – the probability distribution of the errors has constant variance.

Assessing Fit: Coefficient of Determination, R²

- The coefficient of determination is the portion of the total variation in the dependent variable that is explained by the variation in the independent variable.
- It is also called r-squared and is obtained by:

r² = SSR / SST = (regression sum of squares) / (total sum of squares)

0 ≤ r² ≤ 1

SSR = Σ(Ŷ − Ȳ)²
SST = Σ(Y − Ȳ)²
SSE = Σ(Y − Ŷ)²

- Example interpretation: if R² = 0.6944, then 69.44% of the variation in [the dependent variable] is explained by the variation in [the independent variable].

Standard Error of Estimate

- The standard deviation of the variation of observations around the regression line is estimated by:

S_YX = √[SSE / (n − 2)] = √[Σ(Yi − Ŷi)² / (n − 2)]

Comparing Standard Errors

- S_YX is a measure of the variation of observed Y values from the regression line.
- The magnitude of S_YX should always be judged relative to the size of the Y values in the sample data.

Inferences about the slope using the t-test

- The t-test for a population slope is used to determine if there is a linear relationship between X and Y.
- Null and alternative hypotheses:

H0: β1 = 0 (no linear relationship)
H1: β1 ≠ 0 (linear relationship does exist)

- Test statistic:

t = (b1 − β1) / S_b1, with d.f. = n − 2

where:
b1 = regression slope coefficient
β1 = hypothesized slope
S_b1 = standard error of the slope
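
These fit and inference measures can be computed directly from the fitted values. The sketch below is illustrative only (same kind of made-up data as above); the formula used for the standard error of the slope, S_b1 = S_YX / √Σ(X − X̄)², is the standard one even though the notes only name the term:

import math
from scipy import stats

x = [1, 2, 3, 4, 5, 6]                     # hypothetical predictor values
y = [270, 285, 295, 300, 310, 320]         # hypothetical response values
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
     sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

SST = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
SSR = sum((yh - y_bar) ** 2 for yh in y_hat)           # regression sum of squares
SSE = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # error sum of squares

r_squared = SSR / SST                                  # coefficient of determination
s_yx = math.sqrt(SSE / (n - 2))                        # standard error of estimate
s_b1 = s_yx / math.sqrt(sum((xi - x_bar) ** 2 for xi in x))  # standard error of slope

t_stat = (b1 - 0) / s_b1                               # test H0: beta1 = 0
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)
print(f"R^2 = {r_squared:.4f}, S_YX = {s_yx:.4f}, t = {t_stat:.4f}, p = {p_value:.4f}")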

Checking the assumptions by examining the residuals

Residual Analysis for Normality:

1. Examine the Stem-and-Leaf Display of the Residuals.
2. Examine the Box-and-Whisker Plot of the Residuals.
3. Examine the Histogram of the Residuals.
4. Construct a normal probability plot.
5. Construct a Q-Q plot.

If residuals are normal, the probability plot and the Q-Q plot should be approximately linear.

What can we do when residuals are not normal?

1. Consider trimming outliers – but only if they clearly are mistakes.
2. Can you increase the sample size? If so, it will help assure asymptotic normality of the estimates.
3. You could try a logarithmic transformation of the variables. However, this is a new model specification with a different interpretation of the coefficients.
4. You could do nothing and just be aware of the problem.

Residual Analysis for Independence of Errors: plot time-series X against the residuals.

- Independence of errors means that the distribution of errors is random and is not influenced by or correlated to the errors in prior observations.
- Clearly, independence can be checked when we know the order in which the observations were made. The opposite of independence is auto-correlation.

o Measuring Auto-correlation
▪ Another way of checking for independence of errors is by testing the significance of the Durbin-Watson statistic.
▪ The Durbin-Watson statistic measures and detects the presence of auto-correlation.

▪ Auto-correlation exists if residuals in one time period are related to residuals in another period.
▪ The presence of auto-correlation of errors (or residuals) violates the regression assumption that residuals are statistically independent.
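
As a rough illustration of these residual checks (a sketch only; the residuals below are invented and listed in time order), normality can be examined with a normal probability plot and independence with the Durbin-Watson statistic computed directly from its definition:

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Hypothetical residuals, in the order the observations were made
residuals = np.array([1.2, -0.8, 0.5, -1.5, 2.1, -0.3, 0.9, -1.1])

# Normality check: the points should fall roughly on a straight line
# if the residuals are normally distributed.
stats.probplot(residuals, dist="norm", plot=plt)
plt.title("Normal probability plot of residuals")
plt.show()

# Durbin-Watson statistic: values near 2 suggest no auto-correlation;
# values well below 2 suggest positive auto-correlation.
dw = np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2)
print(f"Durbin-Watson statistic: {dw:.3f}")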

Strategies when performing regression analysis

- Start with a scatter plot of X on Y to observe a possible relationship.
- Perform residual analysis to check the assumptions.
o Plot the residuals vs. X to check for violations of assumptions such as equal variance.
o Use a histogram, stem-and-leaf display, box-and-whisker plot, or normal probability plot of the residuals to uncover possible non-normality.
- If there is any violation of any assumption, use alternative methods or models.
- If there is no evidence of assumption violation, then test for the significance of the regression coefficients.
- Avoid making predictions or forecasts outside the relevant range.
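
A compact end-to-end sketch of this strategy, using statsmodels on made-up advertising/sales data (the numbers and variable names are illustrative only, not from the lecture):

import numpy as np
import statsmodels.api as sm

# Hypothetical data: advertising spend (X) and sales (Y)
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([270, 284, 296, 299, 312, 318, 330, 339], dtype=float)

# 1. Inspect a scatter plot of X on Y first (omitted here), then fit Y = b0 + b1*X.
model = sm.OLS(y, sm.add_constant(x)).fit()

# 2. Residual analysis: examine residuals vs. X for non-linearity or unequal
#    variance, and check their distribution for non-normality.
residuals = model.resid

# 3. If no assumption appears violated, test the significance of the coefficients.
print(model.summary())   # reports R^2, standard errors, t-tests, Durbin-Watson

# 4. Predict only within the relevant (observed) range of X.
x_new = 5.5
y_pred = model.predict(np.array([[1.0, x_new]]))   # leading 1.0 is for the constant
print(f"Predicted sales at X = {x_new}: {y_pred[0]:.2f}")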
