
5

BASIC ESTIMATION TECHNIQUES IN FORECASTING
INTENDED LEARNING OUTCOMES

1. Identify the different forecasting techniques.


2. Write a simple and multiple linear regression model.
3. Apply regression analysis as a basic forecasting technique.
4. Interpret regression statistics.

This section introduces the basic estimation techniques that can be employed to forecast key variables relevant to managerial decision making. We begin by examining sources
of information that provide data for forecasts. These include consumer interviews and surveys,
controlled market studies, and uncontrolled market data. Next, we explore regression analysis, a
statistical method widely used in demand estimation. Finally, we consider a number of important
forecasting methods.

COLLECTING DATA

A. Consumer Surveys

A direct way to gather information is to ask people. Whether face to face, by telephone, online,
or via direct mail, researchers can ask current and prospective customers a host of questions like: How
much of the product do you plan to buy this year? What if the price increased by 10 percent?

Survey Pitfalls

Though useful, surveys have problems and limitations such as the following:

1. Market researchers may ask the right questions, but of the wrong people. Economists call this sample
bias. In some contexts, random sampling protects against sample bias. In other cases, surveys must take
care in targeting a representative sample of the relevant market segment.

2. A second problem is response bias. Respondents might report what they believe the questioner wants
to hear. (“Your product is terrific, and I intend to buy it this year if at all possible.”) Alternatively, the
customer may attempt to influence decision making. (“If you raise the price, I definitely will stop buying.”)
Neither response will likely reflect the potential customer’s true preferences.

3. A third problem is response accuracy. Even if unbiased and forthright, a potential customer may have
difficulty in answering a question accurately. (“I think I might buy it at that price, but when push comes
to shove, who knows?”) Potential customers often have little idea of how they will react to a price
increase or to an increase in advertising.
4. A final difficulty is cost. Conducting extensive consumer surveys is extremely costly. As in any
economic decision, the costs of acquiring additional information must be weighed against the benefits.

B. Controlled Market Studies

Firms can also generate data on product demand by selling their product in several smaller markets while
varying key demand determinants, such as price, across the markets. The firm might set a high price with
high advertising spending in one market, a high price and low advertising in another, a low price and high
advertising in yet another, and so on. By observing sales responses in the different markets, the firm can
learn how various pricing and advertising policies (and possible interactions among them) affect demand.
Market studies typically generate cross-sectional data—observations of economic entities (consumers
or firms) in different regions or markets during the same time period. Another type of market study relies
on time-series data.

C. Uncontrolled Market Data

With uncontrolled markets, however, many factors change at the same time. How, then, can a firm judge
the effect of any single factor? Fortunately, statisticians have developed methods to handle this very
problem. Internet purchases provide an expanding universe of additional data on consumer preferences
and purchasing behavior. Gathering this (relatively uncontrolled) data is quick and cheap—as little as one-
tenth the cost of controlled market tests. Today, using computers featuring massively parallel processors
and neural networks, companies can search through and organize millions of pieces of data about
customers and their buying habits, a technique known as data mining.

SIMPLE LINEAR REGRESSION ANALYSIS

Regression analysis is a set of statistical techniques using past observations to find (or estimate)
the equation that best summarizes the relationships among key economic variables. The method
requires that analysts:

(1) collect data on the variables in question,


(2) specify the form of the equation relating the variables,
(3) estimate the equation coefficients, and
(4) evaluate the accuracy of the equation.

In general, the regression equation can be represented by the equation

Y = a + bX,

where a is the intercept and b is the slope coefficient. We can compute these constants using the
equations below.

a = y̅ - bx̅

where: y̅ = average of Y
x̅ = average of X

b = [n(ΣXY) – (ΣX)(ΣY)] / [n(ΣX²) – (ΣX)²]

where: n = number of observations or data points

Example: A personnel manager would like to know if there is a relationship between knowledge factors
and practical factors of a training course. The following scores were obtained by six trainees on the
knowledge factors, X, and the practical factors, Y, in a training course.

Table 5.1 Hypothetical data showing relationship between Knowledge factors of trainees and practical
factors.
Trainee Knowledge Factors (X) Practical Factors (Y)
1 2 4
2 4 10
3 4 8
4 5 8
5 7 14
6 9 16

Solution:

Construct the table first

Trainee X Y XY X² Y²
1 2 4 8 4 16
2 4 10 40 16 100
3 4 8 32 16 64
4 5 8 40 25 64
5 7 14 98 49 196
6 9 16 128 64 256
Summation (Σ) 30 60 346 174 696

Solving for a and b, then

b = [6(346) – (30)(60)] / [6(174) – (30)²] = 276/144 = 1.9167

a = y̅ - bx̅

where y̅ = ΣY/n = 60/6 = 10
and x̅ = ΣX/n = 30/6 = 5

Thus, a = 10 – (1.9167)(5) = 0.4167

Then the regression equation would be Y = 0.4167 + 1.9167X
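To check the arithmetic outside a spreadsheet, here is a minimal Python sketch of the same computation (the data come from Table 5.1; the variable names are mine):

```python
# Least-squares slope and intercept for the trainee data (Table 5.1).
X = [2, 4, 4, 5, 7, 9]      # knowledge factors
Y = [4, 10, 8, 8, 14, 16]   # practical factors
n = len(X)

sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
a = sum_y / n - b * sum_x / n                                 # intercept

print(round(b, 4), round(a, 4))  # 1.9167 0.4167
```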


How to Use the Regression Equation

Once you have the regression equation, using it is a snap. Choose a value for the independent
variable (x), perform the computation, and you have an estimated value (ŷ) for the dependent variable.
You can use this regression equation to forecast sales, demand etc.

In our example, the independent variable is the trainee's score on the knowledge factors. The
dependent variable is the practical factors. If a trainee scored a 6 on the knowledge factor, the estimated
practical factor (ŷ) would be:

Ŷ = 0.4167 + 1.9167(6) = 11.92

Warning: When you use a regression equation, do not use values for the independent variable that are
outside the range of values used to create the equation. That is called extrapolation, and it can produce
unreasonable estimates.

In our example above, the knowledge factors scores used to create the regression equation
ranged from 2 to 9. Therefore, only use values inside that range to estimate Y. Using values outside that
range (less than 2 or greater than 9) is problematic.

How to Find the Coefficient of Determination

Whenever you use a regression equation, you should ask how well the equation fits the data. One
way to assess fitness is to check the coefficient of determination, which can be computed using a
formula.

Coefficient of determination, also known as R squared (R²), measures the extent to which
the variance of the dependent variable can be explained by the independent variable. By looking at the R² value, one
can judge whether the regression equation is good enough to be used. The higher the coefficient, the
better the regression equation, as it implies that the independent variable chosen to determine
the dependent variable was chosen properly.

The coefficient of determination can be thought of as a percent. It tells you how much of the
variation in the dependent variable is captured by the regression line. If the coefficient is 0.80, then the
regression line accounts for 80% of the variation in the data. Values of 1
or 0 would indicate the regression line represents all or none of the variation, respectively. A higher coefficient
is an indicator of a better goodness of fit for the observations.

Although R² normally lies between 0 and 1, it can be negative under some definitions; this usually
means that the model is a poor fit for the data, and it can also occur when the regression is fit without an intercept.

The coefficient of determination is the square of the correlation coefficient r, which is given as:

r = [n(ΣXY) – (ΣX)(ΣY)] / √{[n(ΣX²) – (ΣX)²][n(ΣY²) – (ΣY)²]}

where:
n = number of paired observations
ΣXY = the sum of the products of X and Y
ΣX² = the sum of the squared values of X
ΣY² = the sum of the squared values of Y
ΣX = the sum of the values of X
ΣY = the sum of the values of Y

Using our example above and substituting the values from our table to the coefficient of
determination equation, we now have:

r = [6(346) – (30)(60)] / √{[6(174) – (30)²][6(696) – (60)²]} = 276/288 = 0.96

r² = (0.96)² = 0.92 or 92%

A coefficient of determination equal to 0.92 indicates that about 92% of the variation in the
practical factors of the trainees (the dependent variable) can be explained by the relationship to the
knowledge factors of the trainees (the independent variable). This would be considered a good fit to the
data, in the sense that it would substantially improve a trainer’s ability to predict performance of the
trainees.
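The same check can be scripted. A minimal Python sketch, assuming the same data as Table 5.1:

```python
import math

X = [2, 4, 4, 5, 7, 9]      # knowledge factors
Y = [4, 10, 8, 8, 14, 16]   # practical factors
n = len(X)

num = n * sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y)
den = math.sqrt((n * sum(x * x for x in X) - sum(X) ** 2) *
                (n * sum(y * y for y in Y) - sum(Y) ** 2))

r = num / den
print(round(r, 2), round(r ** 2, 2))  # 0.96 0.92
```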

Potential Problems in Regression

1. Equation Specification – relationships between variables are not always linear.

2. Omitted Variables – leaving out key variables

3. Multicollinearity - when two or more explanatory variables move together.

4. Heteroscedasticity – when the variance of the error terms is not constant across observations.

5. Serial Correlation – when the errors run in patterns, i.e., are correlated over time.

MULTIPLE LINEAR REGRESSION

Regression models are used to describe relationships between variables by fitting a line to the
observed data. Regression allows you to estimate how a dependent variable changes as the independent
variable(s) change.

Multiple linear regression is used to estimate the relationship between two or more
independent variables and one dependent variable. You can use multiple linear regression when you
want to know:

1. How strong the relationship is between two or more independent variables and one dependent
variable (e.g. how rainfall, temperature, and amount of fertilizer added affect crop growth).
2. The value of the dependent variable at a certain value of the independent variables (e.g. the
expected yield of a crop at certain levels of rainfall, temperature, and fertilizer addition).
Assumptions of multiple linear regression
Multiple linear regression makes all of the same assumptions as simple linear regression:

1. Homogeneity of variance (homoscedasticity): the size of the error in our prediction doesn’t change
significantly across the values of the independent variable.
2. Independence of observations: the observations in the dataset were collected using statistically
valid methods, and there are no hidden relationships among variables.

In multiple linear regression, it is possible that some of the independent variables are actually
correlated with one another, so it is important to check these before developing the regression model. If
two independent variables are too highly correlated (r2 > ~0.6), then only one of them should be used in
the regression model.

3. Normality: The data follows a normal distribution.

4. Linearity: the line of best fit through the data points is a straight line, rather than a curve or some
sort of grouping factor.

How to perform a multiple linear regression?

Multiple linear regression formula


The formula for a multiple linear regression is:

y = β0 + β1X1 + …. + βnXn + ε

 y = the predicted value of the dependent variable

 β0 = the y-intercept (the value of y when all other parameters are set to 0)
 β1X1 = the regression coefficient (β1) of the first independent variable (X1) (a.k.a. the effect that
increasing the value of the independent variable has on the predicted y value)
 … = do the same for however many independent variables you are testing
 βnXn = the regression coefficient of the last independent variable
 ε = model error (a.k.a. how much variation there is in our estimate of y)

To find the best-fit line for each independent variable, multiple linear regression calculates three
things:

 The regression coefficients that lead to the smallest overall model error.
 The test statistic of the overall model (reported as an F statistic in the ANOVA table).
 The associated p-value (how likely it is that the test statistic would have occurred by chance if the
null hypothesis of no relationship between the independent and dependent variables were true).

It then calculates the t-statistic and p-value for each regression coefficient in the model.
Note: While it is possible to do multiple linear regression by hand, it is much more commonly done via
statistical software such as SPSS, R, STATA, or Excel. In this module, I will show how to use Excel
because it is the most convenient and most widely available tool for students.

Sample Problem With Excel.

Consider the table below. It shows three performance measures for 10 students.
Student Test score IQ Study hours
1 100 125 30
2 95 104 40
3 92 110 25
4 90 105 20
5 85 100 20
6 80 100 20
7 78 95 15
8 75 95 10
9 72 85 0
10 65 90 5

Table 5.2 Hypothetical data showing the relationship between test scores, study hours and IQ .

In this example, using data from the table, we are going to complete the following tasks:

 Develop a least-squares regression equation to predict test score, based on (1) IQ and (2) the
number of hours that the student studied.
 Assess how well the regression equation predicts test score, the dependent variable.
 Assess the contribution of each independent variable (i.e., IQ and study hours) to the
prediction.

How to Enable the Analysis ToolPak in Excel

When you open Excel, the module for regression analysis may or may not be enabled. So, before
you do anything else, you need to determine whether the Analysis ToolPak is enabled. Here's how to do that:

 Open Excel.
 Click the Data tab.
 If you see the Data Analysis button in the upper right corner, the Analysis ToolPak is enabled
and you are ready to go.
Figure 5.1 Data Analysis ToolPak as Shown in Excel.

These are common tasks in regression analysis. With the right software, they are easy to
accomplish. We'll walk you step by step through each task, starting with setting up Excel.

If the Data Analysis command is not available in your version of Excel, you need to load
the Analysis ToolPak add-in program. These instructions apply to Excel 2010, Excel 2013 and Excel
2016.

1. Click the File tab, click Options, and then click the Add-Ins category.
2. In the Manage box, select Excel Add-ins and then click Go.
3. In the Add-Ins available box, select the Analysis ToolPak check box, and then click OK.

Tip: If Analysis ToolPak is not listed in the Add-Ins available box, click Browse to locate it.
If you are prompted that the Analysis ToolPak is not currently installed on your computer,
click Yes to install it.

Data Entry With Excel

Data entry with Excel is easy. There are three main steps:

 Enter data on spreadsheet.


 Identify independent and dependent variables.
 Specify desired analyses.

To illustrate the process, we'll walk through each step, using data from our sample problem. First, we
want to enter data on an Excel spreadsheet. This means listing data for each variable in adjacent columns,
as shown below:
Figure 5.2 Entered Data in Excel Spreadsheet.

Next, we want to identify the independent and dependent variables. Begin by clicking the Data
tab and the Data Analysis button.

Figure 5.3 Data Tab and the Data Analysis Button in Excel.

This will open the Data Analysis dialog box. From the drop-down list, select "Regression" and
click OK.

Figure 5.4 Drop-down List of Analysis Found in Data Analysis Dialog Box.
Excel will display the Regression dialog box. This is where you identify data fields for the
independent and dependent variables. In the Input Y Range, enter the coordinates for the dependent
variable. In the Input X Range, enter the coordinates for the independent variable(s). If you include column
labels in these input ranges, check the Labels box. In the example below, we have included labels, so the
Labels box is checked.

Figure 5.5 Regression Dialog Box in Data Analysis ToolPak in Excel.

By default, Excel will produce a standard set of outputs. For this sample problem, that's all we
need; so click OK to generate standard regression outputs.

Data Analysis With Excel

Excel provides everything we need to address the tasks we defined for this sample problem. Recall
that we wanted to do three things:

 Develop a least-squares regression equation to predict test score, based on (1) IQ and (2) the
number of hours that the student studied.
 Assess how well the regression equation predicts test score, the dependent variable.
 Assess the contribution of each independent variable (i.e., IQ and study hours) to the prediction.

Let's review the output produced by Excel and see how it addresses each task.
Regression Equation

The first task in our analysis is to define a linear, least-squares regression equation to predict test
score, based on IQ and study hours. Since we have two independent variables, the equation takes the
following form:

ŷ = b0 + b1x1 + b2x2

In this equation, ŷ is the predicted test score. The independent variables are IQ and study hours,
which are denoted by x1 and x2, respectively. The regression coefficients are b0, b1, and b2. On the right
side of the equation, the only unknowns are the regression coefficients; so to specify the equation, we
need to assign values to the coefficients.

Excel does all the hard work behind the scenes, and displays the result in a regression coefficients
table:

Figure 5.6 Regression Coefficients Table as shown in Excel.

Here, we see that the regression intercept (b0) is 23.156, the regression coefficient for IQ (b1) is
0.509, and the regression coefficient for study hours (b2) is 0.467. So the least-squares regression
equation can be re-written as:

ŷ = 23.156 + 0.509 * IQ + 0.467 * Hours

This is the only linear equation that satisfies a least-squares criterion. That means this equation
fits the data from which it was created better than any other linear equation. In addition, this regression
equation can be used to predict or forecast test scores given the IQ of the student and hours spent in
studying.
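To verify Excel's coefficients outside the spreadsheet, a minimal numpy sketch (data from Table 5.2; variable names are mine) should reproduce them up to rounding:

```python
import numpy as np

# Test score, IQ, and study hours for the 10 students in Table 5.2.
score = np.array([100, 95, 92, 90, 85, 80, 78, 75, 72, 65])
iq    = np.array([125, 104, 110, 105, 100, 100, 95, 95, 85, 90])
hours = np.array([30, 40, 25, 20, 20, 20, 15, 10, 0, 5])

# Design matrix with a column of ones for the intercept b0.
X = np.column_stack([np.ones(len(score)), iq, hours])
coef, *_ = np.linalg.lstsq(X, score, rcond=None)
print(coef)  # approximately [23.156, 0.509, 0.467]
```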

Coefficient of Multiple Determination

The fact that our equation fits the data better than any other linear equation does not guarantee
that it fits the data well. We still need to ask: How well does our equation fit the data?

To answer this question, researchers look at the coefficient of multiple determination (R²). The
coefficient of multiple determination measures the proportion of variation in the dependent variable
that can be predicted from the set of independent variables in the regression equation. When the
regression equation fits the data well, R2 will be large (i.e., close to 1); and vice versa.
The coefficient of multiple determination can be defined in terms of sums of squares:

SSR = Σ(ŷ - y̅)²

SSTO = Σ(y - y̅)²

R² = SSR / SSTO

where: SSR is the sum of squares due to regression, SSTO is the total sum of squares, ŷ is the predicted
value of the dependent variable, y̅ is the dependent variable mean, and y is the dependent variable raw
score.

Luckily, you will never have to compute the coefficient of multiple determination by hand in this
case. It is a standard output of Excel (and most other analysis packages), as shown below.

Figure 5.7 Summary Output of the Regression Statistics.

A quick glance at the output suggests that the regression equation fits the data pretty well. The
coefficient of multiple determination is 0.905. For our sample problem, this means 90.5% of test score
variation can be explained by IQ and by hours spent in study.
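The sums-of-squares definition above is easy to check in code. A self-contained numpy sketch, continuing from the fit shown earlier:

```python
import numpy as np

score = np.array([100, 95, 92, 90, 85, 80, 78, 75, 72, 65])
iq    = np.array([125, 104, 110, 105, 100, 100, 95, 95, 85, 90])
hours = np.array([30, 40, 25, 20, 20, 20, 15, 10, 0, 5])

X = np.column_stack([np.ones(len(score)), iq, hours])
coef, *_ = np.linalg.lstsq(X, score, rcond=None)
y_hat = X @ coef                            # predicted test scores

ssr  = np.sum((y_hat - score.mean()) ** 2)  # sum of squares due to regression
ssto = np.sum((score - score.mean()) ** 2)  # total sum of squares
print(ssr / ssto)                           # about 0.905
# Equivalently: np.corrcoef(score, y_hat)[0, 1] ** 2
```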

An Alternative View of R²

The coefficient of multiple determination (R²) is the square of the correlation between the actual and
predicted values of the dependent variable. Thus,

R² = r²y,ŷ

where y is the dependent variable raw score, ŷ is the predicted value of the dependent variable, and
ry,ŷ is the correlation between y and ŷ.

ANOVA Table

Another way to evaluate the regression equation would be to assess the statistical significance
of the regression sum of squares. For that, we examine the ANOVA table produced by Excel:
Figure 5.8 ANOVA table in Excel.

This table tests the statistical significance of the independent variables as predictors of the
dependent variable. The last column of the table shows the results of an overall F test. The F statistic
(33.4) is large, and the p-value (0.00026) is small. This indicates that one or both independent variables have
explanatory power beyond what would be expected by chance at the 1 percent significance level, since
the p-value of 0.00026 is less than 0.01. (A p-value between 0.01 and 0.05, such as 0.0165, would be
significant only at the 5 percent level.)

Like the coefficient of multiple determination, the overall F test found in the ANOVA table suggests
that the regression equation fits the data well.
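To see where that p-value comes from, here is a minimal scipy sketch (for this problem there are 2 regression degrees of freedom and 10 - 2 - 1 = 7 error degrees of freedom):

```python
from scipy import stats

F = 33.4                        # overall F statistic from the ANOVA table
p_value = stats.f.sf(F, 2, 7)   # upper-tail probability of the F(2, 7) distribution
print(p_value)                  # about 0.00026
```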

Significance of Regression Coefficients

With multiple regression, there is more than one independent variable; so it is natural to ask
whether a particular independent variable contributes significantly to the regression after effects of other
variables are taken into account. The answer to this question can be found in the regression coefficients
table:

Figure 5.9 Regression Results Highlighting the P-value of the Variables.

The regression coefficients table shows the following information for each coefficient: its value,
its standard error, a t-statistic, and the significance of the t-statistic. In this example, the t-statistics for
IQ and study hours are both statistically significant at the 0.05 level. This means that IQ contributes
significantly to the regression after effects of study hours are taken into account. And study hours
contribute significantly to the regression after effects of IQ are taken into account.
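If Excel is unavailable, the same coefficient table (values, standard errors, t-statistics, and p-values) can be produced with statsmodels; a minimal sketch using the Table 5.2 data:

```python
import numpy as np
import statsmodels.api as sm

score = np.array([100, 95, 92, 90, 85, 80, 78, 75, 72, 65])
iq    = np.array([125, 104, 110, 105, 100, 100, 95, 95, 85, 90])
hours = np.array([30, 40, 25, 20, 20, 20, 15, 10, 0, 5])

X = sm.add_constant(np.column_stack([iq, hours]))  # intercept, IQ, study hours
model = sm.OLS(score, X).fit()
print(model.summary())  # coefficients, standard errors, t-statistics, p-values
```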
SOLVING MULTIPLE LINEAR REGRESSION BY HAND

In the absence of any software or computer tool to help us solve regression involving multiple
variables, we can do it using the long and traditional method: by hand. We have to take note of a few
relevant equations to help us do it.

We already know how to solve regression using one explanatory variable. This time, I will show
you equations that will be helpful in solving regression for the two-independent-variable case. In the
formulas below, lowercase letters denote deviation scores, that is, deviations from the mean
(x = X – x̅, y = Y – y̅).

For the one-variable case, the calculation of b and a was:

b = Σxy / Σx²   and   a = y̅ – bx̅

For the two-variable case:

b1 = [(Σx2²)(Σx1y) – (Σx1x2)(Σx2y)] / [(Σx1²)(Σx2²) – (Σx1x2)²]

and

b2 = [(Σx1²)(Σx2y) – (Σx1x2)(Σx1y)] / [(Σx1²)(Σx2²) – (Σx1x2)²]

The equation for a with two independent variables is:

a = y̅ – b1x̅1 – b2x̅2
Example:

Suppose the manager of an auto shop wants to predict the job performance of Chevy mechanics based on
mechanical aptitude test scores and scores from a personality test that measures conscientiousness.
The resulting scores are summarized in the table below (with 5 being the best performance and 1 the
lowest).

Table 5.3 Hypothetical data showing the effects of mechanical aptitude test scores and personality test
scores on the job performance of Chevy mechanics.
Job Performance Mechanical Aptitude test Personality test on Conscientiousness

(Y) (X1) (X2)


1 40 25
2 45 20
1 38 30
3 50 30
2 48 28
3 55 30
3 53 34
4 55 36
4 58 32
3 40 34
5 55 38
3 48 28
3 45 30
2 55 36
4 60 34
5 60 38
5 60 42
5 65 38
4 50 34
3 58 38
Solution:

We can then construct an extended table below to show the computations of the quantities needed
by our equations:

Perf Mech Apt Consc


Y X1 X2 X1*Y X2*Y X1*X2
1 40 25 40 25 1000
2 45 20 90 40 900
1 38 30 38 30 1140
3 50 30 150 90 1500
2 48 28 96 56 1344
3 55 30 165 90 1650
3 53 34 159 102 1802
4 55 36 220 144 1980
4 58 32 232 128 1856
3 40 34 120 102 1360
5 55 38 275 190 2090
3 48 28 144 84 1344
3 45 30 135 90 1350
2 55 36 110 72 1980
4 60 34 240 136 2040
5 60 38 300 190 2280
5 60 42 300 210 2520
5 65 38 325 190 2470
4 50 34 200 136 1700
3 58 38 174 114 2204
Sum (Σ): ΣY = 65; ΣX1 = 1038; ΣX2 = 655; ΣX1Y = 3513; ΣX2Y = 2219; ΣX1X2 = 34510
Number of observations: n = 20
Mean: Y̅ = 3.25; X̅1 = 51.9; X̅2 = 32.75; X1Y: 175.65; X2Y: 110.95; X1X2: 1725.5

We can now compute the regression coefficients as follows:


b1 = 26260.25 / 303906.4 = 0.086409

b2 = 26622.7 / 303906.4 = 0.087602

To find the intercept, we substitute the values of b1 and b2 from above. Thus we have:

a = 3.25 – 0.086409(51.9) – 0.087602(32.75) = –4.10
Therefore, our regression equation is:

Y’ = -4.10 + 0.086409X1 + 0.087602X2 or

Job Performance = -4.10 + 0.086409MechApt + 0.087602Conscientiousness

This resulting regression equation can now be used to forecast Job Performance given
Mechanical aptitude scores and personality test scores.

For example:

What will be the job performance of worker X if his mechanical aptitude test is 57 and his personality test
score is 33?
Solution:

Job Performance = -4.10 + 0.086409MechApt + 0.087602Conscientiousness

= -4.10 + 0.086409(57) + 0.087602(33)

= 3.71, the predicted job performance of worker X given his test scores
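The deviation-score formulas above are straightforward to script. A minimal Python sketch with the mechanics data (variable names are mine) reproduces both the coefficients and the forecast:

```python
Y  = [1, 2, 1, 3, 2, 3, 3, 4, 4, 3, 5, 3, 3, 2, 4, 5, 5, 5, 4, 3]
X1 = [40, 45, 38, 50, 48, 55, 53, 55, 58, 40, 55, 48, 45, 55, 60, 60, 60, 65, 50, 58]
X2 = [25, 20, 30, 30, 28, 30, 34, 36, 32, 34, 38, 28, 30, 36, 34, 38, 42, 38, 34, 38]
n = len(Y)
my, m1, m2 = sum(Y) / n, sum(X1) / n, sum(X2) / n

# Deviation sums of squares and cross-products.
s11 = sum((x - m1) ** 2 for x in X1)
s22 = sum((x - m2) ** 2 for x in X2)
s12 = sum((u - m1) * (v - m2) for u, v in zip(X1, X2))
s1y = sum((x - m1) * (y - my) for x, y in zip(X1, Y))
s2y = sum((x - m2) * (y - my) for x, y in zip(X2, Y))

den = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / den   # 0.086409
b2 = (s11 * s2y - s12 * s1y) / den   # 0.087602
a  = my - b1 * m1 - b2 * m2          # about -4.10

# Forecast for mechanical aptitude 57 and conscientiousness 33.
print(a + b1 * 57 + b2 * 33)         # about 3.71
```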

R-SQUARE (R²)

When we run a multiple regression, we can compute the proportion of variance due to the
regression (the set of independent variables considered together). This proportion is called R-square. We
use a capital R to show that it's a multiple R instead of a single variable r. We can also compute the
correlation between Y and Y' and square that. If we do, we will also find R-square.
We can compute Y’ by substituting the values of X1 and X2 into our regression equation Y’ = -4.10
+ 0.086409X1 + 0.087602X2. The residuals, on the other hand, are calculated by taking the difference
between Y and Y’. Notice that we also computed the mean of each variable, shown at the bottom of the
table; the means are needed to solve for the variance of each variable. The variance is defined as the
average of the squared differences from the mean: subtract the mean from each number, square the
result, and then average those squared differences. (Look at the separate Excel file for your reference
as to how the variance was computed per variable.)
Y X1 X2 Y’ (= -4.10 + 0.086409X1 + 0.087602X2) Residual/Error (Y - Y’)
1 40 25 1.55 -0.55
2 45 20 1.54 0.46
1 38 30 1.81 -0.81
3 50 30 2.85 0.15
2 48 28 2.50 -0.50
3 55 30 3.28 -0.28
3 53 34 3.46 -0.46
4 55 36 3.81 0.19
4 58 32 3.71 0.29
3 40 34 2.33 0.67
5 55 38 3.98 1.02
3 48 28 2.50 0.50
3 45 30 2.42 0.58
2 55 36 3.81 -1.81
4 60 34 4.06 -0.06
5 60 38 4.41 0.59
5 60 42 4.76 0.24
5 65 38 4.85 0.15
4 50 34 3.20 0.80
3 58 38 4.24 -1.24
Total 65 1038 655 65.07 -0.07
Mean 3.25 51.9 32.75 3.25 0.00
Variance 1.49 54.59 26.0875 1.00 0.49


The mean of both Y and Y’ is 3.25. The mean of the residuals is 0. (The mean of the residuals should
always equal zero; if it does not, the regression line is not the best fit.) The variance of Y is 1.49 (1.4875
before rounding). The variance of Y' is 1.0, and the variance of the residuals is 0.49. Together, the variance
of the regression equation (Y') and the variance of the error (e) add up to the variance of Y
(1.49 = 1.0 + 0.49). The R-squared value is therefore 1.0/1.49, or 0.6711. This means that 67% of the
variation or changes in job performance can be explained by mechanical aptitude test scores and
conscientiousness.
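A short numpy sketch of this variance decomposition (it uses the rounded equation above, so the results match to about two decimals):

```python
import numpy as np

Y  = np.array([1, 2, 1, 3, 2, 3, 3, 4, 4, 3, 5, 3, 3, 2, 4, 5, 5, 5, 4, 3])
X1 = np.array([40, 45, 38, 50, 48, 55, 53, 55, 58, 40, 55, 48, 45, 55, 60, 60, 60, 65, 50, 58])
X2 = np.array([25, 20, 30, 30, 28, 30, 34, 36, 32, 34, 38, 28, 30, 36, 34, 38, 42, 38, 34, 38])

y_hat = -4.10 + 0.086409 * X1 + 0.087602 * X2   # fitted values
resid = Y - y_hat                               # residuals

# np.var uses the population convention (divide by n), matching the table above.
print(np.var(Y), np.var(y_hat), np.var(resid))  # about 1.49, 1.00, 0.49
print(np.var(y_hat) / np.var(Y))                # R-square, about 0.67
```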

Final Thoughts

This lesson was all about multiple regression analysis. We used Excel, but the analysis would be much
the same with other software packages. All major software packages (SAS, SPSS, Minitab, Excel etc.)
produce three key outputs:

 Regression coefficients, based on a least-squares criterion.


 Measures of goodness of fit, like a coefficient of multiple determination and/or an overall F test.
 Significance tests for individual regression coefficients.

If you can interpret these regression outputs from Excel, you should have no trouble interpreting the
same outputs from other packages.

FORECASTING MODELS

Forecasting models often are divided into two main categories: structural and nonstructural
models. Structural models identify how a particular variable of interest depends on other economic
variables. Sophisticated large-scale structural models of the economy often contain hundreds of
equations and more than a thousand variables and usually are referred to as econometric models.

Nonstructural models focus on identifying patterns in the movements of economic variables over
time. One of the best-known methods, time-series analysis, attempts to describe these patterns
explicitly. A second method, barometric analysis, seeks to identify leading indicators—economic
variables that signal future economic developments. (The stock market is one of the best-known leading
indicators of the course of the economy.)

Time-Series Models

Time-series models seek to predict outcomes simply by extrapolating past behavior into the
future. Time-series patterns can be broken down into the following four categories.

1. Trends – a steady movement in an economic variable over time. For example, the total production of
goods and services in the United States (and most other countries) has moved steadily upward over the
years.
2. Business cycles – periods of expansion marked by rapid growth in gross domestic product (GDP),
investment, and employment, after which economic growth may slow and even fall.
3. Seasonal variations – shorter demand cycles that depend on the time of year. Seasonal factors
affect tourism and air travel, tax preparation services, clothing, and other products and services.
4. Random fluctuations – irregular movements in a short period of time due to essentially random (or
unpredictable) factors.
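As a simple illustration of extrapolating a trend, here is a minimal numpy sketch with hypothetical quarterly sales figures (the numbers are my own, not from this module):

```python
import numpy as np

# Hypothetical quarterly sales for the last eight quarters.
sales = np.array([102.0, 108.0, 111.0, 118.0, 121.0, 127.0, 133.0, 136.0])
t = np.arange(len(sales))

slope, intercept = np.polyfit(t, sales, 1)  # fit a straight-line trend
forecast = intercept + slope * len(sales)   # extrapolate one quarter ahead
print(round(forecast, 1))
```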

Barometric Models

Barometric models search for patterns among different variables over time. Consider a firm that
produces oil drilling equipment. Management naturally would like to forecast demand for its product. It
turns out that the seismic crew count, an index of the number of teams surveying possible drilling sites,
gives a good indication as to changes in future demand for drilling equipment. For this reason, we call the
seismic crew count a leading indicator of the demand for drilling equipment. Thus, barometric methods
(leading indicators) are used to forecast the general course of the economy and changes in particular
sectors.

REFERENCES

Bevans, R. 2020. An Introduction to Multiple Linear Regression. https://www.scribbr.com/statistics/multiple-linear-regression/

Regression with 2 Independent Variables. http://faculty.cas.usf.edu/mbrannick/regression/Part3/Reg2.html

Samuelson, W.F. and Marks, S.G. 2012. Managerial Economics, 7th ed. John Wiley and Sons, Inc.

Standard Deviation and Variance. Math is Fun Advanced. https://www.mathsisfun.com/data/standard-deviation.html

Statistics How To. Coefficient of Determination (R Squared): Definition, Calculation. https://www.statisticshowto.com/probability-and-statistics/coefficient-of-determination-r-squared/

Stat Trek. Teach Yourself Statistics. Linear Regression Example. https://stattrek.com/regression/regression-example.aspx

Mitchell, A. 2019. Where is the data analysis button in Excel? http://libanswers.walsh.edu/faq/147605

WallStreet Mojo. Coefficient of Determination. https://www.wallstreetmojo.com/coefficient-of-determination
