
A Critical Examination of

Hedonic Analysis of a
Regression Model (HARM)
and
META-ANALYSIS 

Albert R. Wilson
BSSE, MBA, CRE (Ret)

2
 Regression Model

A model
intended to allow an exploration
of the hypothetical relationship
between possible explanatory variables
and the sales price

3
Regression Model

• Reflection of reality
• The touchstone of that reality? Actual
market participants

4
“Estimated” versus “Predicted”

• Estimated = Sale IN database

• Predicted = Sale NOT IN database

5
Predicted Sales Prices

At the mean
predicted sales price variance
is larger than estimated variance
by σ² (variance in the data)
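
A worked form of this statement, as a sketch under the usual ordinary-least-squares assumptions (a model with an intercept and independent errors of constant variance σ²). The symbols n (number of sales in the database), X (the matrix of explanatory variables) and x_0 (the property being valued) are notation introduced here, not taken from the slides:

```latex
\begin{aligned}
\text{Estimated (sale in database):}\quad
  & \operatorname{Var}(\hat{y}_0) = \sigma^2\, x_0^{\top}(X^{\top}X)^{-1}x_0 \\
\text{Predicted (sale not in database):}\quad
  & \operatorname{Var}(y_0 - \hat{y}_0) = \sigma^2 + \sigma^2\, x_0^{\top}(X^{\top}X)^{-1}x_0 \\
\text{At the mean, } x_0 = \bar{x}:\quad
  & x_0^{\top}(X^{\top}X)^{-1}x_0 = \frac{1}{n}
  \;\Longrightarrow\;
  \frac{\sigma^2}{n} \ \text{versus}\ \sigma^2 + \frac{\sigma^2}{n}
\end{aligned}
```

The two variances differ by exactly σ² at every x_0; the mean is simply where both are smallest.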

6
Mean Confidence Intervals (MCI)
Estimated and Predicted

MCI FOR PREDICTED 4.38 TIMES MCI FOR ESTIMATED


7
DATABASE EDITING 

GARBAGE IN => GARBAGE OUT
(GIGO)

8
Case Example
Influence of the Removal of
“Flipping Transactions” on the
Predicted Prices for 33 Properties

PREDICTED SALES PRICES

PROPERTY NO.     AS PRESENTED   FLIPS REMOVED   % CHANGE
SUM              5,069,239      4,018,112       (1,051,127)
n                391            379             -12
Adj. R-squared   0.7684         0.7593          -0.0091
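
The summary rows can be restated as percentages with a few lines of arithmetic. This is only a sketch: the figures are copied from the slide, while the dictionary keys and the `change` helper are names made up for the illustration.

```python
# Figures copied from the summary rows of the slide.
as_presented = {"sum": 5_069_239, "n": 391, "adj_r2": 0.7684}
flips_removed = {"sum": 4_018_112, "n": 379, "adj_r2": 0.7593}


def change(before, after):
    """Return (absolute change, percentage change relative to `before`)."""
    diff = after - before
    return diff, 100.0 * diff / before


sum_diff, sum_pct = change(as_presented["sum"], flips_removed["sum"])
n_diff, n_pct = change(as_presented["n"], flips_removed["n"])

print(f"Sum of predicted prices: {sum_diff:,} ({sum_pct:.1f}%)")
print(f"Sales in database:       {n_diff} ({n_pct:.1f}%)")
```

Read this way, removing about 3% of the sales moves the sum of the 33 predicted prices by about 21%, while the adjusted R-squared barely changes.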

9
Editing and Confirmation of Data
STEP 1:

Edit to identify obvious issues (the desk edit)

Case Example
Assessor’s Data: 4,325 sales, 747 removed (17.3%)
R-Squared 0.79 0.83

MLS Data: 1,888 sales, 779 removed (44.3%)

10
Editing and Confirmation of Data
STEP 2:

Identify sales that are not appropriate to the analysis

11
Editing and Confirmation of Data
STEP 3:
Sales confirmation
• A values-neutral interview of sale participants
• OBJECT: to elicit the primary factors motivating the
conclusion of the sale price

MUST NOT INTRODUCE ANALYST OPINION

THIS IS THE ONLY MEANS OF
IDENTIFYING/CONFIRMING THE REASONS
FOR A CONCLUDED PRICE
12
Regression Model Considerations
Faithfully represent:
•Identified concerns of actual market
participants
•Restrictions imposed by the data

Estimates of prices
the ONLY VERIFIABLE OUTPUT

13
Coefficient Calculation

Result of iterative calculations
designed to provide the
most accurate estimates of sales prices
in the database

14
Coefficient Calculation

Goodness of Fit
•Measures of the Goodness of Fit apply only
to the relationship between the estimated
and actual sales prices in the database
•They do not apply to the coefficients

15
Most commonly-cited
Goodness-of-Fit Measure

R-Squared
(Coefficient of Determination)

16
R-Squared

• Generally-applied interpretation:
– R-Squared is the amount of variance
“explained” by the model
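
For reference, a minimal sketch of the conventional calculation behind that interpretation, R-squared = 1 - (residual sum of squares / total sum of squares). The function name and the four prices are hypothetical and serve only to show that the measure compares estimated with actual sales prices in the database.

```python
def r_squared(actual, estimated):
    """Share of the variance in `actual` accounted for by `estimated`."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - e) ** 2 for a, e in zip(actual, estimated))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical sales prices, not taken from any case in these slides.
actual = [200_000, 215_000, 190_000, 240_000]
estimated = [205_000, 210_000, 195_000, 235_000]

print(f"R-squared: {r_squared(actual, estimated):.4f}")
```

Nothing in the calculation refers to the individual coefficients; only the estimated and actual prices enter it.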

17
Low R-Squared Models

Mathematically, as the R-Squared
approaches 0.30, it becomes
more likely
that the model is only measuring
random effects

18
The Omitted and Additional
Variable Problem
• Omitting a variable generally increases the magnitude
and statistical significance of the remaining
coefficients
• Adding a variable generally decreases the
magnitude and statistical significance of the
remaining variable coefficients
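
The omitted-variable effect can be reproduced on synthetic data. The sketch below is not the case data from these slides: every number in it (sample size, price per square foot, pool premium, noise levels) is an assumption chosen so that pools are more common on larger homes, and `ols` is a helper written for the example on top of NumPy's least-squares solver.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000

# Synthetic (assumed) market: larger homes are more likely to have a pool.
sqft = rng.normal(2_000, 400, n)
pool = (sqft + rng.normal(0, 400, n) > 2_200).astype(float)
price = 60_000 + 40.0 * sqft + 9_000 * pool + rng.normal(0, 15_000, n)


def ols(columns, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


full = ols([sqft, pool], price)   # intercept, sqft, pool
no_pool = ols([sqft], price)      # intercept, sqft

print(f"SqFt coefficient with Pool in the model: {full[1]:.2f}")
print(f"SqFt coefficient with Pool omitted:      {no_pool[1]:.2f}")
```

In this setup the square-footage coefficient rises when the pool variable is dropped, because it absorbs part of the pool effect; the size of the shift depends on how strongly the omitted variable is correlated with the ones that remain.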

19
Illustration of Omitting or Adding a Variable

              Base Model          Added Variable: APN               Omitted Variable: Pool
Variable      Coeff.     t-stat   Coeff.     t-stat   % Change      Coeff.     t-stat   % Change
Intercept     67,370     17.52    -663,632   -8.14    -1085.06%     66,293     17.14    -1.60%
APN                               0.023      8.98
Fixtures      2,653      5.39     2,511      5.15     -5.35%        2,886      5.84     8.74%
NoPatio       (12,801)   -7.77    (5,036)    -2.73    -60.66%       (13,451)   -8.13    5.08%
SqFt          40.79      29.23    42.80      30.61    4.93%         41.59      29.72    1.96%
Pool          8,366      6.77     8,908      7.28     6.48%
Garage        19,382     12.90    20,153     13.54    3.98%         19,980     13.24    3.09%
Middle Ring   (16,141)   -11.24   (11,230)   -7.38    -30.43%       (15,276)   -10.61   -5.36%
Inner Ring    (8,875)    -4.52    (7,114)    -3.64    -19.84%       (8,012)    -4.06    -9.72%
2000          207        0.08     1,787      -0.67    763.29%       271        0.10     30.92%
2001          (2,017)    -0.76    665        0.258    -132.97%      (2,028)    -0.76    0.55%
2002          (719)      -0.25    3,976      1.36     -652.99%      (615)      -0.21    -14.46%
2003          7,213      2.67     7,647      2.86     6.02%         7,258      2.71     0.62%
2004          41,149     15.50    40,380     15.37    -1.87%        40,901     15.31    -0.60%
2005          132,077    51.04    130,662    50.93    -1.07%        131,129    50.43    -0.72%
2006          160,367    45.29    159,842    45.63    -0.33%        159,897    44.89    -0.29%
R-Squared     0.83                0.83                              0.83

20
Consequences of Variable Selection
Including the Assessor’s Parcel Number

APN Coefficient Value   0.023
t-statistic             8.98
Mean Value              30,834,360
R-Squared               0.83
Mean Sale Price         $211,000

Results in an incremental increase in the sales price of

0.023 x 30,834,360 = $709,190
(APN Coef.) x (Mean Value) = (Incremental Increase)
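
The same arithmetic as a short check, using only the figures on this slide (the variable names are made up for the example):

```python
apn_coefficient = 0.023        # APN coefficient from the regression
apn_mean_value = 30_834_360    # mean parcel number in the database
mean_sale_price = 211_000      # mean sale price in the database

increment = apn_coefficient * apn_mean_value
multiple_of_mean_price = increment / mean_sale_price

print(f"Incremental 'contribution' of the APN: ${increment:,.0f}")
print(f"As a multiple of the mean sale price:  {multiple_of_mean_price:.2f}")
```

A statistically significant parcel number would thus appear to contribute more than three times the mean sale price, which is the point of the example.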

21
Consequences of Variable Selection
Omission of a Variable:
• Removal of “Pool”; present in 38% of properties
– SQFT Coefficient changed from $40.79 to $41.59
– Approximately the same t-statistic

• Removal of “Fixtures”; present in 100% of properties
– SQFT Coefficient changed from $40.79 to $46.50
– T-statistic = 50.94

22
Coefficients

Coefficients are simply
multipliers for the explanatory variable

23
Causation in Real Estate
From the Real Estate Appraiser’s perspective:
1. Causation demonstrated through sales
confirmation interviews.
2. Causation NEVER proven through a
regression.

24
Strengths and Weaknesses
• Can never be better than the data
• Requires significant amount of data: five to 15 or more
sales
• Upper limit to the amount of data: too much may be
worse than too little
• Guide: Are the sales competitive to the subject?
• Estimate of sales prices most accurate at the mean value
of the data
• Variance of a predicted sales price larger than variance of
estimated
• Thousands of possible regression models

25
Further Considerations

• Absent standards, the “Rubber Ruler” may apply

• When recognized and published standards
are not used, the author must demonstrate
the accuracy and reliability of his/her work

26
Hedonic Analysis
The Hedonic Assumption

The coefficient accurately and only
represents the contribution of the
declared meaning of the
explanatory variable to the
sale price

28
Hedonic Analysis

The validity of the hedonic assumption
must be demonstrated

29
“Revealed Preference”
Idea cannot be supported
for real estate

30
Supporting Literature
Not a single paper demonstrated the validity
of the hedonic assumption

PLUS
• NO indication of confirmation of raw data
• NO indication of adherence to any recognized / published
standards
• NO indication of confirmation of results with the normal or
typical market participant

THE RUBBER RULER EFFECT IS MUCH IN EVIDENCE.


31
Regression Model Accuracy
If the regression model is inaccurate,
then there is no reason
to expect the coefficients to be
accurate or meaningful.
Therefore the HARM cannot be accurate.

32
CASE EXAMPLE
TO POOL OR NOT TO POOL

• Using the data from the previous case.
• Does a pool influence value?
• By how much?
• Under the hedonic approach, the coefficient is the
marginal contribution to value.

33
                 COMBINED POOL AND NO POOLS                     COMBINED POOL AND NO POOLS,
                                                                POOL COEFFICIENT SET TO ZERO
Variable         COEFFICIENT   MEAN VALUES   EXPECTED VALUES    COEFFICIENT   MEAN VALUES   EXPECTED VALUES
Intercept        54,089.83     1             54,090             54,089.83     1             54,090
ORIG_FIXTURES    2,805.33      8.73          24,491             2,805.33      8.73          24,491
ORIG_NOPATIO     -14,116.47    0.34          -4,800             -14,116.47    0.34          -4,800
ORIG_POOL        9,161.98      0.38          3,482              9,161.98      0             0
ORIG_SQF         41.52         2283.62       94,815             41.52         2283.62       94,815
ORIG_X_3GARAGE   16,212.83     0.4           6,485              16,212.83     0.4           6,485
SY2000           5,980.33      1             5,980              5,980.33      1             5,980
EXPECTED MEAN
SALE PRICE                                   184,543                                        181,061
Adj R2           0.8816                                         0.8816
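
The expected mean sale price in each column is the sum of coefficient times mean value over all variables. A minimal sketch of that calculation follows; the coefficients and means are copied from the table, while the `model` dictionary and the function `expected_mean_price` are names invented for the illustration.

```python
# (coefficient, mean value) pairs copied from the combined-model table.
model = {
    "Intercept": (54_089.83, 1),
    "ORIG_FIXTURES": (2_805.33, 8.73),
    "ORIG_NOPATIO": (-14_116.47, 0.34),
    "ORIG_POOL": (9_161.98, 0.38),
    "ORIG_SQF": (41.52, 2283.62),
    "ORIG_X_3GARAGE": (16_212.83, 0.4),
    "SY2000": (5_980.33, 1),
}


def expected_mean_price(model, mean_overrides=None):
    """Sum coefficient * mean value, optionally replacing some mean values."""
    mean_overrides = mean_overrides or {}
    return sum(
        coef * mean_overrides.get(name, mean)
        for name, (coef, mean) in model.items()
    )


with_pool_term = expected_mean_price(model)
pool_zeroed = expected_mean_price(model, {"ORIG_POOL": 0})

print(f"Expected mean sale price:          {with_pool_term:,.0f}")
print(f"With the pool term set to zero:    {pool_zeroed:,.0f}")
print(f"Difference attributed to the pool: {with_pool_term - pool_zeroed:,.0f}")
```

The totals agree with the table to within a dollar or two, since the table adds up terms that were already rounded.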

34
TO POOL OR NOT TO POOL (CONT.)

• What are the coefficients if there is no pool?

35
COMBINED WITH NO POOL VARIABLE

Variable         COEFFICIENT    MEAN VALUES   EXPECTED VALUES
Intercept        52,788.1063    1             52,788
ORIG_FIXTURES    3,087.8801     8.73          26,957
ORIG_NOPATIO     -14,724.7843   0.34          -5,006
ORIG_SQF         42.3986        2283.62       96,822
ORIG_X_3GARAGE   16,924.691     0.4           6,770
SY2000           5,727.7462     1             5,728
EXPECTED MEAN
SALE PRICE                                    184,059
Adj R2           0.8790

36
Comparison
                  WITH POOL VARIABLE   NO POOL VARIABLE
• Orig-fixtures   2,805                3,088
• Orig-nopatio    -14,116              -14,725
• Orig-pool       9,162                NA
• Orig-sqf        41.52                42.40
• Orig-garage     16,213               16,925
• SY2000          5,980                5,728
• ESP             $184,543             $184,059
• R-sq            0.88                 0.88
37
TO POOL OR NOT TO POOL (CONT.)

• WHAT HAPPENS IF WE CONSIDER A DATABASE
WITH POOLS, AND SEPARATELY A DATABASE
WITHOUT POOLS?

38
                 WITH POOL ON PROPERTY                          WITHOUT POOL ON PROPERTY
Variable         COEFFICIENT   MEAN VALUES   EXPECTED VALUES    COEFFICIENT   MEAN VALUES   EXPECTED VALUES
Intercept        65,957.89     1.00          65,958             54,993.78     1.00          54,994
ORIG_FIXTURES    2,505.59      9.65          24,179             2,784.14      8.16          22,719
ORIG_NOPATIO     -15,415.46    0.22          -3,391             -14,838.47    0.41          -6,084
ORIG_POOL
ORIG_SQF         41.63         2,586.79      107,690            41.46         2,097.20      86,956
ORIG_X_3GARAGE   15,768.93     0.40          6,308              16,308.32     0.31          5,056
SY2000           4,211.37      1.00          4,211              7,209.87      1.00          7,210
EXPECTED MEAN
SALE PRICE                                   204,954                                        170,850
Adj R2           0.8711                                         0.8895

39
POOLS AND NO POOLS
SEPARATELY

• ESTIMATED SALE PRICE WITH POOL $204,954
– R-SQUARED 0.87
• ESTIMATED SALE PRICE W/O POOL $170,850
– R-SQUARED 0.89
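
One way to set the separate-database results beside the combined model is simple arithmetic on figures already shown in the tables. The sketch below is illustrative only: the variable names are made up, and the comparison is not a method prescribed in these slides.

```python
# Expected mean sale prices and square-footage terms copied from the
# separate "with pool" and "without pool" tables.
with_pool = {"expected_price": 204_954, "sqft_term": 107_690}
without_pool = {"expected_price": 170_850, "sqft_term": 86_956}

# Pool coefficient (ORIG_POOL) from the combined model.
combined_pool_coefficient = 9_161.98

price_gap = with_pool["expected_price"] - without_pool["expected_price"]
sqft_gap = with_pool["sqft_term"] - without_pool["sqft_term"]

print(f"Gap between the two expected mean prices: {price_gap:,}")
print(f"Part of that gap in the square-footage term: {sqft_gap:,}")
print(f"Pool coefficient in the combined model: {combined_pool_coefficient:,.2f}")
```

The gap between the two databases is several times the pool coefficient, and much of it sits in the square-footage term, since the homes with pools are also larger on average.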

40
The Coefficient – What Counts?

ALL THAT STATISTICAL SIGNIFICANCE CAN TELL
US IS THAT
FOR THIS MODEL AND DATABASE
THE COEFFICIENT IS A SIGNIFICANT
(OR INSIGNIFICANT)
MULTIPLIER FOR THE EXPLANATORY VARIABLE.
NOTHING MORE.

41
The Appropriate Standard:
Economic Significance
For us, economic significance
is determined by
what the normal or typical participant
considers important to the
conclusion of the transaction.

42
A Criticality:
NOT ONE hedonic analysis encountered
to date has actually asked this question:

“What was important to you in
concluding your transaction?”

43
Hedonic Analysis of a Regression
Model (HARM) is:
• Highly inaccurate and unreliable method
• Not appropriate for appraisal work

Observations apply to hedonic analysis,
NOT
regression models!

44
