
2017 2nd IEEE International Conference on Computational Intelligence and Applications

Regression Model for Appraisal of Real Estate Using Recurrent Neural Network
and Boosting Tree

Junchi Bin∗, Shiyuan Tang†, Yihao Liu∗, Gang Wang‡, Bryan Gardiner§, Zheng Liu∗, Eric Li¶
∗School of Engineering, University of British Columbia, Kelowna, BC, Canada
e-mail: junchibin@alumni.ubc.ca, yihaoliu007@gmail.com, zheng.liu@ubc.ca
†School of Foreign Languages and Literature, Tianjin University, Tianjin, China, e-mail: lglsbd1992@163.com
‡College of Architecture and Civil Engineering, Guilin University of Technology, Guilin, Guangxi, China
e-mail: 1059020529@qq.com
§Data Nerds, Kelowna, BC, Canada, e-mail: bryan@datanerds.com
¶Faculty of Management, University of British Columbia, Kelowna, BC, Canada
e-mail: eric.li@ubc.ca

978-1-5386-2030-4/17/$31.00 ©2017 IEEE

Abstract—An automated valuation model (AVM) is a mathematical program
that estimates the market value of real estate based on the analysis of
locations, neighborhood characteristics, and relevant property
characteristics. The most common AVMs employed by the appraisal industry
are based on multiple regression analysis. Other analytic tools such as
statistical learning and fuzzy algorithms have become more popular
because of the increasing capability of collecting a high volume of data
and the advancement of machine learning. It has thus become possible to
build a more sophisticated model that exploits the information embedded
in the collected data. In this work, we propose a boosting tree model
facilitated with a Recurrent Neural Network (RNN) to forecast the
average price of an area. The experimental results indicate that our
model outperforms the existing models adopted in the appraisal industry.

Keywords-appraisal of real estate; regression model; boosting tree;
recurrent neural network

I. INTRODUCTION

For most consumers in North America, housing is one of the largest
expenses. Purchasing a house is a high-involvement decision. Consumers'
judgment of the value of a property and their estimate of its future
value influence their purchasing decision and their budget allocation.
Moreover, the price of real estate is one of the important indicators of
economic activity. An accurate prediction of the price of land can
therefore help governments or companies make crucial decisions about
managing their financial condition for future fiscal years. From this
perspective, the process of estimating the price of real estate is
closely related to both people's lives and the national economy.

An automated valuation model (AVM) is a mathematical program that
assesses the market value of real estate based on the analysis of
locations, neighborhood conditions, and the characteristics of
properties. Regression analyses and machine learning algorithms are
widely used in the current appraisal industry. Some businesses in the
real estate industry provide easy-access web applications of AVMs to
estimate the market value of properties. For example, our partner
organization, Data Nerds, provides an extensive amount of data for
evaluating the price of real estate based on property data, transaction
history, and area information (https://factory.datanerds.com/). Clients
can browse these data and decide their next move.

The house price index (HPI) is a time-series parameter, published by the
federal government, that measures the movement of single-family house
prices. It serves as an accurate indicator of the overall trends of
house prices at geographic levels such as CBSA and zip code. For
example, suppose an area has an HPI $H_1$ and a house price $P_1$ in a
certain year. Given the HPI $H_2$ of another year, we can calculate the
house price $P_2$ in that year with the equation

$$P_2 = P_1 \frac{H_2}{H_1}.$$

Therefore, an AVM first predicts the market value of a house based on
the attributes of the property and forecasts the HPIs for the next few
years or months. Then, it calculates the future house price with the
above equation.

Multiple machine learning methods have been introduced to the field in
the past decade. LASSO regression (LASSO) and support vector regression
(SVR) are modern machine learning approaches for both predicting prices
and forecasting HPIs in AVMs. The boosting tree model is a promising
machine learning method in data analysis competitions. The basic idea of
the boosting tree model is to combine many low-accuracy regression tree
models into a model with high accuracy [1], [2]. The model iterates in
the direction of gradient descent. In an AVM, the boosting tree model
predicts the house price based on the attributes of properties.

The Artificial Neural Network (ANN) is a powerful tool for nonlinear
regression inspired by the mechanism of how the brain works. In recent
years, ANNs have played a critical role in many industrial applications
due to their capability of learning features from data, which produces
highly accurate performance on pattern recognition tasks such as image
classification and natural language processing. However, previous
studies reported that the predictions of ANNs for house properties were
not stable [3], [4]. One study suggested that linear regression may be
even better than
an ANN with a small number of selected samples [5]. The recurrent neural
network (RNN) is a state-of-the-art model for processing sequential data
such as time series or natural language. Long short-term memory (LSTM)
is one of the most successful RNN architectures. Some LSTM designs have
been used to forecast the stock market and solar power output, which
demonstrates its power in forecasting [1], [6].

In this study, we propose an ensemble learning method that combines an
LSTM for forecasting HPIs with a boosting tree model to predict house
prices in Chicago. Conventional machine learning methods used in AVMs,
such as LASSO and SVR, serve as the baseline algorithms for the proposed
model.

II. MODELING WITH LSTM AND BOOSTING TREE

In this section, we briefly introduce the major components of the
proposed model. First, we introduce the basic architecture of the RNN
and long short-term memory (LSTM). Then, we introduce the boosting tree
model and the baseline machine learning methods.

A. Long Short-Term Memory

In natural language processing (NLP), a whole sentence is treated as
sequential data, and each word is understood based on the previous
words. When ANNs perform natural language processing, they require a
structure that reasons about the next word depending on the context of
the sentence, combining the previous outputs as inputs for inference.
Recurrent Neural Networks (RNNs) are a family of neural networks for
processing sequential data [7].

Figure 1. An illustration of RNN.

Fig. 1 illustrates the structure of a simple RNN. $\{O^{(1)}, \ldots,
O^{(T)}\}$ are the outputs of the neural network given the input
sequence $\{x^{(1)}, \ldots, x^{(T)}\}$ and hidden units $\{h^{(1)},
\ldots, h^{(T)}\}$. Information flows one way from the input units to
the hidden units, and one way from the hidden units to the output units.
$h^{(t)}$ is calculated based on the output of the current input layer
and the state of the previous hidden layer $h^{(t-1)}$, which is usually
estimated as in Eq. (1) [7].

$$h^{(t)} = f(U x^{(t)} + W h^{(t-1)}) \tag{1}$$

where $f$ is generally a non-linear activation function such as tanh or
ReLU with shared parameters $U$, $W$ [7]. $O^{(t)}$ is the output of
step $t$, which depends on the activation function in the current neuron
as in Eq. (2).

$$O^{(t)} = \sigma(V h^{(t)}) \tag{2}$$

where $\sigma$ represents the activation function for the output layer.

Theoretically, RNNs can handle context from the beginning of a sentence,
which allows more accurate predictions of a word at the end of the
sentence. However, a longer sequence requires more hidden layers, which
causes the vanishing gradient problem and prevents the RNN from being
optimized.

LSTM is an architecture designed to solve this problem. Each LSTM splits
the whole neural network into multiple cells $\{C^{(1)}, \ldots,
C^{(T)}\}$. Each cell contains an input gate, a forget gate, and an
output gate, as illustrated in Fig. 2, and is capable of memorizing the
error in the forward propagation stage. The forget gate drops the error
from the cell to address the vanishing gradient.

Figure 2. The architecture of Long Short-Term Memory (LSTM).

$W_f$, $W_c$, and $W_o$ are the corresponding parameters for the input,
forget, and output gates, respectively. The input gate combines the
current input with the previous output through the activation function
$\sigma$ and bias $b_f$ in the neuron. Then, tanh creates new candidates
for the cell value, which are compared with the previous value to decide
on the update, with biases $b_i$ and $b_c$ respectively. The following
equations give the details of the calculation [7].

$$f_t = \sigma(W_f [h^{(t-1)}, x^{(t)}] + b_f) \tag{3}$$

(4)

$$o_t = \sigma(W_o [h^{(t-1)}, x^{(t)}] + b_o) * \tanh(c_t + f_t) \tag{5}$$
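As a rough illustration of the gating described above, the following sketch runs one step of a textbook LSTM cell in plain Python. It follows the common formulation with input, forget, and output gates and a tanh candidate, which may differ in detail from Eqs. (3)-(5). The function name `lstm_step`, the parameter layout, and the toy weights are assumptions made for this example only.

```python
import math


def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    """Return W v + b for a weight matrix W (list of rows) and bias b."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x, h_prev, c_prev, params):
    """One step of a textbook LSTM cell (an assumption, not the paper's exact cell).

    params maps each of "f", "i", "o", "c" to a (W, b) pair that acts on
    the concatenation [h_prev, x].
    """
    v = h_prev + x  # list concatenation [h(t-1), x(t)]
    f = [sigmoid(z) for z in affine(*params["f"], v)]        # forget gate
    i = [sigmoid(z) for z in affine(*params["i"], v)]        # input gate
    o = [sigmoid(z) for z in affine(*params["o"], v)]        # output gate
    c_hat = [math.tanh(z) for z in affine(*params["c"], v)]  # candidate cell value
    # Keep part of the old cell state and add part of the candidate.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, c_hat)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


# Toy cell with one hidden unit and one input; all weights are zero,
# so every gate evaluates to 0.5 and the candidate to 0.
toy_params = {name: ([[0.0, 0.0]], [0.0]) for name in ("f", "i", "o", "c")}
h_new, c_new = lstm_step([1.0], [0.0], [2.0], toy_params)
```

With these toy weights the forget gate halves the previous cell state, which shows how the gate controls what the cell carries forward from one step to the next.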

B. Boosting Tree

A boosting tree can be represented by Eq. (6):

$$F(x; P) = \sum_{m=1}^{M} \beta_m h(x; \alpha_m) \tag{6}$$

where $P$ denotes a parameter or a set of multiple parameters,
$P = \{p_0, p_1, p_2, \ldots\}$. $F(x; P)$ represents a function of the
set of variables $x$ given the parameters $P$. Boosting ensembles
multiple learning models $\{h(x; \alpha_1), h(x; \alpha_2), \ldots,
h(x; \alpha_M)\}$ with different weights $\beta_m$ and parameters
$\alpha_m$ [2].

The next step is to compute the residual, which is the difference
between the observed value and the current predicted value $y_i$ for
each iteration. Then, we further optimize the parameters according to
the previous model until the end of the iteration along the direction of
gradient descent. The following equations illustrate how the process
evolves with the learning rate $\rho$ [2], [8], [9].

(7)

$$\rho_m = \arg\min_{\rho} \sum_{i=1}^{N} L(y_i, F_{m-1}(x_i) + \rho h(x_i; \alpha_m, \beta_m)) \tag{8}$$

$$F_m(x) = F_{m-1}(x) + \rho_m h(x; \alpha_m) \tag{9}$$

C. Baseline Models

1) Support Vector Regression: The support vector regression (SVR) is
given by

(10)

where $g_j(x)$, $j = 1, 2, \ldots, m$, denotes a set of non-linear
transformations, $y_j$ stands for the predicted value, and $b$ is the
bias term. SVR uses the $\varepsilon$-insensitive loss as the loss
function for training [4].

2) LASSO Regression: LASSO is a linear regression model with $\ell_1$
regularization [7]:

$$\hat{\beta}^{lasso} = \arg\min_{\beta} \sum_{i=1}^{n} (y_i - (\beta_0 + \beta^T x_i))^2 + \lambda \|\beta\|_1 \tag{11}$$

III. EXPERIMENTAL RESULTS

In the experiments, all the data were obtained from our partner
organization, Data Nerds. The collected data are from Chicago, Illinois,
one of the biggest cities in the USA. This section presents how we
pre-process the data and evaluate the proposed model in comparison with
SVR and LASSO regression.

A. Real Estate Data Pre-processing

The HPI dataset is provided by the federal government. The whole dataset
contains all HPIs across the United States for all geographical levels
from 1975 to 2015. In this experiment, we extracted 60 series of HPIs in
Chicago at the zip code level. Moreover, we used the HPIs before 2013 to
forecast the HPIs of 2014 and 2015.
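To make the role of the HPI concrete, the sketch below does two things with an HPI series for one zip code. It builds (window, target) pairs that a forecaster could be trained on, and it rescales a known price by the ratio of two index values, the usual index-ratio adjustment. The window length of three matches the setting reported in Section III-B; the function names and all numbers are made up for illustration.

```python
def make_windows(series, window=3):
    """Split a series into (previous `window` values, next value) pairs."""
    pairs = []
    for end in range(window, len(series)):
        pairs.append((series[end - window:end], series[end]))
    return pairs


def adjust_price(price, hpi_from, hpi_to):
    """Rescale a price by the ratio of two index values: P2 = P1 * H2 / H1."""
    return price * hpi_to / hpi_from


# Hypothetical yearly HPI values for one zip code.
hpi = [100.0, 104.0, 110.0, 121.0, 125.0]
windows = make_windows(hpi, window=3)
# windows[0] pairs the first three values with the fourth.

# A house valued at 200.0 when the index was 100.0, moved to index 121.0.
later_price = adjust_price(200.0, hpi_from=100.0, hpi_to=121.0)
```

In a pipeline of this kind, the forecaster would supply the future index value and `adjust_price` would turn a current valuation into a future one.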

Figure 3. The results after feature selection, and the accepted features are indicated by green.
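The additive update of Eq. (9) in Section II-B can be illustrated with a small gradient-boosting loop. This is a minimal sketch under squared loss, where the negative gradient is simply the residual. It uses one-feature regression stumps as the weak learners and a fixed learning rate in place of the line search of Eq. (8). It is not the boosting tree implementation used in the experiments, and every name in it is hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the threshold split of a 1-D feature that minimizes squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def boost(xs, ys, rounds=50, rate=0.1):
    """Build F_m(x) = F_{m-1}(x) + rate * h_m(x) by fitting stumps to residuals."""
    base = sum(ys) / len(ys)  # F_0: constant prediction
    stumps = []

    def predict(x):
        return base + rate * sum(h(x) for h in stumps)

    for _ in range(rounds):
        # Residuals of the current ensemble are the targets of the next stump.
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict


# Toy data: a step from 10 to 20 between x = 2 and x = 3.
model = boost([1, 2, 3, 4], [10.0, 10.0, 20.0, 20.0], rounds=100, rate=0.1)
```

Each round shrinks the remaining residual by a constant factor on this toy data, so the ensemble of weak stumps approaches the step function gradually.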

The original dataset contains many variables, such as the qualities of
houses and the geographic information of properties. It also contains
transactional records of house prices across time. Here, we only
selected the most recent records (within 2013) and converted the prices
to 2014 and 2015 values using the HPIs to serve as the ground truths of
this experiment. We filtered out houses with either extremely high or
extremely low prices across the city. Table I shows the summary of the
filtered data.

TABLE I. THE AVERAGE PRICE AND STANDARD DEVIATION OF THE PRICE IN CHICAGO

          # of Houses   Avg.    Std.
Chicago   5179          208.2   118.6

Feature selection is a major step in the application of machine learning
methods. The data sets come with too many variables for modeling. There
are two reasons to select features. One is that overly large feature
sets slow down the algorithm; the other is that machine learning becomes
inaccurate when the number of variables is significantly higher than
optimal. Therefore, it is critical to select the optimal features
according to their contribution to and correlation with the ground
truth. Boruta is a feature selection method based on random forests, and
we employed it in our experiment to conduct the feature selection
[10]-[12]. After feature selection, only 19 features were used to build
the model. The result of feature selection is shown in Fig. 3. The green
indicators refer to the accepted features.

To train and validate our models and prevent overfitting, we apply
5-fold cross-validation. The algorithm randomly splits the complete data
into five subsets. In each validation procedure, one unique subset is
regarded as the validation data for testing, and the remaining four
subsets are used for training. After 5-fold cross-validation, we obtain
a predicted price for each house.

B. Training Methods

Fig. 4 shows the training process for the entire model. First, the
filtered data from pre-processing contain the attributes of houses and
the HPIs from 1975 to 2013, as described previously. Second, multiple
LSTMs were employed to forecast the HPIs, one for each zip code. Each is
a single-hidden-layer LSTM with 4 neurons and ReLU activation. The
window size is three, which means each forecast HPI is predicted from
the three previous HPIs. Simultaneously, the boosting tree model
predicts the house price in 2013 based on the attributes of properties.
Finally, we use the forecast HPIs and the 2013 house price to assess the
house price in 2014 and 2015.

C. Evaluating Process

We randomly split the entire data into a training set (80%) and a
testing set (20%). The evaluation metrics employed are mean absolute
error (MAE) and mean absolute percentage error (MAPE). Eqs. (12) and
(13) give the definitions of the two metrics, where $t_i$ is the ground
truth, $p_i$ is the predicted value, and $n$ is the number of samples.

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |t_i - p_i| \tag{12}$$

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{t_i - p_i}{t_i} \right| \tag{13}$$

D. Results

We use the same training and testing sets to evaluate all the models.
Table II gives the regression results for all the different models. The
results indicate that the proposed model performs better than the other
two models: its MAE of 46.926 and MAPE of 24.03% are lower than those of
both LASSO and SVR.

TABLE II. RESULT COMPARISON

                  MAE      MAPE
LASSO             58.661   31.24%
SVR               57.225   29.83%
Proposed Model    46.926   24.03%
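The two metrics reported in Table II are straightforward to compute. The sketch below assumes the usual definitions, with t_i the ground truth and p_i the prediction, and reports MAPE as a percentage. The function names and sample values are illustrative only and are not taken from the experiment.

```python
def mean_absolute_error(truth, pred):
    """MAE: average of |t_i - p_i|."""
    return sum(abs(t - p) for t, p in zip(truth, pred)) / len(truth)


def mean_absolute_percentage_error(truth, pred):
    """MAPE in percent: average of |t_i - p_i| / |t_i|, times 100.

    Undefined when any ground-truth value is zero.
    """
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(truth, pred)) / len(truth)


# Made-up prices, not taken from the experiment.
truth = [200.0, 100.0, 400.0]
pred = [180.0, 110.0, 400.0]
mae = mean_absolute_error(truth, pred)
mape = mean_absolute_percentage_error(truth, pred)
```

MAE is expressed in the same units as the prices, while MAPE is scale-free, which is why the two can rank models differently when price levels vary widely.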

Figure 4. Training process.

IV. CONCLUSIONS

In this paper, we proposed an ensemble learning regression model for
real estate appraisal. The proposed model is capable of taking the
qualities of houses, location, and the trend of market prices into
account. The experimental results show the effectiveness of the design.
The work also offers a new approach to ensembling deep learning methods
with statistical learning algorithms. From this project, we find that
LSTM has great potential in AVMs and other regression applications.
However, our model did not consider spatial and temporal factors in
estimating housing value. In future studies, we intend to use LSTM or
other deep learning techniques to build a spatial-temporal model. We
hope our work can give new insight into real estate appraisal.

REFERENCES

[1] Kai Chen, Yi Zhou, and Fangyan Dai. A LSTM-Based method for stock
    returns prediction: A case study of China Stock. In 2015 IEEE
    International Conference on Big Data, 2015.
[2] Villius Kontrimas and Antans Verikas. The mass appraisal of the real
    estate by computational intelligence. Applied Soft Computing,
    11(1):443–448, 2011.
[3] Nils Raabe. Deep Learning for Solar Power Forecasting An Approach
    Using Autoencoder and LSTM Neural Networks. pages 2858–2865, 2016.
[4] Vladimir N. Vapnik. Statistical Learning Theory. John Wiley & Sons
    Inc, 1998.
[5] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep learning.
    MIT Press, Cambridge, 2016.
[6] K. C. Y. Lam, C. Y. Yu, and K. C. Y. Lam. An Artificial Neural
    Network and Entropy Model for Residential Property Price Forecasting
    in Hong Kong. Journal of Property Research, 25(4):321–342, 2008.
[7] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements
    of Statistical Learning. Springer-Verlag New York Inc., 2009.
[8] Tianqi Chen and Carlos Guestrin. XGBoost: Reliable Large-scale Tree
    Boosting System. arXiv, pages 1–6, 2016.
[9] Jerome H. Friedman. Greedy Function Approximation: A Gradient
    Boosting Machine. The Annals of Statistics, 29(5):1189–1232, 2011.
[10] Max Kuhn and Kjell Johnson. Applied Predictive Modeling. 2013.
[11] Miron B. Kursa, Aleksander Jankowski, and Witold R. Rudnicki.
     Boruta-A system for feature selection. Fundamenta Informaticae,
     101(4):271–285, 2010.
[12] Miron B. Kursa and Witold R. Rudnicki. Feature Selection with the
     Boruta Package. Journal of Statistical Software, 36(11):1–13, 2010.

