
Table of Contents

1. INTRODUCTION
2. HYPOTHESIS TESTING
3. RELATED RESEARCH SURVEY
4. PROPOSED MACHINE LEARNING ALGORITHMS MODEL
4.1 ANOVA
4.2 LINEAR REGRESSION
4.3 RANDOM FOREST
4.4 DECISION TREE
4.5 SUPPORT VECTOR MACHINE
5. METHODOLOGY
6. CONCLUSIONS
7. REFERENCES

1. INTRODUCTION
Between 2019 and 2020, economic growth did not merely slow; the real estate business
also decelerated. Observers therefore expected most businesses to be strained on both
the supply and demand sides (C, et al., 2020).
The market cost of housing in Bangalore has shifted markedly over the past two and a
half years (Rajani & Sinha, 2019).
The level of housing costs is linked not only to individuals' living standards but also to
overall world economic growth (Gao, et al., 2017, 2018).
In Bangalore, the domestic housing market has changed clearly and dramatically over the
past two and a half years. Until recently it was regarded as probably the most attractive
private market in the country, but the market is now under stress and is struggling to
stay afloat in contracting conditions.
Housing is one of the fundamental public challenges in practically all parts of the world.
Because of urbanization, more and more people are moving from rural to urban environs,
and this will create a large gap between the demand for and the supply of housing
(C, et al., 2020). At the same time, migrant workers are moving back to their home
communities because of the COVID-19 pandemic (Piao, et al., 2019). The development of
housing costs influences the standard of living and the pattern of the real estate market.
One investigator indicated that investment is the dominant challenge in real estate in
India (Chaturvedi, et al., 2015).
According to research conducted by C. G. Sathish, real estate in Bangalore has also
performed well in the residential segment. According to that study, almost 870 housing
projects were launched in 2018 (C, et al., 2020). The government has planned to build
11 million affordable homes in urban environs by 2022 (Khartit & Khadija, 2021).
According to the 2011 Indian census report, 11 million of the 90 million residential
houses counted are available; that is, about 12% of the total housing stock consists of
vacant houses. Housing cost prediction uses a set of features to evaluate and forecast
the house cost. The factors considered include the type of house, the total area, the
cost of the house and others. Researchers approach this problem with a variety of
machine learning models. For classifying houses into cost categories, candidate
algorithms include Decision Tree, Random Forest, Logistic Regression and Naïve Bayes.
In cities like Bangalore, a great many residential properties exist across different
districts (Durganjali, et al., 2019).

Question: HRPQ11 = What is the basic built-up area of the house (sq ft)?
Question: HRPQ12 = What is the unit cost of the house purchased or intended to be purchased?

2. HYPOTHESIS TESTING
The aim of this research is to evaluate the components of real estate price prediction and
to determine the importance of each hypothesized factor in pricing a house (Chen, 2021).
Null Hypothesis (no difference between the groups): H0
The hypothesis that proposes that no statistically significant difference exists is referred
to as the null hypothesis, denoted H0.
Alternative Hypothesis (there is a difference between the groups): H1
This is the proposition put forward in the hypothesis test; it is the opposite of the null
hypothesis and negates its statement.
H0: All levels or groups in the housing cost forecast have equal means
H1: At least one group is different
H0: HRPQ11 = HRPQ12: The cost of housing is the same across different basic built-up
areas
H1: HRPQ11 ≠ HRPQ12: The cost of housing is not the same across different basic
built-up areas
Expected Result
Predicting house cost follows a work plan scheduled over a period, which includes
collecting and cleaning the data. The researcher then selects the preferred algorithms
and implements them. Finally, the researcher interprets the house cost data, predicts
real estate house costs using the selected machine learning algorithms, and refines the
forecast output (Bhagat, et al., n.d.).

3. RELATED RESEARCH SURVEY


The basic objective of the related work survey is to be detail-oriented and to identify
accurate working methodologies and techniques, which helps to identify affordable,
substantial and preferable research on housing cost prediction.
Durganjali, et al. (2019) proposed a three-phase methodology: preprocessing, modelling
and resale price prediction. In the preprocessing phase, the data are collected from the
Kaggle website; the dataset has 217006 features. They use machine learning algorithms
such as Logistic Regression to measure the accuracy of the cost, Naïve Bayes and
AdaBoost to predict the housing cost, and Decision Tree to give the final result.
Phan & Danh (2019) demonstrate that machine learning algorithms can be used for
forecasting based on geographic variables.
Li, et al. (2019) state that their evaluation is based on the basic features of housing
cost and on factors quantified with the Statistical Package for the Social Sciences (SPSS)
software. Data collection is carried out for a specific area, and the data integration is
summarized in a flow chart.
If the resonant is high, it indicates that the data variable is a quantification of a
qualitative variable and that the expected impact is positive; otherwise it is negative.
Lastly, a semi-logarithmic model with around 15 input variables is used for sampling and
for validating the accuracy of the machine learning models.
Jin, et al. (2010) studied several factors of general housing cost and the connection
between retail sales and house price fluctuation, using time series analysis and a
regression model.
Lim, et al. (2016) carried out research on predicting future condominium prices and
linking them to market housing costs. They focused mainly on two models, the
autoregressive integrated moving average (ARIMA) model and the artificial neural
network (ANN) model, to compare the actual price with the predicted price.
Abdulal, et al. (2020) used cross-validation on the data to check the performance of
the models.
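As an illustration of the cross-validation idea mentioned above, the following is a minimal sketch of how k-fold index splits can be generated. It is not taken from the cited study; the function name `k_fold_indices` and the choice of contiguous, unshuffled folds are assumptions made for brevity.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds.

    Returns a list of (train_indices, test_indices) pairs, one per fold.
    Hypothetical helper for illustration only.
    """
    # Distribute the remainder so fold sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train_idx, test_idx))
        start += size
    return folds


# Ten hypothetical house records split into three folds
folds = k_fold_indices(10, 3)
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each record is used for testing exactly once, and a model would be trained and scored once per fold. In practice the records are usually shuffled before splitting.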
Ravikumar & Sivam (2018) indicate that housing cost prediction can be built on machine
learning over historical data in Australia, with Support Vector Machine (SVM) and Neural
Network (NN) as the target models. The aim of that research is to derive valid
information about the housing market in Melbourne, Australia, by evaluating a real
historical transactional dataset, and to find the best model for estimating and
predicting the price from the features of a house. Effective models can help house buyers
and real estate agents make informed decisions. For this research task, I obtained my
data from the Kaggle website. In former studies, location, land size and number of rooms
were the features used by the models to forecast the cost of a house.
Xie, et al. (2007) indicate that, in predicting housing cost, the autoregressive
integrated moving average (ARIMA) model and two non-parametric techniques, artificial
neural networks (NN) and support vector machines (SVM), are necessary. SVM is a
comparatively recent technique that separates the dataset and assigns a label to
unlabeled housing property data.
Data visualization is an essential instrument for exploring and illustrating results so
that they can be easily understood (Chaturvedi, et al., 2021).
4. PROPOSED MACHINE LEARNING ALGORITHMS MODEL
The algorithm models in this research are used to classify categorical feature variables
and to forecast price as a continuous variable. Machine learning tools and techniques
were identified to collect, train on and test the housing cost data.

4.1 ANOVA

One-way analysis of variance (ANOVA) is a statistical tool used to evaluate whether the
means of independent groups differ in a statistically significant way; it is typically
used when there are at least three groups rather than two.
The feature selection approach relies on the one-way ANOVA F-test, which is carried out
to drop insignificant attributes from the house price forecasting data.
At the 95% confidence level (alpha = 0.05), if the significance value is greater than
alpha, that is P > 0.05, the feature is not selected for prediction.
$Y_{ik} = \mu + \alpha_i + \varepsilon_{ik}$
Where i = level of the group (i = 1, 2, 3, ..., N), k = observation
or replicate within each group (k = 1, 2, 3, ..., r)
$\varepsilon_{ik}$ = error term; ANOVA assumes that the errors are independent and identically distributed $N(0, \sigma^2)$
$\alpha_i$ = main effect of group i (its departure from $\mu$)
$\mu$ = overall population mean (unknown)
$Y_{ik}$ = k-th observation of the i-th level of the group
A dependent variable is a variable whose value depends on another variable, whereas the
value of an independent variable never depends on another variable.
At the 95% confidence level (alpha = 0.05), if the significance value is less than alpha,
that is P < 0.05, the feature is selected for forecasting.
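The F-test behind this selection rule can be sketched by hand. The example below is a minimal illustration, not the procedure used in this research: the function name `one_way_anova_f`, the grouping by house type and all prices are hypothetical. It computes the one-way ANOVA F statistic and compares it with the tabulated 5% critical value for (2, 6) degrees of freedom, which is about 5.14.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: group size times squared deviation of the group mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group's own mean
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical prices (in lakh rupees) grouped by a categorical feature
prices_by_type = [
    [40, 42, 44],  # e.g. apartments
    [50, 52, 54],  # e.g. independent houses
    [60, 62, 64],  # e.g. villas
]
f_stat, df_b, df_w = one_way_anova_f(prices_by_type)

# Critical value of F(2, 6) at alpha = 0.05, taken from standard F tables
F_CRITICAL_2_6 = 5.14
keep_feature = f_stat > F_CRITICAL_2_6
print(f_stat, df_b, df_w, keep_feature)
```

An F statistic above the critical value corresponds to P < 0.05, so the feature would be kept. A statistics library such as SciPy (`scipy.stats.f_oneway`) returns the p-value directly.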

4.2 LINEAR REGRESSION

Since this is the most basic and readily available tool for forecasting, we deployed it in
this project in order to derive an effective prediction analysis. Linear regression is
regarded as one of the best-known and simplest methods in machine learning.
It uses a set of independent variables to forecast a continuous dependent variable
through a linear relationship.
Simple linear regression describes the relationship between a single independent variable
and the dependent variable:
$Y = \beta_0 + \beta_1 x + \varepsilon$
Where $\beta_0$ and $\beta_1$ are parameters and $\varepsilon$ is a probabilistic error term.
Y is the dependent feature
x is the independent feature
$\beta_0$ is the intercept or constant
$\beta_1$ is the slope coefficient
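For illustration, the intercept and slope can be estimated with the ordinary least squares formulas, slope = S_xy / S_xx and intercept = mean(y) - slope * mean(x). The following is a minimal sketch under assumed data: the function name and the area and price values are hypothetical and are not taken from the project dataset.

```python
def fit_simple_linear_regression(x, y):
    """Estimate (beta0, beta1) by ordinary least squares for one feature."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Covariance-like and variance-like sums around the means
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    beta1 = s_xy / s_xx               # slope
    beta0 = y_mean - beta1 * x_mean   # intercept
    return beta0, beta1


# Hypothetical data: built-up area (sq ft) and price (lakh rupees),
# constructed so that price = 5 + 0.05 * area with no noise
area_sqft = [600, 800, 1000, 1200, 1400]
price_lakh = [35, 45, 55, 65, 75]

beta0, beta1 = fit_simple_linear_regression(area_sqft, price_lakh)
predicted = beta0 + beta1 * 1100  # forecast for a 1100 sq ft house
print(beta0, beta1, predicted)
```

Because the sample data contain no error term, the fit recovers the generating line; with real data the residuals estimate the error term.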

4.3 RANDOM FOREST

Random forest, also called random decision forest, is a supervised learning algorithm
that partitions the data and combines the outputs of many decision trees to achieve an
accurate housing price prediction. The process involves several components that together
improve the overall outcome, such as the training data, the test data and the outputs of
the individual decision trees.
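The averaging of many trees can be sketched in a few lines. This is a deliberately simplified illustration and not the model used in this research: each "tree" is a one-split stump trained on a bootstrap sample with one randomly chosen feature, whereas a full random forest grows deeper trees and samples features at every split. All names and data values are hypothetical.

```python
import random


def fit_stump(rows, targets, feature):
    """Fit a one-split regression tree on a single feature (minimum squared error)."""
    best = None
    values = sorted(set(r[feature] for r in rows))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
        right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # The sample has a single feature value: always predict the sample mean
        m = sum(targets) / len(targets)
        return (feature, float("inf"), m, m)
    return (feature, best[1], best[2], best[3])


def predict_stump(stump, row):
    feature, threshold, left_mean, right_mean = stump
    return left_mean if row[feature] <= threshold else right_mean


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each with a randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))           # random feature choice
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return sum(predict_stump(s, row) for s in forest) / len(forest)


# Hypothetical rows: [built-up area in sq ft, number of bedrooms]; prices in lakh rupees
rows = [[600, 1], [700, 1], [800, 2], [900, 2], [1100, 3], [1200, 3], [1400, 4], [1500, 4]]
targets = [30, 35, 42, 46, 58, 63, 74, 80]

forest = fit_forest(rows, targets)
small = predict_forest(forest, [600, 1])
large = predict_forest(forest, [1500, 4])
print(small, large)
```

Averaging over bootstrap samples reduces the variance of any single tree, which is the main reason the ensemble tends to predict more stably than one decision tree.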

4.4 DECISION TREE

Like the previous method, a decision tree is a supervised machine learning technique in
which the data are split repeatedly according to the selected parameters.
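The repeated splitting can be illustrated with a small regression tree. The sketch below is an assumption-laden example rather than the project's implementation: it chooses each split by minimizing the squared error of the two resulting groups, stops at a fixed depth, and uses hypothetical features (built-up area and a parking flag) and prices.

```python
def build_tree(rows, targets, depth=0, max_depth=2, min_size=2):
    """Recursively split the data; leaves store the mean target of their rows."""
    leaf_value = sum(targets) / len(targets)
    if depth >= max_depth or len(rows) < min_size:
        return {"value": leaf_value}
    best = None
    for feature in range(len(rows[0])):
        values = sorted(set(r[feature] for r in rows))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [i for i, r in enumerate(rows) if r[feature] <= threshold]
            right = [i for i, r in enumerate(rows) if r[feature] > threshold]
            sse = 0.0
            for side in (left, right):
                m = sum(targets[i] for i in side) / len(side)
                sse += sum((targets[i] - m) ** 2 for i in side)
            if best is None or sse < best[0]:
                best = (sse, feature, threshold, left, right)
    if best is None:
        # No feature has two distinct values, so no split is possible
        return {"value": leaf_value}
    _, feature, threshold, left, right = best
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree([rows[i] for i in left], [targets[i] for i in left],
                           depth + 1, max_depth, min_size),
        "right": build_tree([rows[i] for i in right], [targets[i] for i in right],
                            depth + 1, max_depth, min_size),
    }


def predict(tree, row):
    """Follow the splits from the root until a leaf is reached."""
    while "value" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]


# Hypothetical rows: [built-up area in sq ft, has parking (0 or 1)]; prices in lakh rupees
rows = [[600, 0], [700, 0], [1300, 1], [1400, 1]]
targets = [30, 34, 70, 74]

tree = build_tree(rows, targets)
print(tree["feature"], tree["threshold"])
print(predict(tree, [620, 0]), predict(tree, [1380, 1]))
```

The depth limit is a simple guard against overfitting; a deeper tree would memorize individual houses.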

4.5 SUPPORT VECTOR MACHINE

A support vector machine (SVM) was used in this work to separate the housing cost
features into classes. The same idea underlies support vector regression, which is used
for continuous targets.
With this method I was able to see from the charts how the normalized housing data
behaved across the training and test sets in order to obtain an accurate prediction.
In my findings, I observed that this method does not work well if there is missing data,
so it is best to impute the missing values before running it.
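The preparation step described above can be sketched as follows. This is a minimal illustration under assumptions: missing values are represented as `None`, the median is used as the fill value, and min-max scaling stands in for the normalization; the function names and the sample column are hypothetical.

```python
from statistics import median


def impute_median(column):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]


def min_max_scale(column):
    """Rescale values to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical built-up areas (sq ft) with two missing entries
area_sqft = [600, None, 1000, 1400, None]

filled = impute_median(area_sqft)
scaled = min_max_scale(filled)
print(filled)
print(scaled)
```

To avoid leaking information, the fill value and the scaling range would normally be computed on the training split only and then reused for the test split.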

5. METHODOLOGY
According to the research survey, qualitative and quantitative research
approaches provide a clearer understanding of what is significant in housing
cost forecasting.
Data collection is one of the important techniques for obtaining data on
specific feature variables.
Qualitative research approach: this covers the basic features the researcher identifies
in the survey, such as home status, type of residence, the customer's level of experience
with or without cost prediction, the proximity of the estate to the city, the house cost
of the estate, and so on.
Quantitative research approach: this consists of predicting the housing cost and ensuring
the accuracy of the prediction by using supervised learning algorithms. In most cases,
data are collected through Google Forms, SurveyMonkey, questionnaires and phone
interviews, and from real estate websites in Bangalore, newspaper reviews, annual
reports, and articles and journals published by well-known world labor forums
(Ravikumar & Sivam, 2018).
Government strategic survey plan reports, official housing websites and specific housing
forecasting survey reports are identified as secondary data, which are also referred to
as raw data obtained from general information archives (Hayhoe, et al., 2020).
Most researchers use a design model to implement and apply a supervised learning
approach.
Durganjali, et al. (2019) indicate that the purpose is to compare costs, categorize the
data and forecast real estate house costs. Data mining and machine learning algorithms
are appropriate technologies for the real estate housing market and its improvement
(Ashok, et al., 2020).
In Bangalore, forecasting tracks the trend in housing costs, and technology is also the
largest sector in that city (Sheikh, et al., 2019). Most researchers describe data
cleaning as an aid that keeps the dataset free of noise and protects against "garbage in,
garbage out" (Devi, et al., 2015). Furthermore, detecting and removing incomplete,
erroneous and inconsistent records before classifying housing cost is also referred to
as data cleaning (Lakshmi & S., 2018).
In addition, former researchers often concentrate on predicting the housing cost and
consider the following steps: data collection, data cleaning, categorizing, training,
testing and, lastly, selecting the appropriate machine learning model, pricing the house
with the forecasting model and visualizing the result.
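The cleaning and train/test steps listed above can be sketched as follows. This is a minimal illustration only: the record fields (`locality`, `area_sqft`, `price_lakh`), the cleaning rules and the 25% test fraction are assumptions chosen for the example, not the settings of any cited study.

```python
import random


def clean(records, required):
    """Drop incomplete, inconsistent and duplicate records."""
    seen, cleaned = set(), []
    for rec in records:
        if any(rec.get(key) is None for key in required):
            continue  # incomplete: a required field is missing
        if rec["price_lakh"] <= 0 or rec["area_sqft"] <= 0:
            continue  # inconsistent: price and area must be positive
        signature = tuple(rec[key] for key in required)
        if signature in seen:
            continue  # duplicate of an earlier record
        seen.add(signature)
        cleaned.append(rec)
    return cleaned


def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and hold out a fraction for testing."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical survey records
records = [
    {"locality": "A", "area_sqft": 900, "price_lakh": 55},
    {"locality": "B", "area_sqft": 1200, "price_lakh": 80},
    {"locality": "A", "area_sqft": 900, "price_lakh": 55},   # duplicate
    {"locality": "C", "area_sqft": None, "price_lakh": 60},  # incomplete
    {"locality": "D", "area_sqft": 1000, "price_lakh": -5},  # inconsistent
    {"locality": "E", "area_sqft": 1500, "price_lakh": 95},
    {"locality": "F", "area_sqft": 700, "price_lakh": 40},
]

cleaned = clean(records, required=("locality", "area_sqft", "price_lakh"))
train, test = train_test_split(cleaned)
print(len(cleaned), len(train), len(test))
```

The model would be fitted on `train` only and evaluated on `test`, so that the reported accuracy reflects houses the model has not seen.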

6. CONCLUSIONS

This project has reviewed research showing that machine learning offers sound and
accurate techniques and models for estimating housing cost and predicting real
estate house prices in Bangalore city. To derive an effective and complete
prediction, illustration and visualization methods are highly recommended as
analysis techniques, and the ANOVA test is an appropriate tool for categorical features.
The investigator worked out how to build a machine learning model that
gives clients effective methods for forecasting future housing costs.
The proposed techniques that make the best use of the dataset are
linear regression, random forest and decision tree, applied as
machine learning algorithms for predicting housing cost.

7. REFERENCES
C, G & Sathish., 2020. 6 Key Trends in Bengaluru’s Real Estate Industry for FY 2020..
s.l.:s.n.
Sinha., R., 2019. . India Real Estate.. s.l.:s.n.
Rajani & Sinha., 2019. . India Real Estate.. s.l.:s.n.
Gao, Jingxin & al., e., 2017, 2018.. Analysis of Factors Influencing the Price of Real
Estate Based on Interpretative Structural Model.. pp. 43–49, ed. s.l.:s.n.
Piao, Yong & al., e., 2019. “Housing Price Prediction Based on CNN.” 9th International
Conference on Information Science and Technology. pp. 491–95, ed. s.l.: ICIST, IEEE, .
Chaturvedi, Bhartendu, Kr & Sharma., A., 2015,. “Anticipating and Gearing up Real
Estate Sector in India.”. vol. 4, no. 5, pp. 11–16. ed. s.l.:International Journal of Business and
Management Invention..
Khartit & Khadija., 2021.. Indian Real State Industry Analysis Presentation. s.l.: IBEF.
https://www.ibef.org/industry/indian-real-estate-industry-analysis-presentation..
Durganjali, P, M & Pujitha., V., 2019. “House Resale Price Prediction Using
Classification Algorithm". pp 1-4 ed. s.l.:ICSSS, IEEE, .
Chen., D., 2021. Statistical Learning (I): Hypothesis Testing on House Price Dataset | by
Denise Chen | Towards Data Science.. s.l.:i https://towardsdatascience.com/practical-
practice-of-hypothesis- testing-on-house-price-dataset.
Bhagat, Nihar & al., e., 2016. “House Price Forecasting Using Data Mining.”. vol. 152,
no. 2,pp. 23–26 ed. s.l.:International Journal of Computer Applications, .
Bhagat, Nihar & al., e., n.d. “House Price Forecasting Using Data Mining.. vol. 152, no.
2, 2016, pp. 23–26, ed. s.l.: International Journal of Computer Applications.
Phan & Danh., 2019. “Housing Price Prediction Using Machine Learning Algorithms:
The Case of Melbourne City, Australia.”. pp. 8-13 ed. s.l.: ICMLDE, IEEE..
Li, Chenghui & al., e., 2019. “Prediction and Empirical Analysis of Residential House
Price Based on Grey Theory - - Taking Huangdao District as an Example.”. s.l.: PHM-
Qingdao, IEEE..
Jin, Han & Weizhong., H., 2010 . “An Empirical Study on Housing Price Volatility and
Retail Sales Growth:. 2nd IEEE International Conference on Information Management ed.
s.l.: IEEE.
Lim, Wan, Teng & al., e., 2016.. “Housing Price Prediction Using Neural Networks.”.
pp. 518-22 ed. s.l.: ICNC-FSKD .
Abdulal, Ahmad & Aghi., N., 2020. Independent Project. s.l.: Faculty of Natural Sciences
House Price Prediction.
Ravikumar & Sivam., A., 2018. “Real Estate Price Prediction Using Machine
Learning.”. pp. 2–19. ed. s.l.: School of Computing National College of Ireland.
Xie, Xiangsheng & Hu., G., 2007. “A Comparison of Shanghai Housing Price Index
Forecasting.”. vol. 3, no. 60674098, 2007, pp. 221–25, ed. s.l.: Proceedings - Third
International Conference on Natural Computation, ICNC .
Chaturvedi, Saumya & al., e., 2021. Real Estate Prediction. s.l.: Preprint No 4926 .
Hayhoe, George, F & al., e., 2020. “Analyzing Quantitative Data.”. pp. 56-91 ed. s.l.:A
Research Primer for Technical Communication, .
Ashok, Kulaye, Shreyal & al., e., 2020, . Property Price Prediction Application
Developed Using Machine Learning Algorithm Department of Computer Science Somaiya
Assistant Professor , Dept of Computer Science. no. 2, pp. 2100-15 ed. s.l.: Dept of Computer
Science / IT. .
Ho, et al., 2021. “Predicting Property Prices with Machine Learning Algorithms.”
Journal of Property Research, vol. 38, no. 1, Routledge, 2021, pp. 48–70. vol 38, no 1, pp.
48- 70 ed. s.l.:Routledge.
Sheikh, Wasim & al., e., 2019. Trends in Residential Market in Bangalore ,. no.
November, 2019, ed. Bangalore: .
Devi, Sapna & Kalia., A., 2015,. “Study of Data Cleaning & Comparison of Data
Cleaning Tools.” International Journal of Computer Science and Mobile Computing,. vol. 4,
no 3, pp. 360-70 ed. s.l.:s.n.
Lakshmi & S., 2018. “An Overview Study on Data Cleaning , Its Types and Its Methods
for Data Mining.” International Journal of Pure and Applied Mathematics,. vol. 119, no.
12,pp 16837-48 ed. s.l.: .
