House Price Prediction Model Based on Neural Network

Building on an analysis of the house price prediction problem, this paper establishes a house price prediction model based on the back-propagation (BP) neural network. House prices in Chongqing over a 10-year period are collected, summarized, and analyzed together with their main influencing factors. A correlation analysis carried out in software shows significant correlations between house prices and these factors. The BP neural network model is then implemented in MATLAB and used to predict and verify house prices. Finally, house prices are predicted with the software on the basis of market survey data.


Introduction
House prices bear on the national economy and people's livelihood. They involve the interests of many parties and the allocation of resources, especially the optimal allocation of scarce resources, so house price prediction is important. With the growth of the real estate industry and the maturing of the real estate market, the need for house price prediction has also become urgent. Price prediction is required throughout the lifecycle of real estate: in investment, development, construction, sales, distribution, leasing, sub-leasing, transfer, mortgage, gifting, expropriation, insurance, taxation, and demolition, decisions and actions all rest on a corresponding price prediction. House price prediction also helps keep the real estate market developing in a sustainable and healthy way, for example by promoting the healthy development of real estate credit and loans, preventing financial risks, safeguarding the legal rights and interests of the parties involved, safeguarding national and public interests, reducing management costs, and providing comprehensive services for the real estate market. As the social economy continues to develop, house price prediction will extend to more fields. Real estate prices are influenced by many factors, which differ in the direction, degree, and form of their influence. Some factors can be fully quantified with mathematical models, while others cannot and can only be judged by forecasters on the basis of experience. House price prediction is therefore an important, difficult, and technically demanding task in the real estate industry.
It is therefore of great significance to carry out in-depth, systematic research on real estate prediction and to improve and optimize the theories and methods of house price prediction [1][2][3].

Meaning of Prediction
Prediction is widely applied in practice. For example, historical data from an economic or management system over a given period can be used to predict how the system will develop. Prediction means deducing the development trends and outcomes of future events by scientific methods and logical deduction, using qualitative or quantitative calculation and an exploration of how things evolve. In other words, it estimates the future of a thing from its past and present and infers the unknown from the known. The accuracy of a prediction depends largely on how well the laws governing the studied process are understood: the more complete and deeper this understanding, the more accurate the prediction. Accuracy also depends on the prediction method used. Prediction serves decision making. It is an important means of avoiding blind decisions and making them more deliberate and scientific. Correct decisions come from accurate information and scientific prediction, so prediction is regarded as the premise or basis of decision making. For example, market prediction is the basis of production decisions [4][5]. Prediction is also the basis for formulating development plans; population prediction, for instance, underlies development plans for housing and highways. Prediction of technological progress and scientific development can promote the updating of products and enhance their competitiveness, which makes it an important tool for operation and management. A prediction process is also a process of coming to understand things. Prediction therefore deepens managers' understanding, improves their foresight at work, and raises the level of operation and management [6].

Comparison of House Price Prediction Methods
Common house price prediction methods include regression analysis, the gray prediction model, the Elman neural network, and the neural network model [1,7].

Prediction Based on Regression Analysis
Regression prediction seeks a regression equation between the independent variables and the dependent variable from the patterns in historical data, determines the model parameters, and predicts on that basis. According to the number of independent variables, regression is divided into simple (single-variable) regression and multiple regression. Regression analysis is generally suitable for medium-term prediction [8][9]. Its main characteristics are: (1) the technique is relatively mature and the prediction process is simple; (2) the factors influencing the predicted object are decomposed and the changes in each factor are examined, from which the quantitative state of the predicted object is estimated; (3) the regression model has large errors and extrapolates poorly. When the influencing factors are complicated, or data on the relevant factors cannot be obtained, the error of the regression model cannot be corrected even at the cost of more computation and complexity. Regression analysis requires a large sample with a well-behaved distribution. When the prediction horizon is longer than the span of the original data, the method cannot in theory guarantee the accuracy of the results. The quantitative results may also contradict the results of qualitative analysis, and sometimes no suitable form of regression equation can be found [10].
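As an illustration of the single-variable case described above, the following minimal sketch fits a line by ordinary least squares and extrapolates one step. The function name `fit_line` and the data are hypothetical and are not taken from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b

# Hypothetical yearly prices (RMB/m^2), for illustration only
years = [1, 2, 3, 4, 5]
prices = [2100.0, 2300.0, 2500.0, 2700.0, 2900.0]

a, b = fit_line(years, prices)
forecast = a + b * 6           # extrapolate one year beyond the data
print(a, b, forecast)
```

Because the forecast lies outside the observed range, it inherits the extrapolation weakness noted in point (3).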

Prediction Based on Gray System Theories
Gray system theory was proposed by Professor Deng Julong in the 1980s and was first applied in control theory. Its basic idea is to integrate known time-correlated data sets, event sets, decision sets, and relation sets, combine them according to certain rules into a dynamic or non-dynamic whole, form white modules, and solve for the future gray modules through suitable transformations. Gray theory holds that although an objective system appears complicated and its data discrete and chaotic, the system has an overall function and is always ordered, so it must obey certain internal rules [11]. Unlike traditional methods for processing randomly varying data sequences, gray theory treats every random variable as a gray variable that changes within a certain range. Gray variables are not studied through large-sample statistical analysis. Instead, the chaotic original data is processed (generated and restored) into generated data with strong regularity, and this generated data is studied. In other words, gray system theory builds a model of the generated data rather than of the original data. By processing the original data and building a gray model, gray prediction discovers the development rules of the system and makes scientific, quantitative predictions of its future states. The prediction model is an exponential function, so if the predicted quantity develops exponentially, the prediction is highly accurate [12]. In practice, the model has succeeded in many fields, but there are also cases with large prediction deviations. The key factors affecting the accuracy and adaptability of the model are the construction of the background values and the choice of the initial value in the prediction formula.
Some studies propose new formulas for constructing the background values and for choosing the initial value, which improve the accuracy of the prediction model to some extent. The improvement is not complete, however, and further study is needed [13]. We used the GM(1,1) model, programmed in MATLAB [2,7,14] (see appendix 2 for the program), to predict 10 years of house prices in Chongqing. The actual and predicted values are shown in the following table. The comparison shows that the errors between the actual and predicted values are large.
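The paper's own GM(1,1) program is in its appendix and is not reproduced here. The sketch below shows the standard GM(1,1) procedure in Python under common textbook assumptions: background values are the means of adjacent accumulated values, and the initial value is the first observation. The function names and the sample series are hypothetical.

```python
import math

def gm11_fit(x0):
    """Estimate the GM(1,1) parameters (a, b) from the raw series x0."""
    # Accumulated generating operation (1-AGO): running sum of the raw data
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values: mean of adjacent accumulated values
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    # Least squares for x0(k) = -a * z(k) + b, i.e. regress y on u = -z
    u = [-v for v in z]
    y = x0[1:]
    mean_u = sum(u) / len(u)
    mean_y = sum(y) / len(y)
    suu = sum((ui - mean_u) ** 2 for ui in u)
    suy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, y))
    a = suy / suu
    b = mean_y - a * mean_u
    return a, b

def gm11_predict(x0, a, b, steps):
    """Fitted values for x0 followed by `steps` forecasts."""
    def x1_hat(k):
        # Time response of the accumulated series; k = 0 is the first observation
        return (x0[0] - b / a) * math.exp(-a * k) + b / a
    values = [x0[0]]
    for k in range(1, len(x0) + steps):
        values.append(x1_hat(k) - x1_hat(k - 1))   # inverse AGO
    return values

# Hypothetical series growing about 10% per step, for illustration only
series = [100 * 1.1 ** k for k in range(6)]
a, b = gm11_fit(series)
pred = gm11_predict(series, a, b, steps=1)
print(a, b, pred[-1])
```

Since the fitted response is exponential, the sketch tracks a series like this one closely; for data that does not grow exponentially the deviations can be large, as the comparison above reports.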

Prediction Based on Elman Neural Network
An Elman neural network generally has four layers: an input layer, a hidden layer, a context layer, and an output layer. The connections among the input, hidden, and output layers resemble those of a feedforward network. Units in the input layer only transmit signals, and units in the output layer perform linear weighting. The transfer function of the hidden units can be linear or nonlinear. The context layer (also called the receiving layer or state layer) memorizes the output of the hidden units at the previous time step and feeds it back to the network input, so it can be regarded as a one-step delay operator. The defining feature of the Elman network is that the output of the hidden layer is fed back to the input of the hidden layer through the delay and storage of the context layer. This self-connection makes the network sensitive to historical state data, and the internal feedback increases its ability to process dynamic information, which makes dynamic modeling possible. The Elman network can also approximate any nonlinear mapping to arbitrary accuracy. The specific way external noise affects the system need not be considered: the system can be modeled once its input-output data pairs are given [15][16].
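The role of the context layer as a one-step delay can be sketched as follows. This is only an illustrative forward pass with hand-picked, hypothetical weights and a tanh hidden transfer function; training is omitted and nothing here comes from the paper's experiments.

```python
import math

def elman_step(x, context, w_in, w_ctx, w_out):
    """One time step of an Elman network: returns (output, new_context)."""
    hidden = []
    for j in range(len(w_in)):
        s = sum(w_in[j][i] * x[i] for i in range(len(x)))
        # Feedback: hidden outputs of the previous step, stored in the context layer
        s += sum(w_ctx[j][k] * context[k] for k in range(len(context)))
        hidden.append(math.tanh(s))
    # Linear output layer
    output = [sum(row[j] * hidden[j] for j in range(len(hidden))) for row in w_out]
    return output, hidden      # the hidden outputs become the next context

# Hypothetical weights: 1 input, 2 hidden units, 1 output
w_in = [[0.5], [-0.3]]
w_ctx = [[0.1, 0.2], [0.0, 0.4]]
w_out = [[1.0, -1.0]]

context = [0.0, 0.0]
y1, context = elman_step([1.0], context, w_in, w_ctx, w_out)
y2, context = elman_step([1.0], context, w_in, w_ctx, w_out)
print(y1, y2)
```

The same input is presented twice, yet the two outputs differ because the second step also sees the stored hidden state. This is the sensitivity to historical state data described above.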

Prediction Based on Neural Network
Artificial neural networks (ANN) are an emerging interdisciplinary field of theoretical research. An ANN can represent nonlinear relations and can learn, which provides new ideas and methods for many practical problems with complicated uncertainties and time-dependent features. In the field of prediction, Lapedes and Farber first used neural networks for prediction in 1987, opening up ANN-based prediction.
Most economic models established on the basis of econometrics are linear. In theory, a neural network can approximate any nonlinear function and can be adjusted freely, so it can solve nonlinear prediction problems effectively. The learning capability of an ANN is used to train the network on a large number of samples, adjusting its connection weights and thresholds. The resulting model can then be used for prediction. A neural network learns from data samples automatically, without complicated queries or explicit expressions, and approaches the function that best captures the regularities in the sample data, whatever form that function takes. The more complicated the functional form exhibited by the system, the more pronounced this advantage. The basic idea of the error back-propagation (BP) algorithm is to adjust the connection weights and thresholds of the network by propagating the network error backward so as to minimize the error. Learning consists of forward computation and error back-propagation. A simple three-layer ANN model can realize any complicated nonlinear mapping from input to output. As Kolmogorov proved theoretically, a neural network with one hidden layer (assuming enough hidden nodes) can express any continuous function to any accuracy; K. Hor et al. also verified that neural networks can approximate a large class of functions arbitrarily well and can reveal nonlinear relations hidden in data samples [17]. In recent years, the neural network model, as a nonlinear and data-driven method, has been used to study prediction problems. It has been applied successfully in many fields, including economic areas such as economic forecasting, fiscal analysis, loan and mortgage evaluation, and bankruptcy prediction.

Analysis of Advantages of Neural Network Prediction Method
Neural networks have been applied in many areas and from diverse perspectives [18][19][20][21].
(1) The modeling method is different. Quantitative prediction is generally built on statistical analysis, which requires complete and definite original data. In real systems, statistical data is often incomplete and fuzzy. A neural network is data-driven and uses black-box modeling, without needing prior (statistical) knowledge. In a complicated data environment without accurate and complete information sources, it can extract the features of the data by adjusting its own structure and then predict the future effectively.
(2) General prediction methods cannot learn from data samples or recognize patterns in them during modeling. Their modeling is a process of abstracting the original data, carried out through numerical computation, and the result can be expressed as a complete analytical formula. A neural network model, by contrast, is a kind of "image" memory: through learning, the network "memorizes" the internal relations in the data. A trained neural network model is not meant to explain the relation between input and output.
(3) Once established, traditional prediction models have a fixed structure and are limited in the scale of problems they can handle, whereas the actual prediction environment is complicated and changing. Faced with such an environment, traditional models cannot adapt to new relations among the variables of the real system. A neural network model has a variable structure and is capable of self-adaptation and parallel processing, so it has strong data processing ability. By learning from new samples, it can adjust its internal structure and adapt to changes in the system variables. For nonlinear, high-dimensional, and high-order problems, the neural network is therefore more capable.
(4) General prediction methods have no process of learning from samples, so their fit to the samples is relatively poor. The learning capability of a neural network model lets it fit the samples as closely as possible, and the model is also fault tolerant. Through training, the relations among the system variables can be fitted closely, so the mutual influences among them can be analyzed. The rules governing these relations can then be found and applied to decision making on practical problems. To reflect the behavior of a system correctly in different stages, a traditional prediction model must introduce different dummy variables to correct the model. An ANN, by contrast, can simulate the structure and functions of the brain's neural system, such as information processing and retrieval, to different degrees and at different levels. It adapts well to large amounts of unstructured and imprecise rules and is characterized by information memory, autonomous learning, knowledge inference, and optimized computation. Its self-learning and self-adaptation set it apart from conventional algorithms and expert systems.
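In general, the learning algorithm consists of forward propagation and back-propagation. Suppose the input vector u of the network has n dimensions, the output vector y has m dimensions, and L input/output sample pairs are given. The BP learning algorithm is implemented by the following steps:
(1) An initial weight set W(0) is chosen, consisting of small random non-zero values.
(2) An input/output sample pair is presented and the output of the network is computed. Let the input of the p-th sample be $u_p = (u_{1p}, u_{2p}, \cdots, u_{np})$ and the desired output be $d_p = (d_{1p}, d_{2p}, \cdots, d_{mp})$, $p = 1, 2, \cdots, L$. The output of node i for the p-th sample is
$$y_{ip}(t) = f[x_{ip}(t)] = f\Big[\sum_j w_{ij}(t)\, I_{jp}\Big] \quad (1)$$
where $I_{jp}$ denotes the j-th input of node i for the p-th sample and $f$ is a differentiable S-type (sigmoid) activation function, $f(x) = \frac{1}{1+e^{-x}}$. The output of the network output nodes is computed from the input layer via the hidden layer to the output layer.
(3) The network target function J is computed. Let $J_p$ be the target function of the network for the p-th sample. Here, the L2 norm is selected, so that
$$J_p(t) = \frac{1}{2}\,\| d_p - y_p(t) \|^2 = \frac{1}{2}\sum_k [d_{kp} - y_{kp}(t)]^2 = \frac{1}{2}\sum_k e_{kp}^2(t) \quad (2)$$
where $y_{kp}(t)$ is the network output for the p-th sample after t weight adjustments and k indexes the nodes of the output layer. The total target function of the network,
$$J(t) = \sum_p J_p(t) \quad (3)$$
is taken as the measure of how well the network has learned.
(4) If $J(t) \le \varepsilon$, the algorithm ends; otherwise it proceeds to step (5). Here $\varepsilon > 0$ is determined in advance.
(5) Back-propagation is computed. Starting from the output layer, the weights are adjusted layer by layer along the negative gradient of J. With a step size $\eta$, the weights after the (t+1)-th adjustment are
$$w_{ij}(t+1) = w_{ij}(t) - \eta\,\frac{\partial J(t)}{\partial w_{ij}(t)} = w_{ij}(t) - \eta \sum_p \frac{\partial J_p(t)}{\partial w_{ij}(t)} \quad (4)$$
By the chain rule,
$$\frac{\partial J_p}{\partial w_{ij}} = \frac{\partial J_p}{\partial x_{ip}} \cdot \frac{\partial x_{ip}}{\partial w_{ij}} \quad (5)$$
Define
$$\delta_{ip} = \frac{\partial J_p}{\partial x_{ip}} \quad (6)$$
where $\delta_{ip}$ is the sensitivity of $J_p$ to the state $x_{ip}$ of the i-th node. From Formula (5) and Formula (6):
$$\frac{\partial J_p}{\partial w_{ij}} = \delta_{ip}\, I_{jp} \quad (7)$$
$\delta_{ip}$ can be computed in the following two cases.
1. If i is an output node, namely i = k, it follows from Formula (2) and Formula (6) that
$$\delta_{kp} = \frac{\partial J_p}{\partial y_{kp}} \cdot \frac{\partial y_{kp}}{\partial x_{kp}} = -e_{kp}\, f'(x_{kp}) \quad (8)$$
Substituting Formula (8) into Formula (7) gives
$$\frac{\partial J_p}{\partial w_{kj}} = -e_{kp}\, f'(x_{kp})\, I_{jp} \quad (9)$$
2. If i is not an output node, namely i is not equal to k, Formula (6) becomes
$$\delta_{ip} = \frac{\partial J_p}{\partial y_{ip}} \cdot \frac{\partial y_{ip}}{\partial x_{ip}} = \frac{\partial J_p}{\partial y_{ip}}\, f'(x_{ip}) \quad (10)$$
with
$$\frac{\partial J_p}{\partial y_{ip}} = \sum_{m_1} \frac{\partial J_p}{\partial x_{m_1 p}} \cdot \frac{\partial x_{m_1 p}}{\partial y_{ip}} = \sum_{m_1} \frac{\partial J_p}{\partial x_{m_1 p}} \cdot \frac{\partial}{\partial y_{ip}} \sum_j w_{m_1 j}\, I^*_{jp} = \sum_{m_1} \delta_{m_1 p}\, w_{m_1 i} \quad (11)$$
In the formula, $m_1$ is the $m_1$-th node of the layer following node i and $I^*_{jp}$ is the j-th input of node $m_1$. When i = j, $y_{ip} = I^*_{jp}$. Substituting Formula (10) and Formula (11) into Formula (5) gives
$$\frac{\partial J_p}{\partial w_{ij}} = f'(x_{ip})\, I_{jp} \sum_{m_1} \delta_{m_1 p}\, w_{m_1 i} \quad (12)$$
Formula (9) and Formula (12) can then be used for the weight adjustment in Formula (4).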
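The BP learning steps can be sketched in plain Python as follows. This is a minimal illustration, not the paper's MATLAB program: it assumes one hidden layer, sigmoid units throughout, thresholds modeled as weights on a constant input of 1, and batch updates summed over all samples. The function names, the step size, and the toy data (the logical OR function) are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(w_hid, w_out, u):
    """Forward pass; a constant input of 1.0 plays the role of the threshold."""
    inp = list(u) + [1.0]
    hid = [sigmoid(sum(w * v for w, v in zip(row, inp))) for row in w_hid] + [1.0]
    out = [sigmoid(sum(w * v for w, v in zip(row, hid))) for row in w_out]
    return inp, hid, out

def train_bp(samples, n_hidden, eta=0.5, eps=1e-3, max_iter=3000, seed=0):
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    # Initial weights: small random values
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    history = []
    for _ in range(max_iter):
        g_hid = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        g_out = [[0.0] * (n_hidden + 1) for _ in range(n_out)]
        total = 0.0
        for u, d in samples:
            inp, hid, out = forward(w_hid, w_out, u)
            err = [dk - yk for dk, yk in zip(d, out)]
            total += 0.5 * sum(e * e for e in err)      # squared-error target function
            # Output nodes: delta_k = -e_k * f'(x_k), where f'(x) = y * (1 - y)
            d_out = [-e * y * (1.0 - y) for e, y in zip(err, out)]
            # Hidden nodes: delta_i = f'(x_i) * sum_k delta_k * w_ki
            d_hid = [hid[i] * (1.0 - hid[i])
                     * sum(d_out[k] * w_out[k][i] for k in range(n_out))
                     for i in range(n_hidden)]
            # Accumulate gradients delta * input over all samples
            for k in range(n_out):
                for j in range(n_hidden + 1):
                    g_out[k][j] += d_out[k] * hid[j]
            for i in range(n_hidden):
                for j in range(n_in + 1):
                    g_hid[i][j] += d_hid[i] * inp[j]
        history.append(total)
        if total <= eps:                                 # stopping test
            break
        # Gradient descent update of all weights
        for k in range(n_out):
            for j in range(n_hidden + 1):
                w_out[k][j] -= eta * g_out[k][j]
        for i in range(n_hidden):
            for j in range(n_in + 1):
                w_hid[i][j] -= eta * g_hid[i][j]
    return w_hid, w_out, history

# Hypothetical toy data: the logical OR function
samples = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
           ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]
w_hid, w_out, history = train_bp(samples, n_hidden=3)
print(history[0], history[-1])
```

The list `history` records the total target function before each weight adjustment, so comparing its first and last entries shows whether training reduced the error.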

House Price Prediction Model of BP Neural Network
House prices are influenced by many factors, so many factors must be considered when building a house price prediction model. The BP algorithm handles multi-input nonlinear systems well, so it is used for prediction here.

Principles of BP Neural Network Prediction Model
In applications such as system modeling, identification, and prediction, a linear system is well served by existing tools. In the frequency domain, its black-box input/output model can be expressed effectively with transfer function matrices; in the time domain, the ARMA model, combined with various parameter estimation methods, describes the input/output behavior systematically. The prediction problem for linear systems is thus well solved. For a nonlinear system, a nonlinear autoregressive moving average model is generally used for prediction, but a suitable parameter estimation method for this model is hard to find. Identification of nonlinear systems by traditional means is therefore difficult both in theory and in practice. Neural networks have a clear advantage here. Because they can approximate any nonlinear mapping through learning, they can be applied to the modeling and identification of nonlinear systems. They are not restricted to a particular form of nonlinear model, and their learning algorithms are easy to implement in engineering. By learning from a large amount of scattered experimental data, a neural network extracts domain knowledge and expresses it as the magnitudes and distribution of its connection weights. A system model reflecting the internal rules of the actual process can thus be established, providing an effective solution for technical and information modeling. The prediction modeling process based on a BP network is shown in Fig.1.
Fig.1 Flow chart of the predictive modeling system
As shown in Fig.1, the sample data set obtained from the experiment is used as the training sample to train the neural network. Training yields the neural network model together with its knowledge parameters, the weights and thresholds. The test data is then fed into the trained network with these weights and thresholds, and the required prediction output is obtained.
As an effective intelligent information processing technology, neural networks model the internal associations in data, approximate nonlinear functions well, and can process chaotic information comprehensively, which makes them a powerful tool for artificial intelligence research. They have achieved remarkable results and been applied widely in fields such as signal processing, pattern recognition, and control. Because a neural network has strong nonlinear approximation ability together with self-learning and self-adaptation, it does not require explicit relations or a mathematical model of the complicated nonlinear system. Many limits and difficulties of traditional quantitative prediction methods can thus be overcome, and the influence of many human factors can be avoided. For these reasons, neural network technology is applied here to house price prediction, where it offers particular advantages in the soundness and applicability of the prediction model. A neural network model features massive parallelism, distributed storage, structural variability, a high degree of nonlinearity, self-learning, and self-organization. It can approach the function that most effectively describes the regularities in the sample data, whatever form that function takes. Its broad adaptability, learning ability, and mapping ability give it an advantage in prediction, because it learns the dependence relations within the data. Traditional methods can be very accurate because they rely on deriving a mathematical model and optimizing its parameters, but they are limited for the same reason.

Basic Steps for Application of BP Neural Network in House Price Prediction
House price prediction with the BP neural network is divided into training the network and predicting with it. The specific steps are as follows:
(1) Training sample data is selected. House prices are influenced by many economic factors, and their future trends may also be affected by human intervention, governmental regulation, and the like. Proper house price sample data must therefore be selected; improper data reduces the prediction ability of the network.
(2) The house price sample data is pre-processed. To avoid network paralysis caused by overly large raw values, the house price sample data is normalized before prediction. Because the values to be predicted also vary over a large range, they are not used directly in the network and are normalized as well. The data is normalized to [-1,1], which smooths the data and reduces noise in the prediction results.
(3) Training samples are constructed. House price prediction is based on a large amount of historical data. Here, we mainly selected GDP, land price, the RMB exchange rate (USD to RMB), urban population, and per capita disposable income of urban residents, all of which could influence house prices, for 2000-2009. Data from 2000-2008 was taken as the training sample and the data of 2009 as the testing sample.
(4) A BP neural network with a 3-layer structure is used to establish the prediction model.
(5) The network model is tested. The testing sample is used to test the trained model.
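The normalization in step (2) can be sketched as a min-max mapping into [-1,1] whose parameters are fitted on the training years and then reused for the testing year. This is a minimal sketch and not the paper's appendix program; the function names and the values are hypothetical.

```python
def fit_minmax(values, low=-1.0, high=1.0):
    """Return parameters that map [min(values), max(values)] onto [low, high]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        raise ValueError("cannot normalize a constant series")
    scale = (high - low) / (vmax - vmin)
    return vmin, scale, low

def normalize(v, params):
    vmin, scale, low = params
    return low + (v - vmin) * scale

def denormalize(y, params):
    vmin, scale, low = params
    return vmin + (y - low) / scale

# Hypothetical training-year values of one influencing factor
train_values = [1200.0, 1500.0, 1900.0, 2400.0, 3000.0]
params = fit_minmax(train_values)

scaled = [normalize(v, params) for v in train_values]
restored = [denormalize(y, params) for y in scaled]
test_scaled = normalize(3300.0, params)   # a testing-year value outside the training range
print(scaled, test_scaled)
```

A testing-year value that exceeds the training maximum maps slightly above 1. This matters if the output transfer function is bounded, because the network cannot then produce such a value.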

Implementation of House Price Prediction with BP Neural Network
The influencing factors should be selected scientifically and rationally and must be representative and of practical significance. House prices are affected by many factors, and it is neither practical nor meaningful to study all of them. Some factors can only be analyzed qualitatively, such as factors tied to the house itself, housing supply and demand, and some national macro-regulation policies; others can be analyzed quantitatively, such as GDP, land price, interest rate, and per capita disposable income. The factors selected should both influence house prices and be easy to quantify. Here, we study house prices in Chongqing only in terms of GDP, land price, the RMB exchange rate (USD to RMB), urban population, and per capita disposable income of urban residents (see table 2 for relevant data). A BP neural network model with five inputs, one hidden layer, and one output is established. Too many hidden nodes lead to overly long learning times. Too few hidden nodes give low fault tolerance and a poor ability to recognize samples not seen in training. The number of hidden nodes is therefore generally determined from experience with the formula $n_1 = \sqrt{n + n_0} + a$, where $n_1$ is the number of nodes in the hidden layer, $n$ is the number of input nodes, $n_0$ is the number of output nodes, and $a$ is a constant between 1 and 10. Data from 2010 to 2018 was taken as the training set to train the neural network and obtain the required model. The house price in 2019 is assumed to be unknown. The influencing factor variables of 2019 were then input into the trained model to obtain the predicted value, and the error between the predicted and actual values was checked against the expected error range.
First, the original data is normalized (the process is shown in the appendix). The processed data, with the average house price in RMB/m^2, is shown in the table. Design and training of the network: the network is designed as a BP neural network with one hidden layer. There are 5 neurons in the input layer and 1 neuron in the output layer. Based on previous experience, the number of neurons in the hidden layer was finally set to 10. After normalization, the input vectors of the network lie in [0,1]. Hence, tansig is used as the transfer function of the hidden neurons and logsig as the transfer function of the output neuron, which satisfies the requirement that the network output lie within [0,1]. The BP neural network was then designed and trained in MATLAB. The program output is Y_test = 0.9991. The predicted value obtained by the BP neural network model is 0.9991, an accuracy of 99.91%, which shows that the house price prediction obtained by the BP neural network is relatively good. It is thus clear that house price prediction with the BP neural network is highly reliable and relatively accurate.
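The network structure described above can be sketched in Python as follows. The sketch assumes that the empirical rule for the hidden layer is the square root of the sum of input and output nodes plus a constant between 1 and 10, and it uses the standard definitions tansig(x) = tanh(x) and logsig(x) = 1/(1 + e^(-x)). The weights are random placeholders, not the trained MATLAB weights, and the input vector is hypothetical.

```python
import math
import random

def hidden_node_range(n_in, n_out):
    """Integer hidden-layer sizes allowed by sqrt(n_in + n_out) + a, with 1 <= a <= 10."""
    base = math.sqrt(n_in + n_out)
    return math.ceil(base + 1), math.floor(base + 10)

def tansig(x):
    return math.tanh(x)

def logsig(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(x, w_hid, b_hid, w_out, b_out):
    """Forward pass of a network with a tansig hidden layer and one logsig output."""
    hidden = [tansig(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w_hid, b_hid)]
    return logsig(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

low, high = hidden_node_range(5, 1)

# Placeholder weights for a 5-10-1 network (untrained, for illustration only)
rng = random.Random(0)
w_hid = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(10)]
b_hid = [rng.uniform(-1, 1) for _ in range(10)]
w_out = [rng.uniform(-1, 1) for _ in range(10)]
b_out = rng.uniform(-1, 1)

x = [0.2, 0.4, 0.6, 0.8, 1.0]   # hypothetical normalized influencing factors
y = predict(x, w_hid, b_hid, w_out, b_out)
print(low, high, y)
```

Under this assumption the rule allows between 4 and 12 hidden nodes for five inputs and one output, so the choice of 10 falls inside the range. The logsig output always lies strictly between 0 and 1, matching the normalized target range.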

Conclusions
An artificial neural network (ANN) is a network based on a large-scale parallel structure and distributed information storage, with good fault tolerance and robustness. The BP neural network is the most widely applied type of ANN. Capable of self-organization and self-learning, it has been widely applied in prediction, simulation, and other areas. The paper reviews the status quo of house prices at home and abroad and the theory of ANN and BP neural networks, and analyzes in detail the factors influencing house prices. We collected, analyzed, and processed 10 years of house prices in Chongqing and their influencing factors and then used the BP neural network model for house price prediction. The result verifies that predicting house prices with the BP neural network is feasible and highly accurate. The prediction method adopted in the paper is helpful to some extent for guiding house prices. With the continuous progress of modern science in China and the deepening of theoretical knowledge, new prediction methods will lay a solid theoretical basis and provide better mathematical models for prediction. We are convinced that prediction accuracy will improve markedly as house price prediction studies deepen.