Analysis and Prediction of Real Estate Price Influencing Factors Based on Variable Selection
DOI: 10.12677/SA.2022.113062. Supported by research project funding.
Author: Xinlin Lai (赖新林): Yunnan University of Finance and Economics, Kunming, Yunnan
Keywords: Influencing Factors; House Price Forecast; Stepwise Selection; Lasso Regression
Abstract: Real estate is a pillar industry of China's national economy, and the housing prices it gives rise to have long been a major livelihood issue. In recent years, housing prices in large, medium-sized, and small Chinese cities have generally remained high, making it increasingly difficult for ordinary people to buy a home, so studying the factors that influence housing prices is necessary. Based on historical transaction data for second-hand houses in Beijing from January 2017 to January 2018, this paper models and analyzes Beijing housing prices along fifteen dimensions. To identify the main factors affecting housing prices, and thereby estimate and predict the real estate market effectively, we first model the price data with multiple linear regression and then perform variable selection on the initial fifteen predictors using stepwise selection and Lasso regression. Finally, by extracting the useful information from each model and comparing the explanatory and predictive performance of the different models, we find that the factors with a strong influence on housing prices are: community average price, floor area, renovation condition, number of followers of the listing, and transaction time. In terms of predictive performance, stepwise regression and Lasso regression show a noticeable advantage over multiple linear regression.
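To illustrate how Lasso regression can perform the kind of variable selection the abstract describes, the following is a minimal sketch rather than the paper's implementation. It uses synthetic data (not the Beijing transaction data), six hypothetical predictors of which only the first two truly affect the response, a hand-picked penalty `lam = 0.3`, and a plain cyclic coordinate-descent solver; all function and variable names are assumptions made for this example.

```python
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator used in the Lasso coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def make_data(n=300, p=6, seed=0):
    """Synthetic data (an assumption): only predictors 0 and 1 matter."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0, 0.5) for row in X]
    return X, y


def standardize_columns(X):
    """Return the columns of X, each centered and scaled to unit variance."""
    n, p = len(X), len(X[0])
    cols = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        std = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
        cols.append([(v - mean) / std for v in col])
    return cols


def center(y):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(y) / len(y)
    return [v - mean for v in y]


def lasso_cd(cols, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    Cyclic coordinate descent; assumes standardized columns and centered y,
    so each coordinate update reduces to a single soft-threshold step.
    """
    n, p = len(y), len(cols)
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, kept up to date
    for _ in range(n_iter):
        for j in range(p):
            xj = cols[j]
            # Correlation of x_j with the partial residual (x_j's own fit excluded)
            rho = sum(xj[i] * resid[i] for i in range(n)) / n + beta[j]
            new_bj = soft_threshold(rho, lam)
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * xj[i]
                beta[j] = new_bj
    return beta


def fit_demo(lam=0.3):
    """Fit the Lasso on the synthetic data and return the coefficients."""
    X, y = make_data()
    return lasso_cd(standardize_columns(X), center(y), lam)


if __name__ == "__main__":
    for j, b in enumerate(fit_demo()):
        print(f"predictor {j}: coefficient = {b:.3f}")
```

In this sketch the L1 penalty sets the coefficients of the irrelevant predictors exactly to zero while keeping, in shrunken form, those of the informative ones; this is the mechanism by which Lasso acts as a variable selector. In practice the penalty would normally be chosen by cross-validation rather than fixed by hand.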
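Cite this article: Xinlin Lai. Analysis and Prediction of Real Estate Price Influencing Factors Based on Variable Selection [J]. Statistics and Application, 2022, 11(3): 583-594. https://doi.org/10.12677/SA.2022.113062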
