Research on an Improved Latent Semantic Model Algorithm
Abstract:
This paper improves the traditional latent semantic model (also known as the latent factor model, LFM), one of the most common recommendation algorithms, by incorporating an overdue factor. Most traditional recommendation algorithms are trained and modeled on user feedback data. As the network era gives way to the data era, these algorithms may suffer from long training times, slow speed, and large errors when confronted with massive data. The traditional latent semantic model is implemented by matrix factorization. The main advantage of this approach is that it can improve recommendation accuracy as far as possible without requiring any knowledge of what the factors of the decomposed matrices represent. Its drawback is that the feature vectors must be optimized through repeated iterative training, and each training run may require a large training dimension and high complexity. These problems leave considerable room for improvement in both recommendation algorithms and the latent semantic model. In addition, people's interests are likely to change over time in real life. Ignoring time information can therefore affect the recommendation results, and the recommendation accuracy may no longer meet users' actual needs. To improve the efficiency of the latent semantic model, this paper incorporates an overdue factor, constructed from the properties of the logarithmic function and the inverse proportional function, into the model. Experiments on the MovieLens dataset compare the improved latent semantic model with the traditional one, using mean absolute error, root mean square error, and the loss function value as evaluation metrics. The results show that the improved algorithm reduces the training dimension, increases training speed, lowers training error, and improves recommendation accuracy, making it an effective improvement over the traditional latent semantic model algorithm.
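The ideas summarized in the abstract can be illustrated with a small sketch. The abstract does not give the exact form of the overdue factor or say where it enters the model, so everything specific here is an assumption made for illustration: the weight `1 / (1 + ln(1 + age))` is a hypothetical combination of a logarithm and an inverse proportion, it is applied to the error term of a plain stochastic gradient descent update, and the names `overdue_weight`, `train_lfm`, and the toy ratings are invented. It is not the paper's method, only one way such a time-aware matrix factorization and its MAE/RMSE evaluation might look.

```python
import math
import random


def overdue_weight(age_days):
    """Hypothetical overdue factor (assumed form, not taken from the paper).

    Combines a logarithm with an inverse proportion: the weight is 1.0 for a
    brand-new rating and shrinks slowly as the rating gets older.
    """
    return 1.0 / (1.0 + math.log1p(age_days))


def train_lfm(ratings, n_users, n_items, k=4, epochs=200, lr=0.02, reg=0.02, seed=0):
    """Factorize ratings into user factors P and item factors Q with SGD.

    ratings is a list of (user, item, rating, age_days) tuples. Each error
    term is scaled by the assumed overdue weight, so older ratings pull the
    factors less strongly than recent ones.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r, age in ratings:
            w = overdue_weight(age)
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Weighted gradient step with L2 regularization.
                P[u][f] += lr * (w * err * qi - reg * pu)
                Q[i][f] += lr * (w * err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def mae(pairs):
    """Mean absolute error over (actual, predicted) pairs."""
    return sum(abs(r - p) for r, p in pairs) / len(pairs)


def rmse(pairs):
    """Root mean square error over (actual, predicted) pairs."""
    return math.sqrt(sum((r - p) ** 2 for r, p in pairs) / len(pairs))


# Invented toy data: (user, item, rating, age of the rating in days).
toy = [
    (0, 0, 5.0, 1), (0, 1, 3.0, 200),
    (1, 0, 4.0, 30), (1, 2, 1.0, 400),
    (2, 1, 2.0, 5), (2, 2, 5.0, 0),
]

P, Q = train_lfm(toy, n_users=3, n_items=3)
# In-sample evaluation only; a real experiment would use a held-out test set.
pairs = [(r, predict(P, Q, u, i)) for u, i, r, _ in toy]
print("MAE:", mae(pairs), "RMSE:", rmse(pairs))
```

In this sketch the weight only rescales each sample's gradient, which leaves the rest of the factorization unchanged. Other placements, such as decaying the rating itself or the regularization term, would be equally plausible readings of the abstract.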