A Multimodal Knowledge Graph Recommendation Method with a Modality-Adaptive Fusion Mechanism
Modality-Adaptive Fusion with Knowledge Graph Attention Network for Multi-Modal Recommendation
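Abstract (translated from the Chinese): Knowledge graphs (KGs) are widely used to improve recommender system performance, particularly by alleviating data sparsity. However, existing multimodal KG recommendation methods mostly rely on fixed or simple fusion strategies (such as concatenation or linear weighting), which struggle to model the dynamic contribution of each modality to recommendation decisions and limit the effective use of heterogeneous information. To address this, we propose a novel Modality-Adaptive Fusion Knowledge Graph Attention Network (MAF-KGAT). The model introduces a modality-adaptive gating mechanism in which a gating network dynamically learns the fusion weights of image features and structural embeddings, enabling finer-grained modal fusion. We also use a pre-trained CLIP image encoder to extract discriminative visual features, and fuse multimodal information on top of the KGAT architecture to strengthen graph propagation. On the AmazonBooks and MovieLens datasets, MAF-KGAT improves Recall@20 and NDCG@20 by 4.2% and 3.5% (AmazonBooks) and by 3.6% and 2.9% (MovieLens), respectively, and performs especially well in cold-start scenarios, confirming its effectiveness and robustness for personalized recommendation. The method has been validated on e-commerce and entertainment platforms, where it effectively alleviates data sparsity.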
Abstract: Knowledge graphs (KGs) have been widely used to improve the performance of recommender systems, especially by alleviating data sparsity. However, most existing multimodal knowledge graph recommendation methods adopt fixed or simple fusion strategies (such as concatenation or linear weighting). These strategies cannot fully model the dynamic contributions of different modalities to recommendation decisions, which limits the effective use of heterogeneous information. To this end, this paper proposes a novel Modality-Adaptive Fusion Knowledge Graph Attention Network (MAF-KGAT). The model introduces a modality-adaptive gating mechanism: a gating network dynamically learns the fusion weights of image features and structural embeddings to achieve more fine-grained modal fusion. We also use a pre-trained CLIP image encoder to extract discriminative visual features, and fuse multimodal information on top of the KGAT architecture to enhance graph propagation. Experiments on two real-world datasets, AmazonBooks and MovieLens, show that MAF-KGAT significantly outperforms existing multimodal and knowledge graph recommendation methods on metrics such as Recall and NDCG, with especially strong results in cold-start scenarios. These results support its effectiveness and robustness for personalized recommendation.
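As an illustration of the gating idea described in the abstract, the following is a minimal sketch of one common form of gated fusion, not the paper's actual formulation, which is not given here. It assumes the visual feature has already been projected to the same dimension as the structural embedding, and that the gate is a sigmoid over a linear map of the concatenated vectors. The names `gated_fusion`, `weights`, and `bias` are hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(visual, structural, weights, bias):
    """Fuse two same-length vectors with a per-dimension gate.

    Assumed form (not taken from the paper):
        gate  = sigmoid(W @ [visual; structural] + b)
        fused = gate * visual + (1 - gate) * structural
    `weights` has one row per output dimension, each row of length 2 * dim.
    """
    concat = list(visual) + list(structural)
    gate = [
        sigmoid(sum(w * x for w, x in zip(row, concat)) + b)
        for row, b in zip(weights, bias)
    ]
    fused = [g * v + (1.0 - g) * s for g, v, s in zip(gate, visual, structural)]
    return fused, gate


# Toy inputs: random stand-ins for a projected image feature and a KG embedding.
rng = random.Random(0)
dim = 4
visual = [rng.uniform(-1, 1) for _ in range(dim)]
structural = [rng.uniform(-1, 1) for _ in range(dim)]
weights = [[rng.gauss(0, 0.1) for _ in range(2 * dim)] for _ in range(dim)]
bias = [0.0] * dim

fused, gate = gated_fusion(visual, structural, weights, bias)
print("gate :", [round(g, 3) for g in gate])
print("fused:", [round(f, 3) for f in fused])
```

Because each gate value lies strictly between 0 and 1, every fused coordinate is a convex combination of the two modalities. Unlike concatenation or a fixed linear weighting, the mixing ratio here depends on the inputs themselves. In a trained model the gate parameters would be learned jointly with the recommendation objective.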
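Cite this article: Li Ruowen, Xu Xiaojing. A Multimodal Knowledge Graph Recommendation Method with a Modality-Adaptive Fusion Mechanism [J]. Computer Science and Application, 2026, 16(3): 26-39. https://doi.org/10.12677/csa.2026.163084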
