Semi-Supervised Node Classification Method Based on Graph Clustering Results
DOI: 10.12677/csa.2024.149183. Supported by research project funding.
Authors: Zhou Bao, Heng Liu: School of Computer Science and Technology, Anhui University of Technology, Ma'anshan, Anhui
Keywords: Machine Learning, Graph Convolutional Neural Network, Graph Clustering, Graph Node Classification
Abstract: Graph convolutional neural networks are regarded as one of the most effective semi-supervised methods for classifying complex real-world graph-structured data such as knowledge graphs, citation networks, and social networks. Their learning performance, however, degrades when labeled data are severely limited. To address this problem, this study proposes a node classification method guided by graph clustering results. Specifically, a data augmentation module is introduced to reduce noise in the graph structure information, a clustering-oriented graph embedding model is designed as the attributed graph clustering network, and node pseudo-labels are predicted from the clustering results. To improve classification performance, high-confidence pseudo-labels are selected to guide the node classification task, and a similarity loss is designed to increase the feature similarity between labeled and unlabeled nodes. Extensive experiments on benchmark datasets show that, compared with existing methods, the proposed method can overcome the limitation on the number of labels and achieves superior performance on graph node classification.
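The abstract names two mechanisms without giving their formulas: filtering pseudo-labels by confidence, and a similarity loss between labeled and unlabeled nodes. The sketch below is one plausible reading and is not the authors' implementation. It assumes that the clustering network outputs a soft assignment (one probability per cluster) for every node, that cluster ids have already been aligned with class ids, that confidence is the largest assignment probability compared against a hypothetical threshold, and that the similarity loss is the mean of one minus cosine similarity over same-class pairs. All function names and the threshold value are illustrative.

```python
import math


def select_confident_pseudo_labels(soft_assignments, threshold=0.9):
    """Keep nodes whose largest cluster probability reaches the threshold.

    soft_assignments: list of per-node probability lists (assumed input).
    Returns {node_index: cluster_id}. The threshold is a hypothetical value.
    """
    selected = {}
    for node, probs in enumerate(soft_assignments):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            selected[node] = best
    return selected


def cosine_similarity(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_loss(embeddings, labels, pseudo_labels):
    """Mean (1 - cosine similarity) over labeled/pseudo-labeled pairs
    that share a class. An assumed form, not the paper's definition."""
    terms = []
    for i, y in labels.items():
        for j, c in pseudo_labels.items():
            if i != j and y == c:
                sim = cosine_similarity(embeddings[i], embeddings[j])
                terms.append(1.0 - sim)
    return sum(terms) / len(terms) if terms else 0.0


# Toy data: four nodes, two clusters; node 3 is the only labeled node.
soft = [[0.95, 0.05], [0.60, 0.40], [0.08, 0.92], [0.50, 0.50]]
embeddings = [[0.6, 0.8], [0.7, 0.7], [0.0, 1.0], [1.0, 0.0]]
labels = {3: 0}

pseudo = select_confident_pseudo_labels(soft, threshold=0.9)
loss = similarity_loss(embeddings, labels, pseudo)
print(pseudo)  # nodes 0 and 2 pass the threshold: {0: 0, 2: 1}
print(loss)
```

In this toy run only the pair (node 3, node 0) shares a class, so the loss reflects how far apart those two embeddings point. Minimizing such a term during training would pull labeled nodes and confidently pseudo-labeled nodes of the same class toward each other.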
Citation: Bao, Z. and Liu, H. (2024) Semi-Supervised Node Classification Method Based on Graph Clustering Results. Computer Science and Application, 14(9), 12-22. https://doi.org/10.12677/csa.2024.149183
