Graph Attention-Based Multi-Instance Multi-Label Learning with Correlation Modeling
DOI: 10.12677/csa.2025.1512336. Supported by the National Natural Science Foundation of China.
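Authors: Wu Jiaxin (吴嘉欣), Zhang Jian (张健)*: School of Computer Science and Technology / School of Artificial Intelligence, China University of Mining and Technology, Xuzhou, Jiangsu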
Keywords: Multi-Instance Multi-Label Learning, Graph Attention Network, Correlation Modeling, Contrastive Learning
Abstract: Multi-instance multi-label learning (MIML) is well suited to tasks such as image classification and text annotation, but existing methods still fall short in modeling the complex dependencies among instances and the correlations among labels. This paper proposes GAMIL-C, a correlation-modeling MIML method that combines graph attention with contrastive learning. First, the instances in each bag are organized into a graph, and a Graph Attention Network (GAT) models the structural relationships among them, with edge features encoding pairwise instance similarity. Second, a correlation-aware module in the label space captures high-order dependencies among labels to improve prediction accuracy. Third, a contrastive learning strategy imposes discriminative constraints between instance representations and label representations, which improves generalization. Experiments on several public datasets show that GAMIL-C significantly outperforms mainstream baselines in prediction accuracy and macro-averaged metrics. These results indicate that the method captures both instance-level and label-level correlations, and that contrastive learning further improves robustness, offering a new approach to MIML in complex scenarios.
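To make the pipeline in the abstract more concrete, the following is a minimal sketch in plain Python and not the authors' implementation. It assumes a single-head GAT-style layer, edges defined by thresholded cosine similarity (with self-loops), the similarity reused as a scalar edge feature added to the attention logit, mean pooling to obtain a bag embedding, and an InfoNCE-style loss between the bag embedding and label embeddings. All names (`gat_layer`, `info_nce`, `edge_weight`, `threshold`, `tau`) and all numeric settings are hypothetical. The label-space correlation module is not covered here.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def leaky_relu(z, slope=0.2):
    return z if z > 0 else slope * z

def gat_layer(bag, W, a_src, a_dst, edge_weight=1.0, threshold=0.0):
    """One single-head GAT-style pass over the instances of a bag.

    Assumption: instance i attends to instance j if i == j or their
    cosine similarity exceeds `threshold`; the similarity also enters
    the attention logit as a scalar edge feature.
    """
    h = [[dot(row, x) for row in W] for x in bag]  # linear projection W x
    out, attn = [], []
    for i in range(len(bag)):
        logits = {}
        for j in range(len(bag)):
            sim = cosine(bag[i], bag[j])
            if i == j or sim > threshold:
                e = dot(a_src, h[i]) + dot(a_dst, h[j]) + edge_weight * sim
                logits[j] = leaky_relu(e)
        m = max(logits.values())  # subtract max for a stable softmax
        exp = {j: math.exp(v - m) for j, v in logits.items()}
        z = sum(exp.values())
        alpha = {j: v / z for j, v in exp.items()}
        attn.append(alpha)
        out.append([sum(alpha[j] * h[j][k] for j in alpha)
                    for k in range(len(W))])
    return out, attn

def info_nce(bag_emb, label_embs, positives, tau=0.5):
    """InfoNCE-style loss: pull the bag embedding toward its positive
    label embeddings and away from the remaining labels."""
    sims = [cosine(bag_emb, e) / tau for e in label_embs]
    m = max(sims)
    log_z = m + math.log(sum(math.exp(s - m) for s in sims))
    return sum(log_z - sims[p] for p in positives) / len(positives)

# Toy data: one bag of 5 instances (dim 4), hidden dim 3, 4 labels.
random.seed(0)
bag = [[random.gauss(0, 1) for _ in range(4)] for _ in range(5)]
W = [[random.gauss(0, 0.5) for _ in range(4)] for _ in range(3)]
a_src = [random.gauss(0, 0.5) for _ in range(3)]
a_dst = [random.gauss(0, 0.5) for _ in range(3)]
label_embs = [[random.gauss(0, 1) for _ in range(3)] for _ in range(4)]

h, attn = gat_layer(bag, W, a_src, a_dst)
bag_emb = [sum(col) / len(h) for col in zip(*h)]  # mean pooling over instances
loss = info_nce(bag_emb, label_embs, positives=[0, 2])
```

Adding the similarity inside the LeakyReLU is only one way to inject an edge feature; a learned projection of a richer edge vector would be an equally plausible reading of the abstract.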
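Cite this article: Wu, J. and Zhang, J. (2025) Graph Attention-Based Multi-Instance Multi-Label Learning with Correlation Modeling. Computer Science and Application, 15(12), 209-221. https://doi.org/10.12677/csa.2025.1512336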
