Federated Dual-Phase Attention Network with Weighted Label Correlation Embedding for Multi-Label Image Classification
DOI: 10.12677/csa.2025.159243. Supported by national science and technology funding.
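Authors: Lei Zhong, Xuejiao Jiang, Jialong Xu, Lei Jiang: Hainan Power Grid Co., Ltd., China Southern Power Grid, Haikou, Hainan; Lukun Zeng*, Yuan Ai, Jingxu Yang: China Southern Power Grid Digital Grid Group Co., Ltd., Guangzhou, Guangdong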
Keywords: Multi-Label Image Classification, Neural Network, Federated Learning
Abstract: The effectiveness of federated learning (FL) models is crucial for multi-label image recognition on privacy-sensitive data, yet cross-regional image classification faces inconsistent data distributions and categories that are correlated but imbalanced. Existing research still lacks a systematic treatment of label correlation and class imbalance within the FL framework. In particular, data heterogeneity and class imbalance across clients cause parameter inconsistency during global aggregation: the parameters of some local models deviate significantly from the aggregated global model, which degrades classification performance on those clients. To address these challenges, we propose FD-WCAT, a Federated Dual-phase Attention Network with weighted label correlation embedding for multi-label image classification. Its contributions are twofold. (1) Local model construction: each client builds a masked label correlation graph to learn label correlation features, then fuses these features into a multi-label classifier weighted by class imbalance. (2) Global weighted aggregation: to resolve the inconsistency between local and global parameters during training, FD-WCAT uses a dual-phase aggregation strategy with global-local parameter regularization. First, each client computes its class imbalance coefficient and sends its local model parameters to the server. The server clusters the client models into T groups by parameter similarity, so that the models within each group are close to one another, and aggregates within each group to produce T prototype models. Each prototype model then receives an imbalance weight computed from its group's average imbalance coefficient, and imbalance-weighted aggregation of the prototypes yields the final global model. Experiments show that FD-WCAT outperforms existing baseline models on multi-label datasets.
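The server-side dual-phase aggregation described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the clustering algorithm, the intra-group aggregation rule, or the formula for the imbalance weights. Here k-means over flattened parameter vectors stands in for the similarity clustering, plain averaging stands in for intra-group aggregation, and each per-group model (the temporary, or prototype, model) is weighted in proportion to its group's average imbalance coefficient. All function names are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Element-wise average of equally sized parameter vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_clients(params: Sequence[Vector], num_groups: int, iters: int = 20) -> List[int]:
    """ASSUMPTION: k-means with farthest-first initialisation stands in for
    the paper's parameter-similarity clustering."""
    centers = [list(params[0])]
    while len(centers) < num_groups:
        farthest = max(params, key=lambda p: min(squared_distance(p, c) for c in centers))
        centers.append(list(farthest))
    assignment = [0] * len(params)
    for _ in range(iters):
        assignment = [
            min(range(num_groups), key=lambda g: squared_distance(p, centers[g]))
            for p in params
        ]
        for g in range(num_groups):
            members = [p for p, a in zip(params, assignment) if a == g]
            if members:  # keep the previous centre if a group is empty
                centers[g] = mean_vector(members)
    return assignment


def dual_phase_aggregate(
    client_params: Sequence[Vector],
    imbalance_coeffs: Sequence[float],
    num_groups: int,
) -> Vector:
    """Phase 1: average within each group. Phase 2: weighted average of groups."""
    assignment = cluster_clients(client_params, num_groups)
    prototypes: List[Vector] = []
    group_scores: List[float] = []
    for g in range(num_groups):
        members = [i for i, a in enumerate(assignment) if a == g]
        if not members:
            continue
        # ASSUMPTION: intra-group aggregation is a plain (unweighted) mean.
        prototypes.append(mean_vector([client_params[i] for i in members]))
        group_scores.append(sum(imbalance_coeffs[i] for i in members) / len(members))
    # ASSUMPTION: weights are the normalised average imbalance coefficients,
    # which requires the coefficients to be positive.
    total = sum(group_scores)
    weights = [score / total for score in group_scores]
    dim = len(prototypes[0])
    return [sum(w * proto[d] for w, proto in zip(weights, prototypes)) for d in range(dim)]


# Four toy clients with two-dimensional "models" and made-up coefficients.
clients = [[0.0, 0.0], [0.2, 0.0], [10.0, 10.0], [10.0, 10.2]]
coeffs = [1.0, 1.0, 3.0, 3.0]
print(dual_phase_aggregate(clients, coeffs, num_groups=2))
```

In this toy run the two tight clusters become two per-group models, and the group with the larger average coefficient pulls the global parameters toward itself. Whether higher imbalance should raise or lower a group's weight is a design choice the abstract leaves open, so the direction used here is only one possibility.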
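Cite this article: Lei Zhong, Lukun Zeng, Xuejiao Jiang, Yuan Ai, Jialong Xu, Jingxu Yang, Lei Jiang. Federated Dual-Phase Attention Network with Weighted Label Correlation Embedding for Multi-Label Image Classification [J]. Computer Science and Application, 2025, 15(9): 267-282. https://doi.org/10.12677/csa.2025.159243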
