Few-Shot Classification with Graph Convolutional Label Propagation Network Based on Swin Transformer and Dual-Focus Similarity
Abstract: Few-shot learning aims to recognize new categories from only a few labeled samples. Existing methods, however, remain limited in three respects: coping with data scarcity, modeling relations between samples accurately, and generalizing over graph structures. This paper proposes a few-shot classification method, the Swin Transformer-Based Graph Convolutional Label Propagation Network (ST-GCLPN), which combines a Swin Transformer with dual-focus similarity. First, the Swin Transformer extracts both global and local features from the input samples, mitigating the feature insufficiency caused by data scarcity. Second, the dual-focus similarity combines global and local feature relations to construct an accurate graph, which is fed into a graph convolutional network (GCN) to capture local neighborhood features. The output of the GCN is then passed to a label propagation algorithm (LPA), which propagates label information over the global graph structure and thereby improves classification generalization. Finally, the cross-entropy loss between the predicted and true labels of the query set is computed to optimize the model parameters. Experiments on the miniImageNet dataset show that the proposed method improves classification accuracy on both the 5-way 1-shot and 5-way 5-shot tasks and significantly outperforms current mainstream methods. The method addresses feature insufficiency, imprecise relation modeling, and limited generalization in few-shot learning, providing an efficient and robust solution for few-shot classification.
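The graph stage of the pipeline described in the abstract (graph construction, GCN, label propagation, cross-entropy on the query set) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions. Hand-written 2-D vectors stand in for Swin Transformer features. Plain cosine similarity with a kNN mask stands in for the dual-focus similarity, which the abstract does not define. The GCN weight is an identity placeholder rather than a learned matrix. Label propagation uses the common closed form F = (I - alpha * S)^(-1) Y. Rebuilding the graph from the GCN output before propagation is also an assumption. All function names and the parameters `k` and `alpha` are hypothetical.

```python
import numpy as np

def cosine_affinity(feats, k=2):
    """Symmetric kNN affinity from cosine similarity.
    Assumption: a stand-in for the paper's dual-focus similarity."""
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = np.maximum(normed @ normed.T, 0.0)   # clip negative similarities
    np.fill_diagonal(sim, 0.0)                 # no self-edges
    idx = np.argsort(-sim, axis=1)[:, :k]      # k most similar nodes per row
    mask = np.zeros(sim.shape, dtype=bool)
    mask[np.arange(sim.shape[0])[:, None], idx] = True
    return sim * (mask | mask.T)               # keep an edge if either end selects it

def sym_normalize(adj):
    """D^(-1/2) A D^(-1/2) normalization."""
    inv_sqrt = 1.0 / np.sqrt(np.maximum(adj.sum(axis=1), 1e-12))
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]

def gcn_layer(adj, feats, weight):
    """One graph convolution with self-loops and ReLU."""
    a_hat = sym_normalize(adj + np.eye(adj.shape[0]))
    return np.maximum(a_hat @ feats @ weight, 0.0)

def label_propagation(adj, labels_onehot, alpha=0.5):
    """Closed-form propagation: solve (I - alpha * S) F = Y."""
    s = sym_normalize(adj)
    return np.linalg.solve(np.eye(adj.shape[0]) - alpha * s, labels_onehot)

def cross_entropy(scores, targets):
    """Mean cross-entropy of softmax(scores) against integer targets."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(targets)), targets].mean())

# Toy 2-way 1-shot episode: rows 0-1 are support samples, rows 2-5 are queries.
feats = np.array([[1.0, 0.1], [0.1, 1.0],      # support: class 0, class 1
                  [0.9, 0.2], [0.8, 0.1],      # queries from class 0
                  [0.2, 0.9], [0.1, 0.8]])     # queries from class 1
labels = np.zeros((6, 2))
labels[0, 0] = labels[1, 1] = 1.0              # only support labels are known
query_targets = np.array([0, 0, 1, 1])

adj = cosine_affinity(feats)
refined = gcn_layer(adj, feats, np.eye(2))     # identity weight: placeholder for learned parameters
scores = label_propagation(cosine_affinity(refined), labels)
predictions = scores[2:].argmax(axis=1)
loss = cross_entropy(scores[2:], query_targets)
print(predictions, loss)
```

In a trainable version, the loss on the query rows would be backpropagated through the propagation step, the GCN weights, and the feature extractor, which requires an automatic differentiation framework rather than plain NumPy.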
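Cite this article: Li Guangsheng, Li Ye. Few-Shot Classification with Graph Convolutional Label Propagation Network Based on Swin Transformer and Dual-Focus Similarity. Modeling and Simulation (建模与仿真), 2025, 14(5): 488-502. https://doi.org/10.12677/mos.2025.145409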
