A Few-Shot Semantic Segmentation Method Based on Self-Guided Prototype Enhancement: An Improved Base-Learning and Meta-Learning Segmentation Method
DOI: 10.12677/csa.2024.145126
Author: Chen Han (陈涵), School of Computer and Information, Hefei University of Technology, Hefei, Anhui
Keywords: Few-Shot Semantic Segmentation, Prototype Learning, Feature Enhancement
Abstract: Few-shot semantic segmentation has recently attracted considerable attention and made notable progress. Earlier methods mainly relied on meta-learning frameworks borrowed from classification tasks to achieve generalization, but this training scheme tends to bias the model toward seen classes, so it falls short of the ideal of class-agnostic segmentation. A recent method, referred to as base learning and meta learning, identifies base-class objects and thereby separates out background regions effectively. However, in emphasizing the recognition of background features, that method neglects the enhancement of foreground features. We therefore improve on it by introducing self-guided prototype learning: an auxiliary prototype is generated and used to produce an activation feature map, which enhances the prototype features and helps the model recognize foreground features. Experiments on the PASCAL-5i dataset show that the proposed method reaches an mIoU of 68.01 and 71.12 in the 1-shot and 5-shot settings, respectively, indicating that it improves the accuracy of few-shot semantic segmentation.
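The abstract does not spell out how the auxiliary prototype and the activation feature map are computed. As a rough illustration only, the sketch below shows one common way such a step could be realized with NumPy. It assumes masked average pooling to build prototypes, cosine similarity to build activation maps, and a fixed similarity threshold to find foreground pixels that the primary prototype misses. All function names, the threshold value, and the choice of cosine similarity are hypothetical and are not taken from the paper.

```python
import numpy as np


def masked_average_pooling(features, mask):
    """Average the feature vectors at positions where the mask is set.

    features: (C, H, W) float array; mask: (H, W) binary or boolean array.
    Returns a (C,) prototype (all zeros if the mask is empty).
    """
    weights = mask.astype(features.dtype)
    total = (features * weights).sum(axis=(1, 2))
    return total / max(weights.sum(), 1e-8)


def cosine_activation_map(features, prototype, eps=1e-8):
    """Cosine similarity between a prototype and every spatial location, shape (H, W)."""
    dots = np.einsum("chw,c->hw", features, prototype)
    norms = np.linalg.norm(features, axis=0) * np.linalg.norm(prototype)
    return dots / (norms + eps)


def self_guided_prototypes(features, mask, threshold=0.5):
    """Hypothetical self-guided step (an assumption, not the paper's exact module).

    1. Build a primary prototype from all foreground pixels.
    2. Find foreground pixels that the primary prototype activates only weakly.
    3. Build an auxiliary prototype from those missed pixels.
    4. Use the auxiliary prototype to produce an activation map.
    """
    primary = masked_average_pooling(features, mask)
    primary_activation = cosine_activation_map(features, primary)
    missed = (mask > 0) & (primary_activation < threshold)
    auxiliary = masked_average_pooling(features, missed)
    auxiliary_activation = cosine_activation_map(features, auxiliary)
    return primary, auxiliary, auxiliary_activation
```

In a real network the inputs would be support-image features from a backbone and the support mask resized to the feature resolution, and the resulting activation map would be fed to later layers together with the prototypes; how the paper combines them is not stated in the abstract.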
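Cite this article: Chen Han. A Few-Shot Semantic Segmentation Method Based on Self-Guided Prototype Enhancement: An Improved Base-Learning and Meta-Learning Segmentation Method [J]. Computer Science and Application, 2024, 14(5): 172-183. https://doi.org/10.12677/csa.2024.145126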
