Boundary Optimization Strategy for Image Segmentation Based on MFS
Abstract: Foundation vision models such as SAM have significantly advanced computer vision tasks in natural scenes. However, the jagged edges of their segmentation masks degrade segmentation quality. Traditional boundary refinement techniques typically address this by attaching additional trainable branches to the original network architecture, which inevitably incurs substantial training overhead. To avoid this cost, this paper proposes a low-cost boundary refinement module tailored to SAM and built on the implicit method of fundamental solutions (MFS). Specifically, we resample the initial boundary points of SAM's segmentation and perform a global implicit reconstruction, which significantly smooths the boundary without changing the object's original topology. Evaluation on the COCO dataset shows that the strategy improves the curvature metric nearly tenfold while preserving SAM's segmentation accuracy. The method offers an efficient, robust, and training-free boundary optimization solution for high-level visual processing of natural images.
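The resampling step and the idea of a curvature-based smoothness measure can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `resample_closed_contour` and `menger_curvature` are hypothetical, the three-point (Menger) curvature is used only as an assumed stand-in because the abstract does not specify the exact curvature metric, and the MFS implicit reconstruction itself is not reproduced here.

```python
import numpy as np


def resample_closed_contour(points, n_samples):
    """Resample a closed polygonal contour at n_samples points
    equally spaced in arc length (hypothetical helper)."""
    pts = np.asarray(points, dtype=float)
    closed = np.vstack([pts, pts[:1]])  # repeat first vertex to close the loop
    seg_len = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    cum_len = np.concatenate([[0.0], np.cumsum(seg_len)])
    # Target arc-length positions; the end point is excluded because it
    # coincides with the start of the closed curve.
    targets = np.linspace(0.0, cum_len[-1], n_samples, endpoint=False)
    x = np.interp(targets, cum_len, closed[:, 0])
    y = np.interp(targets, cum_len, closed[:, 1])
    return np.column_stack([x, y])


def menger_curvature(points):
    """Unsigned three-point (Menger) curvature at every vertex of a
    closed contour: 4 * triangle area / product of the side lengths."""
    p = np.asarray(points, dtype=float)
    prev_pt = np.roll(p, 1, axis=0)
    next_pt = np.roll(p, -1, axis=0)
    u = p - prev_pt
    v = next_pt - p
    a = np.linalg.norm(u, axis=1)
    b = np.linalg.norm(v, axis=1)
    c = np.linalg.norm(next_pt - prev_pt, axis=1)
    cross = u[:, 0] * v[:, 1] - u[:, 1] * v[:, 0]  # twice the signed area
    denom = a * b * c
    safe = np.where(denom > 0, denom, 1.0)  # avoid division by zero
    return np.where(denom > 0, 2.0 * np.abs(cross) / safe, 0.0)


# Illustration: a smooth circle versus a synthetic "jagged" version of it.
theta = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
circle = 5.0 * np.column_stack([np.cos(theta), np.sin(theta)])
radius = 5.0 + 0.3 * (-1.0) ** np.arange(64)  # alternating in/out offsets
jagged = radius[:, None] * np.column_stack([np.cos(theta), np.sin(theta)])

print("mean curvature, smooth:", menger_curvature(circle).mean())
print("mean curvature, jagged:", menger_curvature(jagged).mean())
```

In this sketch a lower mean curvature indicates a smoother boundary, so comparing the value before and after refinement gives one possible way to quantify the smoothing effect described in the abstract.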
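Cite this article: Wang Yushuai, Lei Min. Boundary Optimization Strategy for Image Segmentation Based on MFS [J]. Advances in Applied Mathematics, 2026, 15(4): 493-505. https://doi.org/10.12677/aam.2026.154177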
