Research on Interactive Grayscale Image Coloring Method Based on Improved Swin Transformer
Abstract: Interactive image colorization aims to color a grayscale image from colors that the user specifies at particular locations. We propose an efficient scheme that combines an improved Swin Transformer, a local stability layer, and pixel shuffling. The model exploits the hierarchical structure and shifted-window mechanism of the Swin Transformer, which, compared with a standard Transformer, strengthens the capture of multi-scale features while keeping computation efficient, so that user-provided color hints are propagated and fused more effectively. We further modify the Swin Transformer to improve its computational efficiency and trainability. The local stability layer addresses the local artifacts and blurred boundaries that arise in pixel shuffling, and pixel shuffling takes the place of a decoder to make colorization efficient. Experimental results show that our method outperforms existing interactive colorization methods and produces accurate, realistic colorized images under user hints.
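The pixel shuffling step that stands in for a decoder can be illustrated with a small sketch. It is not the authors' implementation: the function name `pixel_shuffle`, the nested-list layout `(C*r*r, H, W)`, and the channel ordering (borrowed from the common sub-pixel convolution convention) are assumptions made for illustration only. The idea is that each low-resolution position carries `r*r` values per output channel, which are rearranged into an `r x r` block of the full-resolution output.

```python
def pixel_shuffle(feature_map, upscale):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Assumed ordering: input channel c*r*r + i*r + j supplies the pixel at
    offset (i, j) inside each r x r output block of output channel c.
    """
    r = upscale
    in_channels = len(feature_map)
    if in_channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by upscale**2")
    out_channels = in_channels // (r * r)
    height = len(feature_map[0])
    width = len(feature_map[0][0])

    # Allocate the upsampled output, filled with zeros.
    out = [[[0.0] * (width * r) for _ in range(height * r)]
           for _ in range(out_channels)]

    for c in range(out_channels):
        for i in range(r):
            for j in range(r):
                src = feature_map[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


# Toy input (hypothetical values): 8 channels on a 1 x 2 grid with r = 2,
# giving two output channels, e.g. the two chrominance channels of an image.
tokens = [[[float(10 * k + w) for w in range(2)]] for k in range(8)]
ab = pixel_shuffle(tokens, upscale=2)
```

Because the rearrangement has no learned parameters, all of the capacity stays in the backbone that produces the low-resolution features. Neighbouring output pixels come from different channels, which is one plausible reason for the block-boundary artifacts the abstract mentions.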
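Cite this article: He, R. and He, L. (2024) Research on Interactive Grayscale Image Coloring Method Based on Improved Swin Transformer. Software Engineering and Applications, 13(5), 629-636. https://doi.org/10.12677/sea.2024.135064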
