Semantic Image Synthesis with Adaptive Normalization
DOI: 10.12677/CSA.2022.124095
Authors: Wenrui Xu, Taizhe Tan: School of Computers, Guangdong University of Technology, Guangzhou, Guangdong
Keywords: Generative Adversarial Networks, Image Synthesis, Style Transfer
Abstract: This paper proposes an adaptive normalization that enables style control. It is a simple but effective module applied to generative adversarial networks conditioned on segmentation masks. Previous methods feed the style image directly into the deep network as input. The proposed method instead feeds style information into the normalization layers, learns modulation parameters from it, and uses them to adjust the activations of those layers. Experiments are conducted on two datasets, and a selection of results is shown. The results indicate that the method can synthesize images from semantic segmentation masks that conform to the semantic layout and are visually realistic, and that a single model can produce different styles.
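The idea of predicting a normalization layer's scale and shift from style information can be illustrated with a minimal sketch. It is not the authors' module: the per-channel instance-style normalization, the linear mapping from style code to parameters, and all names (`style_adaptive_norm`, `w_gamma`, `w_beta`) are assumptions made for illustration, and the segmentation-mask conditioning described in the abstract is left out.

```python
import math


def style_adaptive_norm(channels, style, w_gamma, w_beta, eps=1e-5):
    """Normalize each channel, then modulate it with style-derived parameters.

    channels: list of C channels, each a flat list of activations
    style:    style code, a list of S floats (hypothetical representation)
    w_gamma:  C rows of S weights mapping the style code to a per-channel scale
    w_beta:   C rows of S weights mapping the style code to a per-channel shift
    """
    out = []
    for c, values in enumerate(channels):
        # Parameter-free normalization over the spatial positions of one channel.
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        # Scale and shift come from the style code, not from fixed learned constants.
        gamma = sum(w * s for w, s in zip(w_gamma[c], style))
        beta = sum(w * s for w, s in zip(w_beta[c], style))
        out.append(
            [(v - mean) / math.sqrt(var + eps) * (1.0 + gamma) + beta for v in values]
        )
    return out


# Toy input: two channels with four activations each, and two one-hot style codes.
channels = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 30.0, 30.0]]
w_gamma = [[0.5, -0.5], [0.0, 1.0]]
w_beta = [[1.0, -1.0], [2.0, 0.0]]

out_a = style_adaptive_norm(channels, [1.0, 0.0], w_gamma, w_beta)
out_b = style_adaptive_norm(channels, [0.0, 1.0], w_gamma, w_beta)
```

In this sketch the same features and the same weights yield different output statistics for the two style codes, which is the sense in which one model could render several styles. In a trained network the weights would be learned jointly with the generator.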
Cite this article: Xu, W.R. and Tan, T.Z. (2022) Semantic Image Synthesis with Adaptive Normalization. Computer Science and Application, 12(4), 934-940. https://doi.org/10.12677/CSA.2022.124095
