Fast Generative Adversarial Network Based on a Spatial Attention Mechanism
Abstract: To address the low image generation quality and slow model convergence caused by insufficient data in practical applications, a fast generative adversarial network model based on an attention mechanism is proposed. By introducing a simplified attention module, the model can capture the global information of an image. The model consists mainly of residual modules with an attention mechanism, combined with skip connections. At the same time, adversarial learning makes the generated images more realistic. To speed up convergence, metadata is fed to the model as input, providing it with basic information about the image. Experimental results show that, compared with existing models, the proposed model not only converges faster on small datasets but also produces more realistic images. On datasets with sufficient data, it generates images of quality very close to that of existing models, while greatly shortening convergence time and reducing memory consumption.
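Cite this article: Wu Tianbao (吴天宝), Xu Fang (徐芳). Fast Generative Adversarial Network Based on a Spatial Attention Mechanism[J]. Computer Science and Application, 2023, 13(2): 180-190. https://doi.org/10.12677/CSA.2023.132019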
