A Robust Model of Gender Transfer in Facial Images
DOI: 10.12677/CSA.2023.132020. Supported by research project funding.
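Authors: Lu Wei, He Qiang: School of Science, Beijing University of Civil Engineering and Architecture, Beijing; Institute of Big Data Modeling Theory and Technology, Beijing University of Civil Engineering and Architecture, Beijing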
Keywords: Generative Adversarial Networks, Face Parsing, Histogram Matching, Unsupervised Style Transfer, Face Gender Transfer
Abstract: Gender transfer in face images is a special case of image style transfer. General generative adversarial network models often fail to transfer the face region with high quality: the irrelevant background region frequently becomes distorted or blurred, and the skin color of the face is not preserved. To address these problems, this paper builds on a face image gender transfer model based on an improved MUNIT and proposes a robust face image gender transfer model. First, face parsing is applied to each input face image so that only the face region is fed to the model for training, which removes the influence of the irrelevant background region on training. Second, a new loss function is constructed that applies color-based histogram matching to the face region before and after generation, so that skin color stays consistent across the gender transfer. Finally, attribute screening is applied to the public face dataset CelebA to reduce factors that hurt training, such as face occlusion and glasses, which improves the quality of the generated images. Experimental results show that, compared with other classical algorithms, the proposed method effectively preserves the image background and the skin color of the face, and generates better face gender transfer images.
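The histogram-matching step described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names (`cdf_8bit`, `match_channel`, `histogram_loss`), the flat-list pixel layout, the binary face mask, and the L1 form of the loss are all assumptions made for illustration. The sketch works on one 8-bit color channel and assumes the mask comes from a face-parsing model that is not shown here.

```python
from bisect import bisect_left
from itertools import accumulate


def cdf_8bit(values):
    """Cumulative distribution of the 8-bit intensities in `values`."""
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    total = len(values)
    return [c / total for c in accumulate(hist)]


def match_channel(source, reference):
    """Remap `source` so its intensity histogram follows `reference`."""
    src_cdf = cdf_8bit(source)
    ref_cdf = cdf_8bit(reference)
    # For each source level, take the smallest reference level whose
    # CDF reaches the CDF value of that source level.
    lut = [min(bisect_left(ref_cdf, p), 255) for p in src_cdf]
    return [lut[v] for v in source]


def histogram_loss(generated, original, mask):
    """Assumed L1 form: mean absolute gap between the generated face
    pixels and the same pixels after matching them to the original
    face histogram. Inputs are flat lists for one color channel."""
    gen_face = [p for p, m in zip(generated, mask) if m]
    org_face = [p for p, m in zip(original, mask) if m]
    if not gen_face:  # empty mask: nothing to compare
        return 0.0
    target = match_channel(gen_face, org_face)
    return sum(abs(g - t) for g, t in zip(gen_face, target)) / len(gen_face)


# Hypothetical toy data: five pixels, the last one is background.
original = [10, 10, 200, 200, 50]
generated = [30, 30, 220, 220, 90]
mask = [1, 1, 1, 1, 0]
print(histogram_loss(generated, original, mask))  # 20.0
```

In this toy case the generated face is uniformly 20 levels brighter than the original, so the loss is 20.0, and it drops to 0.0 when the face histograms agree. A training pipeline would apply the same idea per color channel on image tensors, with the matched target treated as a constant during backpropagation.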
Cite this article: Lu Wei, He Qiang. A Robust Model of Gender Transfer in Facial Images [J]. Computer Science and Application, 2023, 13(2): 191-203. https://doi.org/10.12677/CSA.2023.132020
