Efficient Compression of Face Frontalization GAN Model
Abstract: Generative adversarial networks (GANs) perform very well at frontal face image generation, and the highly realistic frontal faces they produce have made them popular among researchers. This generative power, however, comes at the cost of heavy computation during both training and inference, and the more complex the GAN architecture, the greater the computational demand, which severely limits interactive deployment. To make deployment easier and reduce the computational requirements of GANs, this paper proposes a general compression algorithm and applies it to a face frontalization GAN, reducing the inference time and model size of the generator. Experiments show that the compressed GAN still maintains good image quality while requiring far less computation than the original network.
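Cite this article: Wei Lei, Qiu Weigen, Zhang Lichen. Efficient Compression of Face Frontalization GAN Model [J]. Computer Science and Application, 2021, 11(3): 661-671. https://doi.org/10.12677/CSA.2021.113068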
