Research on a Face Attractiveness Prediction Method Based on Vision Transformer
Abstract: Face attractiveness analysis and prediction is an interdisciplinary field combining cognitive science, psychology, and computer science. It aims to objectively quantify subjective human perception by having a machine learn the mapping between facial features and quantified perceptual ratings. This paper proposes a hybrid model that combines CNN and Transformer structures. A residual convolutional network extracts feature maps from the image; these are encoded by an embedding layer and fed into a multi-layer Transformer encoder, where the self-attention mechanism captures the relationships among different feature components from a global perspective. The method achieves good experimental results on the SCUT-FBP5500 dataset, which indicates that converting a face image into a sequence of visual word vectors and predicting attributes from a global perspective is feasible and effective.
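The data flow described in the abstract (feature map, token sequence, self-attention, score) can be illustrated with a toy sketch in plain Python. This is not the authors' implementation: the function names, the identity query/key/value projections, the single attention layer, the mean pooling, and the linear scoring head are all assumptions made for illustration. The paper's model uses a residual CNN backbone, a learned embedding layer, and a multi-layer Transformer encoder.

```python
import math


def flatten_feature_map(fmap):
    """Turn a C x H x W feature map (nested lists) into H*W tokens of length C."""
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[fmap[c][i][j] for c in range(channels)]
            for i in range(height) for j in range(width)]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: Q = K = V = tokens (no learned projections), so that the
    example stays dependency-free.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out


def predict_score(fmap, head_weights, bias=0.0):
    """Hypothetical regression head: mean-pool attended tokens, then a linear map."""
    attended = self_attention(flatten_feature_map(fmap))
    d = len(attended[0])
    pooled = [sum(t[i] for t in attended) / len(attended) for i in range(d)]
    return sum(w * p for w, p in zip(head_weights, pooled)) + bias


# A 2-channel, 2x2 feature map stands in for the CNN output.
toy_fmap = [[[0.1, 0.4], [0.3, 0.2]],
            [[0.5, 0.0], [0.2, 0.6]]]
score = predict_score(toy_fmap, head_weights=[1.0, -0.5])
```

Each spatial position of the feature map becomes one token, so every position can attend to every other position in a single step. That is the global view the abstract refers to.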
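Cite this article: Fang Jian'an (方建安), Li Changhao (李昶昊). Research on a Face Attractiveness Prediction Method Based on Vision Transformer [J]. Computer Science and Application, 2022, 12(4): 1149-1156. https://doi.org/10.12677/CSA.2022.124117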
