3D Face Morphable Model Based on the Matrix Fisher Distribution
DOI: 10.12677/orf.2024.143352. Supported by the National Natural Science Foundation of China.
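Author: Wei Fang: School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai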
Keywords: 3D Face Morphable Model, Implicit Neural Representations, Pose Estimation, Matrix Fisher Distribution
Abstract: Accurate representation of 3D faces benefits a wide range of computer vision and graphics applications. However, because existing approaches discretize the data and rely on linear models, obtaining accurate identity and expression cues remains challenging. In this paper, we propose a new 3D morphable face model that learns a nonlinear, continuous space with implicit neural representations. The model builds two explicitly disentangled deformation fields that separately capture the complex shapes associated with identity and with expression, and it introduces a neural blend field that adaptively mixes a set of local fields to learn fine details. In addition, we find that pose parameters can be disentangled more effectively within the network. For the pose changes that occur during face deformation, we use a matrix Fisher distribution over rotation matrices to represent the face pose angles and to model the uncertainty of head rotation. Experiments show that our method performs better in both facial detail modeling and pose estimation.
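As a brief illustration of the pose representation mentioned in the abstract, the sketch below assumes the standard form of the matrix Fisher distribution on SO(3), p(R | F) ∝ exp(tr(FᵀR)), as given in the directional statistics literature; the abstract itself does not state the parametrization the model uses. The function names `matrix_fisher_mode` and `z_rotation` are hypothetical helpers, not part of the paper. The mode (most likely rotation) follows from the singular value decomposition of the 3×3 parameter matrix F, and the proper singular values indicate how concentrated the distribution is around that mode.

```python
import numpy as np


def z_rotation(angle_rad):
    """Rotation matrix about the z-axis (hypothetical helper for the demo)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])


def matrix_fisher_mode(F):
    """Mode of p(R | F) proportional to exp(tr(F^T R)) on SO(3).

    Assumes the standard matrix Fisher parametrization with a 3x3
    parameter matrix F. Returns the mode rotation and the proper
    singular values (larger values mean a more concentrated
    distribution, i.e. lower rotational uncertainty).
    """
    U, S, Vt = np.linalg.svd(F)
    # Sign correction so that the result is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(U @ Vt))
    D = np.diag([1.0, 1.0, d])
    R_mode = U @ D @ Vt
    proper_singular_values = S * np.array([1.0, 1.0, d])
    return R_mode, proper_singular_values


# Demo: a parameter matrix concentrated around a 30-degree yaw rotation.
R_true = z_rotation(np.deg2rad(30.0))
F_demo = 50.0 * R_true
R_mode, s_demo = matrix_fisher_mode(F_demo)

# Demo: a perturbed parameter matrix still yields a valid rotation.
rng = np.random.default_rng(0)
F_noisy = F_demo + rng.normal(scale=1.0, size=(3, 3))
R_noisy, s_noisy = matrix_fisher_mode(F_noisy)
```

In a network that predicts F directly, the mode would serve as the point estimate of head rotation, while the singular values give a per-axis confidence; this is one common way such a distribution is used and is stated here as an assumption rather than as the paper's exact design.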
Cite this article: Fang, W. (2024) 3D Face Morphable Model Based on Matrix Fisher Distribution. Operations Research and Fuzziology, 14(3), 1221-1234. https://doi.org/10.12677/orf.2024.143352
