AGES: Anisotropic Gaussian Enhancement with Smoothness for Geometric-Consistent 3D Reconstruction Method
Abstract: High-fidelity 3D reconstruction from multi-view images remains challenging under sparse viewpoints and in low-texture regions, where conventional methods are often unstable. 3D Gaussian Splatting (3DGS) enables real-time rendering, but its geometric optimization is decoupled from anisotropic appearance modeling, which tends to produce artifacts in textureless or highly reflective areas. To address this, we propose Anisotropic Gaussian Enhancement with Smoothness (AGES), a probabilistic-routing framework that jointly optimizes scene geometry and view-dependent appearance within a single training pipeline. The method introduces two key components: (1) an Adaptive Geometry-Appearance Routing (AGAR) module, which uses a learned per-Gaussian uncertainty measure to dynamically assign each Gaussian primitive to either a cross-view geometric refinement branch or an anisotropic reflectance modeling branch; and (2) a Depth Smoothness Regularization (DSR) term, which enforces local gradient consistency between the rendered depth and the geometrically refined depth, suppressing noise while preserving structural edges. Extensive experiments on the Waymo and YouTube datasets show that AGES significantly outperforms existing state-of-the-art methods at both full and downsampled resolutions, achieving higher geometric consistency and visual fidelity in complex real-world scenes.
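The abstract gives no formulas for AGAR or DSR, so the following is only a minimal sketch of the two ideas under stated assumptions, not the authors' implementation. It assumes that the per-Gaussian uncertainty lies in [0, 1], that routing is a hard threshold with high uncertainty sent to the geometric refinement branch, and that gradient consistency is measured as the mean absolute difference of finite-difference depth gradients. The function names, the threshold value, and the L1 form are all hypothetical.

```python
import numpy as np


def route_gaussians(uncertainty, threshold=0.5):
    """Hypothetical hard routing of Gaussian primitives.

    Returns a boolean mask: True means the primitive goes to the
    geometric refinement branch, False means the anisotropic
    reflectance branch. Both the direction and the threshold are
    assumptions made for illustration.
    """
    uncertainty = np.asarray(uncertainty, dtype=float)
    return uncertainty >= threshold


def depth_gradients(depth):
    """Forward finite differences of an (H, W) depth map, H, W >= 2."""
    depth = np.asarray(depth, dtype=float)
    dx = depth[:, 1:] - depth[:, :-1]  # horizontal gradient, shape (H, W-1)
    dy = depth[1:, :] - depth[:-1, :]  # vertical gradient, shape (H-1, W)
    return dx, dy


def depth_smoothness_loss(rendered_depth, refined_depth):
    """Assumed L1 penalty on the mismatch between local depth gradients.

    A constant offset between the two maps is not penalized; only
    differences in local slope contribute to the loss.
    """
    rdx, rdy = depth_gradients(rendered_depth)
    gdx, gdy = depth_gradients(refined_depth)
    return float(np.abs(rdx - gdx).mean() + np.abs(rdy - gdy).mean())


if __name__ == "__main__":
    mask = route_gaussians([0.1, 0.9, 0.5])
    print(mask)  # which primitives go to the geometric branch

    rendered = np.array([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]])
    refined = rendered + 5.0  # same gradients, shifted depth
    print(depth_smoothness_loss(rendered, refined))
```

Because this loss compares gradients and not absolute depth values, a sharp depth step that appears in both maps contributes nothing, which is one plausible reading of how such a term could suppress noise while keeping edges.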
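Cite this article: Qu Shize, Su Jiawen, Zhang Ming. AGES: Anisotropic Gaussian Enhancement with Smoothness for Geometric-Consistent 3D Reconstruction Method [J]. Computer Science and Application, 2026, 16(2): 289-302. https://doi.org/10.12677/csa.2026.162059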
