Research on a 3D Reconstruction Method Based on Depth Optimization of NeRF Spatial Distortion
Abstract: The implicit function used by existing Neural Radiance Fields (NeRF) has limited expressive power. Because the depth at which a ray terminates is uncertain, the depth mean square error fails to converge, which leads to spatial distortion in the reconstruction and poor rendering of distant regions. To address this, a neural radiance field that uses depth information to reduce spatial distortion is proposed. First, depth information guides an adaptive sampling method: object depth is obtained and edge information is extracted through image segmentation and the Canny algorithm, and the number of sampling points and their weight allocation are optimized accordingly. Second, a Gaussian negative log-likelihood (GNLL) term on the depth loss Ld is combined with a contour loss and a color loss to form the network's loss function, minimizing the prediction error. Finally, a Gaussian-smoothed periodic encoding is proposed to reduce the interference of noise with position information. Experimental results show that the fused depth information is used effectively in 3D reconstruction, maintaining rendering quality while reducing spatial distortion and enhancing spatial realism.
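As a rough illustration of the loss described in the abstract, the sketch below computes a Gaussian negative log-likelihood on per-ray depth and adds it to a color loss and a contour loss. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `gaussian_nll` and `total_loss`, the per-ray predicted variance, the variance floor `eps`, and the weights `w_depth` and `w_contour` are all hypothetical, and the constant term 0.5·log(2π) is dropped because it does not affect optimization.

```python
import math


def gaussian_nll(pred_depth, pred_var, target_depth, eps=1e-6):
    """Mean Gaussian negative log-likelihood over a batch of rays.

    pred_depth   : predicted ray termination depths (assumed network output)
    pred_var     : predicted depth variances, one per ray (assumption)
    target_depth : reference depths, e.g. from a depth prior (assumption)
    The constant 0.5 * log(2 * pi) is omitted.
    """
    if not (len(pred_depth) == len(pred_var) == len(target_depth)):
        raise ValueError("all inputs must have the same length")
    if len(pred_depth) == 0:
        raise ValueError("inputs must not be empty")

    total = 0.0
    for mu, var, z in zip(pred_depth, pred_var, target_depth):
        var = max(var, eps)  # floor the variance to keep log and division stable
        total += 0.5 * (math.log(var) + (z - mu) ** 2 / var)
    return total / len(pred_depth)


def total_loss(color_loss, depth_nll, contour_loss, w_depth=0.1, w_contour=0.1):
    """Weighted sum of the three loss terms; the weights are hypothetical."""
    return color_loss + w_depth * depth_nll + w_contour * contour_loss


if __name__ == "__main__":
    nll = gaussian_nll([2.0, 3.0], [1.0, 1.0], [1.0, 3.0])
    print(total_loss(color_loss=1.0, depth_nll=nll, contour_loss=2.0))
```

One property of this form worth noting: a ray with a large predicted variance contributes a smaller squared-error penalty but pays a `log(var)` cost, so uncertain depths are down-weighted rather than ignored.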
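Cite this article: Yu Zhenyu, Cao Chunping. Research on a 3D Reconstruction Method Based on Depth Optimization of NeRF Spatial Distortion[J]. Modeling and Simulation, 2025, 14(4): 61-72. https://doi.org/10.12677/mos.2025.144265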
