The Latent Diffusion Model Solves the Resampling Projection Optimization of the Inverse Problem
DOI: 10.12677/pm.2026.164104
Author: Rong Kuang, School of Mathematical Sciences, Chengdu University of Technology, Chengdu, Sichuan
Keywords: Diffusion Model, Inverse Problem
Abstract: As deep learning becomes increasingly common in scientific research, it is being applied to a growing range of problems and has become a popular approach to solving inverse problems. Latent diffusion models have recently been shown to generate high-quality images, and running a diffusion model in latent space is more efficient than running it in pixel space. We therefore further modify the previously proposed resampling algorithm in two ways: we replace the iterative optimization in latent space with a null-space projection, and we perform the optimization once every 5 steps. Experiments show that our method improves on the original algorithm, with more pronounced gains in some image details.
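The two changes named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a linear measurement model y = Ax on plain vectors (no latent encoder or decoder), uses the standard pseudo-inverse form of a null-space projection, and replaces the learned reverse-diffusion update with a placeholder contraction. The names `null_space_project`, `toy_sampler`, and `project_every` are hypothetical.

```python
import numpy as np


def null_space_project(x, A, A_pinv, y):
    """Make x consistent with y = A x while keeping its null-space component.

    Equivalent to A_pinv @ y + (I - A_pinv @ A) @ x.
    """
    return x - A_pinv @ (A @ x - y)


def toy_sampler(A, y, num_steps=50, project_every=5, seed=0):
    """Toy loop: a placeholder update, with a projection every `project_every` steps."""
    rng = np.random.default_rng(seed)
    A_pinv = np.linalg.pinv(A)
    x = rng.standard_normal(A.shape[1])
    for step in range(1, num_steps + 1):
        # Assumption: stand-in for a learned reverse-diffusion update.
        x = 0.9 * x
        if step % project_every == 0:
            x = null_space_project(x, A, A_pinv, y)
    return x


rng = np.random.default_rng(1)
A = rng.standard_normal((4, 8))   # hypothetical underdetermined operator
y = A @ rng.standard_normal(8)    # noiseless measurements
x_hat = toy_sampler(A, y)
residual = np.linalg.norm(A @ x_hat - y)
```

Because the final step of this toy loop is a projection and `A` has full row rank, `residual` is zero up to floating-point error. The projection only changes the component of `x` that the measurements determine, which is why it can replace an inner iterative optimization in this simplified linear setting.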
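Citation: Kuang, R. The Latent Diffusion Model Solves the Resampling Projection Optimization of the Inverse Problem [J]. Pure Mathematics, 2026, 16(4): 195-203. https://doi.org/10.12677/pm.2026.164104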
