Image Quality Improvement and Intelligent Interaction Method for AR Display Devices Based on a Stable Diffusion Model
Abstract: To address the inherent limitations of existing AR display devices in image quality improvement, this paper builds on diffusion generative models and proposes an image quality improvement and intelligent interaction method for AR display devices based on a stable diffusion model. The image quality improvement method combines a general diffusion model with an encoder-decoder structure. A scene image is fed into the trained image quality improvement model, where the encoder converts it into latent features. The trained deep neural network layers in the reverse diffusion module then iterate over the latent features in reverse order of the time steps, generating denoised latent features layer by layer until the final denoised latent features are obtained. Finally, the decoder converts the final denoised latent features into the denoised, quality-improved image. Experiments show that, compared with a general diffusion model, the proposed method iterates faster and achieves better generation performance metrics. On this basis, a corresponding intelligent interaction method is also designed. Through the stable diffusion model, it achieves denoising and enhancement of the scene image as well as intelligent fusion of the scene image with the required interaction information, and the model can be deployed on the local terminal to avoid unnecessary remote data transmission overhead.
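The encode, reverse-iterate, decode flow described in the abstract can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's implementation. The abstract does not name a sampler or a noise schedule, so a deterministic DDIM-style update and a linear beta schedule are assumed here. `encode` and `decode` are plain average pooling and nearest-neighbour upsampling standing in for the trained encoder and decoder, and `make_oracle_predictor` is a hypothetical stand-in for the trained noise-prediction network that simply knows the clean latent.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear schedule (assumed)."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def encode(image):
    """Toy encoder: 2x2 average pooling, flattened to a latent vector."""
    h, w = len(image), len(image[0])
    return [
        (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
        for r in range(0, h, 2)
        for c in range(0, w, 2)
    ]


def decode(latent, h, w):
    """Toy decoder: nearest-neighbour upsampling back to an h x w image."""
    lw = w // 2
    return [[latent[(r // 2) * lw + (c // 2)] for c in range(w)] for r in range(h)]


def reverse_diffusion(z_T, predict_noise, alpha_bars):
    """Walk the time steps in reverse, applying a DDIM-style update (eta = 0)."""
    z = list(z_T)
    for t in range(len(alpha_bars) - 1, -1, -1):
        a_t = alpha_bars[t]
        a_prev = alpha_bars[t - 1] if t > 0 else 1.0
        eps = predict_noise(z, t)
        # Estimate the clean latent implied by the predicted noise.
        z0 = [(zi - math.sqrt(1.0 - a_t) * ei) / math.sqrt(a_t) for zi, ei in zip(z, eps)]
        # Re-noise that estimate to the (lower) noise level of the previous step.
        z = [math.sqrt(a_prev) * x + math.sqrt(1.0 - a_prev) * e for x, e in zip(z0, eps)]
    return z


def make_oracle_predictor(clean_latent, alpha_bars):
    """Hypothetical stand-in for the trained network: knows the clean latent."""
    def predict_noise(z, t):
        a_t = alpha_bars[t]
        return [(zi - math.sqrt(a_t) * ci) / math.sqrt(1.0 - a_t)
                for zi, ci in zip(z, clean_latent)]
    return predict_noise


def enhance(image, predict_noise, alpha_bars):
    """Encoder -> reverse diffusion over the latent -> decoder."""
    h, w = len(image), len(image[0])
    z = reverse_diffusion(encode(image), predict_noise, alpha_bars)
    return decode(z, h, w)


# Usage: corrupt a small image at the final noise level, then restore it.
random.seed(0)
alpha_bars = make_alpha_bars(50)
clean = [[0.2, 0.2, 0.8, 0.8],
         [0.2, 0.2, 0.8, 0.8],
         [0.5, 0.5, 0.1, 0.1],
         [0.5, 0.5, 0.1, 0.1]]
a_T = alpha_bars[-1]
noisy = [[math.sqrt(a_T) * p + math.sqrt(1.0 - a_T) * random.gauss(0.0, 1.0) for p in row]
         for row in clean]
oracle = make_oracle_predictor(encode(clean), alpha_bars)
restored = enhance(noisy, oracle, alpha_bars)
max_err = max(abs(r - c) for rr, cr in zip(restored, clean) for r, c in zip(rr, cr))
print(f"max abs error after decoding: {max_err:.2e}")
```

Because the oracle returns the exact noise, the loop recovers the clean latent up to floating-point error. With a real trained network the prediction is only approximate, which is why the iteration proceeds gradually over many time steps. Running the loop on the pooled latent rather than on full-resolution pixels also means each step touches fewer values, which is the usual motivation for pairing a diffusion model with an encoder-decoder.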