Research on Semantic-Aware Image Compression Algorithms
DOI: 10.12677/csa.2024.1412241. Supported by research project funding.
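Authors: Song Yuanmeng (宋媛萌), Jia Zhengzheng (贾正正)*, Jia Zhaodi (贾召弟), Wang Yuchen (王宇辰), Han Zhuohang (韩卓航): School of Computer Science, North China Institute of Aerospace Engineering, Langfang, Hebei; Yang Shaohua (杨少华): School of Aeronautics and Astronautics, North China Institute of Aerospace Engineering, Langfang, Hebei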
Keywords: Semantic-Aware Networks, VAE, Image Compression, Deep Learning
Abstract: Image compression aims to reduce the storage required for image data while preserving image quality as far as possible. Traditional methods rely mainly on encoding and quantizing pixels and cannot exploit the high-level semantic information in an image. This paper proposes a semantic-aware image compression algorithm with three steps. First, a convolutional neural network performs semantic analysis of the image. Second, a semantic-aware module extracts the semantic levels of the image, that is, its primary and secondary semantic regions. Finally, these semantic levels are fed into a variational autoencoder (VAE) image compression network, which compresses the primary semantic regions lightly and the secondary regions more heavily, so that file size is reduced while key information and visual quality are retained as far as possible. Experiments on public datasets such as Kodak show that the semantic-aware algorithm has a significant advantage in delivering better image quality.
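The region-dependent compression described in the abstract can be illustrated with a minimal sketch. It is not the paper's VAE network: it only shows the underlying idea of quantizing values in primary regions with a finer step than values in secondary regions. The function name `quantize_by_region`, the toy mask, and the step sizes `fine_step` and `coarse_step` are all assumptions made for illustration.

```python
import numpy as np

def quantize_by_region(latent, primary_mask, fine_step=0.1, coarse_step=0.8):
    """Uniformly quantize `latent` with a per-element step size.

    primary_mask is a boolean array, True where a (hypothetical) semantic
    module marked a primary region. Primary elements use `fine_step`;
    all other elements use `coarse_step`.
    """
    step = np.where(primary_mask, fine_step, coarse_step)
    symbols = np.round(latent / step).astype(np.int64)  # integers an entropy coder would store
    reconstruction = symbols * step                      # dequantized values
    return symbols, reconstruction

def demo():
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(8, 8))       # stand-in for a latent feature map
    primary_mask = np.zeros((8, 8), dtype=bool)
    primary_mask[2:6, 2:6] = True          # toy "primary" region in the centre

    _, recon = quantize_by_region(latent, primary_mask)
    err = np.abs(latent - recon)
    return {
        "primary_max_err": float(err[primary_mask].max()),
        "secondary_max_err": float(err[~primary_mask].max()),
    }

if __name__ == "__main__":
    print(demo())
```

With uniform rounding, the reconstruction error of each element is bounded by half its step size, so the primary region is reconstructed more faithfully while the coarser step in the secondary region leaves fewer distinct symbols to encode. A learned codec would typically fold such a mask into the network or the loss rather than apply it as a fixed post-hoc step.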
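Citation: Song Yuanmeng, Jia Zhengzheng, Jia Zhaodi, Wang Yuchen, Yang Shaohua, Han Zhuohang. Research on Semantic-Aware Image Compression Algorithms [J]. Computer Science and Application, 2024, 14(12): 67-75. https://doi.org/10.12677/csa.2024.1412241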
