A Survey on the Development of Image Data Augmentation
DOI: 10.12677/CSA.2021.112037
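Authors: Feng Xiaoshuo (冯晓硕): Naval Research Academy, Beijing; Shen Yue (沈樾), Wang Dongqi* (王冬琦): Software College, Northeastern University, Shenyang, Liaoning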
Keywords: Data Augmentation; Image Dataset; Image Processing; Deep Learning
Abstract: Image processing and recognition based on deep learning is now quite mature. In some image recognition tasks, however, deep neural networks with many layers have such strong learning capacity that they fit the features of the training samples too closely, and the model overfits the training data. At the same time, the quality of a model trained by a deep-learning image processing algorithm depends closely on the quality and scale of the dataset, yet for practical reasons the available image datasets are often small, of poor image quality, or unevenly distributed across classes. To address these problems, researchers have proposed image data augmentation to improve the scale, quality, and distribution of a model's input data; training a deep learning model on the augmented dataset reduces the likelihood of overfitting. This paper reviews existing image data augmentation techniques in two groups: traditional image processing methods and deep-learning-based methods. The traditional methods comprise geometric, color, and pixel transformations; the machine-learning-based methods comprise automatic data augmentation methods, methods based on generative adversarial networks (GANs), and methods that combine autoencoders (AEs) with GANs. The paper focuses on image fusion, information deletion, and GAN-based image data augmentation, and discusses the ideas, advantages, and disadvantages of the methods covered, to give researchers a starting point for choosing an augmentation method that suits a given image task and thereby improving the dataset and the model's accuracy.
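The traditional transformation families named in the abstract (geometric, color, pixel), together with image fusion and information deletion, can be illustrated with a minimal NumPy sketch. The function names, the fixed mixing weight, the fixed patch position, and the random test images are illustrative assumptions and are not taken from the surveyed papers; published fusion and erasing methods typically sample the weight and the patch location at random.

```python
import numpy as np

def horizontal_flip(img):
    """Geometric transformation: mirror the image left to right."""
    return img[:, ::-1, :].copy()

def adjust_brightness(img, factor):
    """Color transformation: scale every channel, then clip to [0, 255]."""
    out = img.astype(np.float32) * factor
    return np.clip(out, 0, 255).astype(np.uint8)

def add_gaussian_noise(img, sigma, rng):
    """Pixel transformation: add zero-mean Gaussian noise to each pixel."""
    noise = rng.normal(0.0, sigma, size=img.shape)
    out = img.astype(np.float32) + noise
    return np.clip(out, 0, 255).astype(np.uint8)

def blend_images(img_a, img_b, lam):
    """Image fusion: convex combination of two images with weight lam.
    (In a training loop the labels would be mixed with the same weight.)"""
    out = lam * img_a.astype(np.float32) + (1.0 - lam) * img_b.astype(np.float32)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

def erase_patch(img, top, left, size):
    """Information deletion: set a square patch to zero."""
    out = img.copy()
    out[top:top + size, left:left + size, :] = 0
    return out

# Hypothetical 32x32 RGB inputs standing in for real dataset images.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
other = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

flipped = horizontal_flip(image)
brighter = adjust_brightness(image, 1.2)
noisy = add_gaussian_noise(image, sigma=10.0, rng=rng)
blended = blend_images(image, other, lam=0.7)
erased = erase_patch(image, top=8, left=8, size=12)
```

Each function returns a new array of the same shape and dtype as its input, so the outputs can be appended to a training set alongside the original image.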
Cite this article: Feng, X., Shen, Y. and Wang, D. (2021) A Survey on the Development of Image Data Augmentation. Computer Science and Application, 11(2), 370-382. https://doi.org/10.12677/CSA.2021.112037
