Generating Adversarial Example Based on Affine Transformation and Gradient Refining
DOI: 10.12677/CSA.2023.139178, Supported by research project funding
Author: Wang Zhuo: School of Computer Science, Guangdong University of Technology, Guangzhou, Guangdong
Keywords: Adversarial Example, Black-Box Attack, Image Affine Transformation, Gradient Refining, Transferability
Abstract: Although white-box attacks achieve high success rates, the adversarial examples they generate tend to overfit, so their success rate is low when they are used to attack other classification models. To alleviate this overfitting, improve the transferability of adversarial examples, and raise their attack success rate under black-box conditions, this paper proposes AF-R-MI-FGSM, a method for generating adversarial examples based on affine transformation and gradient refining. Instead of using only the original image, the method applies a random affine transformation to the input image at each iteration to increase input diversity, using data augmentation to alleviate overfitting and make the adversarial examples more transferable. Because the random image transformation increases the randomness of the noise gradient and degrades attack performance, this paper also proposes a gradient refining scheme to mitigate the negative gradient effects. In addition, transferability is further improved by attacking an ensemble of models. Experiments on the ImageNet dataset verify the effectiveness of the proposed method: under black-box conditions, compared with MI-FGSM, the average attack success rate of AF-R-MI-FGSM increases by 14.3% for single-model attacks and by 22.1% for ensemble-model attacks.
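The iteration described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation and rests on several assumptions: the classifier is a toy linear softmax model (`W`, `b`), the affine transform is a nearest-neighbour warp with hypothetical ranges for rotation, scale and shift, and gradient refining is approximated by averaging the gradients from `n_refine` random transforms, since the abstract does not specify the refining rule. Ensemble attacks are not covered. All function and parameter names are illustrative.

```python
import numpy as np

def random_affine_indices(h, w, rng, max_shift=2.0, max_angle=0.2, scale_range=(0.9, 1.1)):
    """For each output pixel, the flat index of its nearest source pixel (-1 if outside)."""
    angle = rng.uniform(-max_angle, max_angle)           # rotation in radians (assumed range)
    scale = rng.uniform(*scale_range)                    # isotropic scale (assumed range)
    shift = rng.uniform(-max_shift, max_shift, size=2)   # translation in pixels (assumed range)
    c, s = np.cos(angle) * scale, np.sin(angle) * scale
    ys, xs = np.mgrid[0:h, 0:w]
    yc, xc = ys - (h - 1) / 2, xs - (w - 1) / 2          # coordinates relative to the centre
    src_y = np.rint(c * yc - s * xc + (h - 1) / 2 + shift[0]).astype(int)
    src_x = np.rint(s * yc + c * xc + (w - 1) / 2 + shift[1]).astype(int)
    valid = (src_y >= 0) & (src_y < h) & (src_x >= 0) & (src_x < w)
    return np.where(valid, src_y * w + src_x, -1).ravel()

def loss_grad(z, y, W, b):
    """Gradient of the cross-entropy loss w.r.t. the input of a linear softmax model."""
    logits = W @ z + b
    p = np.exp(logits - logits.max())
    p /= p.sum()
    p[y] -= 1.0                                          # softmax(logits) - onehot(y)
    return W.T @ p

def af_r_mi_fgsm(x, y, W, b, eps=0.1, steps=10, mu=1.0, n_refine=4, seed=0):
    """Momentum iterative FGSM with random affine inputs and averaged gradients."""
    rng = np.random.default_rng(seed)
    h, w = x.shape
    alpha = eps / steps                                  # per-step size
    x_adv = x.copy()
    g = np.zeros(h * w)                                  # accumulated momentum
    for _ in range(steps):
        grad = np.zeros(h * w)
        for _ in range(n_refine):
            idx = random_affine_indices(h, w, rng)
            valid = idx >= 0
            z = np.where(valid, x_adv.ravel()[idx], 0.0)  # warped input, zero padding
            gz = loss_grad(z, y, W, b)
            # Backpropagate through the warp: scatter-add onto the source pixels.
            np.add.at(grad, idx[valid], gz[valid])
        grad /= n_refine                                 # assumed form of gradient refining
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)  # L1-normalised momentum update
        x_adv = x_adv + alpha * np.sign(g).reshape(h, w)
        x_adv = np.clip(np.clip(x_adv, x - eps, x + eps), 0.0, 1.0)
    return x_adv
```

Calling `af_r_mi_fgsm(x, y, W, b)` on an image `x` with values in [0, 1] returns a perturbed image that stays within `eps` of `x` in the L-infinity norm. Averaging over several transforms is one plausible way to damp the extra gradient noise that random transforms introduce; the method in the paper may differ.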
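Citation: Wang Zhuo. Generating Adversarial Example Based on Affine Transformation and Gradient Refining [J]. Computer Science and Application, 2023, 13(9): 1796-1805. https://doi.org/10.12677/CSA.2023.139178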
