Artificial Intelligence Security: Analysis of Adversarial Attacks
DOI: 10.12677/CSA.2019.912249
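Authors: Yi Kaifan*: High School Affiliated to Shanghai Jiao Tong University, Shanghai; Cyberspace Security Practice Workstation, Shanghai Jiao Tong University, Shanghai; Shao Qian: High School Affiliated to Shanghai Jiao Tong University, Shanghai; Chen Min: Cyberspace Security Practice Workstation, Shanghai Jiao Tong University, Shanghai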
Keywords: Artificial Intelligence Security, Deep Learning, Adversarial Attacks
Abstract: With the rapid development and wide application of artificial intelligence, AI security has begun to attract attention. An attacker adds subtle perturbations to normal samples so that a deep learning model misclassifies them; this is called an adversarial example attack. This paper reviews the state of research on adversarial example attacks and studies four classic algorithms: FGSM, DeepFool, JSMA, and CW. It analyzes how efficiently each algorithm generates adversarial examples and how effectively those examples mislead deep learning models, with the aim of providing theoretical guidance for the design of adversarial example detection and defense algorithms.
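To make the idea of a "subtle perturbation" concrete, the following is a minimal sketch of an FGSM-style attack, x_adv = x + eps * sign(grad_x loss). It is not the paper's experimental setup: the model is a hypothetical three-feature logistic regression, and the weights, input, and eps value are made-up assumptions chosen only so that the label flips.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """One FGSM step: move each feature by eps in the direction
    that increases the cross-entropy loss for the true label y."""
    p = predict_proba(w, b, x)
    # For logistic regression, d(loss)/dx_i = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input (illustrative values only).
w = [2.0, -3.0, 1.0]
b = 0.0
x = [0.5, 0.1, 0.2]
y = 1          # true label
eps = 0.3      # maximum change allowed per feature

x_adv = fgsm(w, b, x, y, eps)
print("clean prediction:      ", predict_proba(w, b, x) > 0.5)
print("adversarial prediction:", predict_proba(w, b, x_adv) > 0.5)
```

Each feature moves by at most eps, yet the predicted class changes. Deep networks are attacked the same way, with the input gradient obtained by backpropagation instead of the closed form used here.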
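Cite this article: Yi, K., Shao, Q. and Chen, M. (2019) Artificial Intelligence Security: Analysis of Adversarial Attacks. Computer Science and Application, 9(12), 2239-2248. https://doi.org/10.12677/CSA.2019.912249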
