Study on Convolutional Neural Networks Cascade for Real-time Face Recognition Methods
DOI: 10.12677/CSA.2020.101002
Authors: Jianliang Xu*, Ming'an Zhou, Jianhui Mao, Kunli Fang: Quzhou College of Technology (衢州职业技术学院), Quzhou, Zhejiang
Keywords: Face Detection, Face Recognition, Neural Network, Adaboost Classifier
Abstract: Face recognition technology has developed rapidly and attracted growing attention in recent years. Face recognition built on traditional feature extraction, such as the Local Binary Pattern (LBP), requires a large amount of training data, and the size of the trained model grows with the amount of data, which lengthens the computation time of the recognition system and reduces its efficiency. This paper proposes a three-stage cascade face detector and a face recognition method based on a nearest feature space transformation (Nearest Feature Line, NFL). The feature maps (FM) generated by a Convolutional Neural Network (CNN) are used as the input data of the cascade classifier (Adaboost). The experimental results indicate that the proposed method can build a fast face detector from fewer weak classifiers. For face recognition, the nearest feature space transformation uses the nearest-distance vectors from point to point, point to line, and point to plane as the basis for the covariance calculation. The resulting covariance matrix yields a better feature space: samples projected into it are more general and representative than those projected into a feature space derived from point-to-point distances alone. The method was applied to practical detection and showed good adaptability and accuracy.
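Citation: Jianliang Xu, Ming'an Zhou, Jianhui Mao, Kunli Fang. Study on Convolutional Neural Networks Cascade for Real-time Face Recognition Methods [J]. Computer Science and Application, 2020, 10(1): 11-20. https://doi.org/10.12677/CSA.2020.101002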
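The abstract names point-to-line distance vectors as one input to the covariance calculation but does not give the construction. As a rough illustration only, the sketch below assumes the usual nearest-feature-line projection: a sample x is projected onto the line through two prototypes a and b, and the difference vector is accumulated into a scatter matrix. The function names `point_to_line_vector` and `scatter_matrix` are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only; all names here are hypothetical, not from the paper.

def sub(u, v):
    """Element-wise difference u - v."""
    return [a - b for a, b in zip(u, v)]

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def point_to_line_vector(x, a, b):
    """Vector from the projection of x on the line through a and b, to x."""
    d = sub(b, a)
    denom = dot(d, d)
    if denom == 0.0:
        # a and b coincide, so fall back to the point-to-point vector.
        return sub(x, a)
    t = dot(sub(x, a), d) / denom          # position of the projection on the line
    proj = [ai + t * di for ai, di in zip(a, d)]
    return sub(x, proj)

def scatter_matrix(vectors):
    """Sum of outer products v v^T over the given difference vectors."""
    n = len(vectors[0])
    s = [[0.0] * n for _ in range(n)]
    for v in vectors:
        for i in range(n):
            for j in range(n):
                s[i][j] += v[i] * v[j]
    return s

# Example: x lies directly above the midpoint of the segment from a to b.
v = point_to_line_vector([1.0, 2.0], [0.0, 0.0], [2.0, 0.0])
S = scatter_matrix([v])
```

In a full method, a matrix of this kind would presumably be accumulated over many sample and prototype pairs (and over the point-to-point and point-to-plane cases) before the projection is computed; that step is not specified in the abstract and is left out here.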
