Improved YOLO11n Fatigue Driving Target Detection Method
DOI: 10.12677/sea.2025.145092
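Authors: Wang Yetong (王业童), Zhang Liyan (张丽艳)*: Department of Electronic Communication, School of Rail Intelligent Engineering, Dalian Jiaotong University, Dalian, Liaoning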
Keywords: Fatigue Driving; YOLO11n; Self-Attention Mechanism; Detail-Enhanced Convolution
Abstract: In recent years, as the number of cars has grown, traffic accidents caused by fatigued driving have become frequent. To address the insufficient accuracy, poor robustness, and large model size of current object detection algorithms, this paper proposes an efficient facial state detection model for fatigue driving analysis based on YOLO11n. First, because small targets such as the eyes and mouth are easily affected by resolution, angular offset, and occlusion, the pinwheel-shaped convolution PSConv, which uses asymmetric padding, is introduced into the C3K2 feature extraction unit so that the module captures a wider range of contextual information when extracting features. Second, because the self-attention mechanism in the original model's C3PSA module consumes substantial computational resources, the statistics-based attention operator TSSA is introduced; it captures features and focuses on target regions through statistical analysis of token features. Dynamic Tanh is also introduced as the normalization layer in the attention mechanism, adding a nonlinear normalization operation without extra computational cost, which improves detection accuracy, reduces model size, and strengthens robustness. Finally, to further reduce the parameter count, a lightweight shared convolution head is introduced, and detail-enhanced convolution is integrated into it to compensate for detection accuracy. Validation experiments on a public fatigue driving dataset show that, compared with the baseline model, the improved model raises mAP50 by 1 percentage point and mAP50-95 by 4.1 percentage points, reduces the parameter count by nearly 23%, and increases the frame rate by 7 frames per second. The module-level improvements thus achieve both a lighter model and higher detection accuracy, and can provide high-precision feature information for subsequent fatigue driving assessment.
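As a rough illustration of the Dynamic Tanh layer mentioned in the abstract, the sketch below applies the published DyT formulation, DyT(x) = γ · tanh(αx) + β, element-wise in plain Python. The function name `dynamic_tanh`, the list-based interface, and the default α = 0.5 are assumptions made for illustration. This is not the authors' implementation, in which α, γ, and β would be learnable parameters inside the attention block.

```python
import math


def dynamic_tanh(x, alpha=0.5, gamma=None, beta=None):
    """Element-wise DyT: gamma * tanh(alpha * x) + beta.

    x     : list of floats (one value per channel; hypothetical interface)
    alpha : scalar that controls how sharply inputs are squashed
    gamma : per-channel scale, defaults to all ones
    beta  : per-channel shift, defaults to all zeros
    """
    n = len(x)
    gamma = [1.0] * n if gamma is None else gamma
    beta = [0.0] * n if beta is None else beta
    # Unlike LayerNorm, no mean or variance is computed over the input:
    # each value is squashed independently, then scaled and shifted.
    return [g * math.tanh(alpha * v) + b for v, g, b in zip(x, gamma, beta)]


if __name__ == "__main__":
    features = [-8.0, -1.0, 0.0, 1.0, 8.0]
    print(dynamic_tanh(features))
```

Because the operation needs no reduction over tokens or channels, it avoids the statistics computation of a conventional normalization layer, which is consistent with the abstract's claim that it adds no extra computational cost.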
Cite this article: Wang, Y. and Zhang, L. (2025) Improved YOLO11n Fatigue Driving Target Detection Method. Software Engineering and Applications, 14(5), 1035-1044. https://doi.org/10.12677/sea.2025.145092
