|
[1]
|
Wang, C.Y., Yeh, I.H. and Mark Liao, H.Y. (2024) YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. European Conference on Computer Vision, Milan, 29 September-4 October, 1-21.
|
|
[2]
|
Bochkovskiy, A., Wang, C.Y. and Liao, H.Y.M. (2004) YOLOv4: Optimal Speed and Accuracy of Object Detection. https://arxiv.org/abs/2004.10934
|
|
[3]
|
Ge, Z., Liu, S., Wang, F., et al. (2021) YOLOx: Exceeding YOLO Series in 2021. https://arxiv.org/abs/2107.08430
|
|
[4]
|
Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A. and Zagoruyko, S. (2020) End-to-End Object Detection with Transformers. In: Lecture Notes in Computer Science, Springer, 213-229. [Google Scholar] [CrossRef]
|
|
[5]
|
Dosovitskiy, A., Beyer, L., Kolesnikov, A., et al. (2020) An Image Is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. https://arxiv.org/abs/2010.11929
|
|
[6]
|
Bao, H., Dong, L., Piao, S., et al. (2021) Beit: Bert Pre-Training of Image Transformers. https://arxiv.org/abs/2106.08254
|
|
[7]
|
Feng, C., Zhong, Y., Gao, Y., Scott, M.R. and Huang, W. (2021) TOOD: Task-Aligned One-Stage Object Detection. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, 10-17 October 2021, 3490-3499. [Google Scholar] [CrossRef]
|
|
[8]
|
Chen, Y., Yuan, X., Wang, J., Wu, R., Li, X., Hou, Q., et al. (2025) YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-Time Object Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 47, 4240-4252. [Google Scholar] [CrossRef] [PubMed]
|
|
[9]
|
Gao, S.H., Cheng, M.M., Zhao, K., et al. (2019) Res2net: A New Multi-Scale Backbone Architecture. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43, 652-662. [Google Scholar] [CrossRef] [PubMed]
|
|
[10]
|
Zhong, J., Chen, J. and Mian, A. (2023) DualConv: Dual Convolutional Kernels for Lightweight Deep Neural Networks. IEEE Transactions on Neural Networks and Learning Systems, 34, 9528-9535. [Google Scholar] [CrossRef] [PubMed]
|
|
[11]
|
Liu, Y.C., Shao, Z.R., Teng, Y.Y., et al. (2022) NAM: Normalization-Based Attention Module. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, 22-23 September 2022, 12345-12354.
|
|
[12]
|
Ding, M., Xiao, B., Codella, N., Luo, P., Wang, J. and Yuan, L. (2022) Davit: Dual Attention Vision Transformers. In: Lecture Notes in Computer Science, Springer, 74-92. [Google Scholar] [CrossRef]
|
|
[13]
|
Chen, K., Lin, W., Li, J., See, J., Wang, J. and Zou, J. (2020) Ap-Loss for Accurate One-Stage Object Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43, 3782-3798. [Google Scholar] [CrossRef] [PubMed]
|
|
[14]
|
Ge, Z., Liu, S., Li, Z., Yoshie, O. and Sun, J. (2021) OTA: Optimal Transport Assignment for Object Detection. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, 20-25 June 2021, 303-312. [Google Scholar] [CrossRef]
|