|
[1]
|
Lowe, D.G. (2004) Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60, 91-110.
|
|
[2]
|
Dalal, N. and Triggs, B. (2005) Histograms of Oriented Gradients for Human Detection. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), 1, 886-893.
|
|
[3]
|
Girshick, R., Donahue, J., Darrell, T. and Malik, J. (2014) Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, 23-28 June 2014, 580-587.
|
|
[4]
|
Girshick, R. (2015) Fast R-CNN. 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 7-13 December 2015, 1440-1448.
|
|
[5]
|
Ren, S., He, K., Girshick, R. and Sun, J. (2017) Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 1137-1149.
|
|
[6]
|
Redmon, J., Divvala, S., Girshick, R. and Farhadi, A. (2016) You Only Look Once: Unified, Real-Time Object Detection. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 27-30 June 2016, 779-788.
|
|
[7]
|
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y. and Berg, A.C. (2016) SSD: Single Shot MultiBox Detector. Proceedings of the European Conference on Computer Vision (ECCV), Springer, Cham, 21-37.
|
|
[8]
|
Lin, T.-Y., Goyal, P., Girshick, R., He, K. and Dollár, P. (2017) Focal Loss for Dense Object Detection. 2017 IEEE International Conference on Computer Vision (ICCV), Venice, 22-29 October 2017, 2999-3007.
|
|
[9]
|
Vaswani, A., Shazeer, N., Parmar, N., et al. (2017) Attention Is All You Need. Neural Information Processing Systems (NIPS), Long Beach, CA, 4-9 December 2017, 5998-6008.
|
|
[10]
|
Dosovitskiy, A., Beyer, L., Kolesnikov, A., et al. (2021) An Image Is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. International Conference on Learning Representations (ICLR), Vienna, 3-7 May 2021.
|
|
[11]
|
Liu, Z., et al. (2021) Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, 10-17 October 2021, 9992-10002.
|
|
[12]
|
Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B. and Belongie, S. (2017) Feature Pyramid Networks for Object Detection. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 21-26 July 2017, 936-944.
|