[1] Wang, H., et al. (2011) Action Recognition by Dense Trajectories. The 24th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2011, Colorado Springs, 20-25 June 2011, 3169-3176.
[2] Simonyan, K. and Zisserman, A. (2014) Two-Stream Convolutional Networks for Action Recognition in Videos. 28th Annual Conference on Neural Information Processing Systems (NIPS 2014), Montreal, 8-13 December 2014, 568-576.
[3] Wang, L., et al. (2016) Temporal Segment Networks: Towards Good Practices for Deep Action Recognition.
[4] Zhou, B., et al. (2018) Temporal Relational Reasoning in Videos.
[5] Tran, D., et al. (2015) Learning Spatiotemporal Features with 3D Convolutional Networks. 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 7-13 December 2015, 4489-4497.
[6] Carreira, J. and Zisserman, A. (2017) Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, 21-26 July 2017, 4724-4733.
[7] Qiu, Z.F., et al. (2017) Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks. 2017 IEEE International Conference on Computer Vision (ICCV), Venice, 22-29 October 2017, 5534-5542.
[8] Danelljan, M., et al. (2017) ECO: Efficient Convolution Operators for Tracking. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, 21-26 July 2017, 6931-6939.
[9] Feichtenhofer, C. (2020) X3D: Expanding Architectures for Efficient Video Recognition. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 14-19 June 2020, 200-210.
[10] Wu, C.-Y., et al. (2018) Compressed Video Action Recognition. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 18-23 June 2018, 6026-6035.
[11] Zhu, Y., et al. (2018) Hidden Two-Stream Convolutional Networks for Action Recognition. 14th Asian Conference on Computer Vision, Perth, 2-6 December 2018, 363-378.
[12] Lin, J., et al. (2019) TSM: Temporal Shift Module for Efficient Video Understanding. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, 27-28 October 2019, 7082-7092.
[13] Jiang, B.Y., et al. (2019) STM: SpatioTemporal and Motion Encoding for Action Recognition. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, 27-28 October 2019, 2000-2009.
[14] Li, Y., et al. (2020) TEA: Temporal Excitation and Aggregation for Action Recognition. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 14-19 June 2020, 906-915.
[15] Tran, D., et al. (2018) A Closer Look at Spatiotemporal Convolutions for Action Recognition. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 18-23 June 2018, 6450-6459.