|
[1]
|
Jing, L. and Tian, Y. (2018) Self-Supervised Spatiotemporal Feature Learning by Video Geometric Transformations.
|
|
[2]
|
Ahsan, U., Madhok, R. and Essa, I. (2019) Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition. 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa, 7-11 January 2019, 179-189.
|
|
[3]
|
Xu, D., Xiao, J., Zhao, Z., et al. (2019) Self-Supervised Spatiotemporal Learning via Video Clip Order Prediction. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, 15-20 June 2019, 10326-10335.
|
|
[4]
|
Yao, Y., Liu, C., Luo, D., et al. (2020) Video Playback Rate Perception for Self-Supervised Spatio-Temporal Representation Learning. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 13-19 June 2020, 6547-6556.
|
|
[5]
|
Benaim, S., Ephrat, A., Lang, O., et al. (2020) SpeedNet: Learning the Speediness in Videos. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 13-19 June 2020, 9919-9928.
|
|
[6]
|
Liang, H., Quader, N., Chi, Z., et al. (2021) Self-Supervised Spatiotemporal Representation Learning by Exploiting Video Continuity. The 36th AAAI Conference on Artificial Intelligence (AAAI-22), 22 February-1 March 2022, 1564-1573.
|
|
[7]
|
Kim, D., Cho, D. and Kweon, I.S. (2018) Self-Supervised Video Representation Learning with Space-Time Cubic Puzzles. Proceedings of the AAAI Conference on Artificial Intelligence, 33, 8545-8552.
|
|
[8]
|
Piergiovanni, A.J., Angelova, A. and Ryoo, M.S. (2020) Evolving Losses for Unsupervised Video Representation Learning. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 13-19 June 2020, 130-139.
|
|
[9]
|
Huang, L., Liu, Y., Wang, B., et al. (2021) Self-Supervised Video Representation Learning by Context and Motion Decoupling. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, 20-25 June 2021, 13881-13890.
|
|
[10]
|
Dave, I., Gupta, R., Rizve, M.N., et al. (2021) TCLR: Temporal Contrastive Learning for Video Representation. Computer Vision and Image Understanding, 219, Article ID: 103406.
|
|
[11]
|
Wang, J., Jiao, J. and Liu, Y.H. (2020) Self-Supervised Video Representation Learning by Pace Prediction. Computer Vision - ECCV 2020: 16th European Conference, Glasgow, 23-28 August 2020, 504-521.
|
|
[12]
|
Bai, Y., Fan, H., Misra, I., et al. (2020) Can Temporal Information Help with Contrastive Self-Supervised Learning?
|
|
[13]
|
Kay, W., Carreira, J., Simonyan, K., et al. (2017) The Kinetics Human Action Video Dataset.
|
|
[14]
|
Soomro, K., Zamir, A.R. and Shah, M. (2012) UCF101: A Dataset of 101 Human Actions Classes from Videos in the Wild.
|
|
[15]
|
Kuehne, H., Jhuang, H., Garrote, E., et al. (2011) HMDB: A Large Video Database for Human Motion Recognition. IEEE International Conference on Computer Vision, Barcelona, 6-13 November 2011, 2556-2563.
|
|
[16]
|
Chen, X., Xie, S. and He, K. (2021) An Empirical Study of Training Self-Supervised Vision Transformers. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, 10-17 October 2021, 9620-9629.
|
|
[17]
|
Feichtenhofer, C., Fan, H., Malik, J., et al. (2019) SlowFast Networks for Video Recognition. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, 27 October-2 November 2019, 6201-6210.
|
|
[18]
|
Behrmann, N., Fayyaz, M., Gall, J., et al. (2021) Long Short View Feature Decomposition via Contrastive Video Representation Learning. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, 10-17 October 2021, 9224-9233.
|
|
[19]
|
Wang, J., Gao, Y., Li, K., et al. (2021) Removing the Background by Adding the Background: Towards Background Robust Self-Supervised Video Representation Learning. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, 20-25 June 2021, 11799-11808.
|
|
[20]
|
Han, T., Xie, W. and Zisserman, A. (2020) Self-Supervised Co-Training for Video Representation Learning.
|
|
[21]
|
Luo, D., Fang, B., Zhou, Y., et al. (2020) Exploring Relations in Untrimmed Videos for Self-Supervised Learning. ACM Transactions on Multimedia Computing, Communications, and Applications, 18, Article No. 35.
|
|
[22]
|
Liu, Y., Wang, K., Lan, H., et al. (2021) Temporal Contrastive Graph Learning for Video Action Recognition and Retrieval.
|
|
[23]
|
Zhang, Y., Po, L.M., Xu, X., et al. (2021) Contrastive Spatio-Temporal Pretext Learning for Self-Supervised Video Representation. The 36th AAAI Conference on Artificial Intelligence (AAAI-22), 22 February-1 March 2022, 3380-3389.
|