[1] Perazzi, F., Pont-Tuset, J., McWilliams, B., Van Gool, L., Gross, M. and Sorkine-Hornung, A. (2016) A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, 27-30 June 2016, 724-732.
[2] Xu, N., Yang, L., Fan, Y., Yang, J., Yue, D., Liang, Y., et al. (2018) YouTube-VOS: Sequence-to-Sequence Video Object Segmentation. In: Lecture Notes in Computer Science, Springer, 603-619.
[3] Zhou, T., Porikli, F., Crandall, D.J., Van Gool, L. and Wang, W. (2023) A Survey on Deep Learning Technique for Video Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45, 7099-7122.
[4] Hu, R., Rohrbach, M., Andreas, J., Darrell, T. and Saenko, K. (2017) Modeling Relationships in Referential Expressions with Compositional Modular Networks. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, 21-26 July 2017, 4418-4427.
[5] Yu, L., Lin, Z., Shen, X., Yang, J., Lu, X., Bansal, M., et al. (2018) MAttNet: Modular Attention Network for Referring Expression Comprehension. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 18-23 June 2018, 1307-1315.
[6] Hu, R., Rohrbach, M. and Darrell, T. (2016) Segmentation from Natural Language Expressions. In: Lecture Notes in Computer Science, Springer, 108-124.
[7] Shi, H., Li, H., Meng, F. and Wu, Q. (2018) Key-Word-Aware Network for Referring Expression Image Segmentation. In: Lecture Notes in Computer Science, Springer, 38-54.
[8] Khoreva, A., Rohrbach, A. and Schiele, B. (2019) Video Object Segmentation with Language Referring Expressions. In: Lecture Notes in Computer Science, Springer, 123-141.
[9] Seo, S., Lee, J. and Han, B. (2020) URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark. In: Lecture Notes in Computer Science, Springer, 208-223.
[10] Liang, C., Wu, Y., Zhou, T., Wang, W., Yang, Z., Wei, Y. and Yang, Y. (2021) Rethinking Cross-Modal Interaction from a Top-Down Perspective for Referring Video Object Segmentation.
[11] Wu, D., Dong, X., Shao, L. and Shen, J. (2022) Multi-Level Representation Learning with Semantic Alignment for Referring Video Object Segmentation. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, 18-24 June 2022, 4986-4995.
[12] Li, H., Wu, Z., Shrivastava, A. and Davis, L.S. (2022) Rethinking Pseudo Labels for Semi-Supervised Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 36, 1314-1322.
[13] Wang, Y., Wang, H., Shen, Y., Fei, J., Li, W., Jin, G., et al. (2022) Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, 18-24 June 2022, 4238-4247.
[14] Xu, Y., Shang, L., Ye, J., Qian, Q., et al. (2021) Dash: Semi-Supervised Learning with Dynamic Thresholding. International Conference on Machine Learning, Online, 18-24 July 2021, 11525-11536.
[15] Berthelot, D., Carlini, N., Cubuk, E.D., Kurakin, A., et al. (2019) ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring. 2019 International Conference on Learning Representations, New Orleans, 6-9 May 2019.
[16] Xie, Q., Luong, M., Hovy, E. and Le, Q.V. (2020) Self-Training with Noisy Student Improves ImageNet Classification. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 13-19 June 2020, 10684-10695.
[17] Settles, B. (2009) Active Learning Literature Survey. Computer Sciences Technical Report 1648.
[18] Pont-Tuset, J., Perazzi, F., Caelles, S., Arbeláez, P., Sorkine-Hornung, A. and Van Gool, L. (2017) The 2017 DAVIS Challenge on Video Object Segmentation.
[19] Li, D., Li, R., Wang, L., Wang, Y., Qi, J., Zhang, L., et al. (2022) You Only Infer Once: Cross-Modal Meta-Transfer for Referring Video Object Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 36, 1297-1305.
[20] He, K., Zhang, X., Ren, S. and Sun, J. (2016) Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, 27-30 June 2016, 770-778.
[21] Tarvainen, A. and Valpola, H. (2017) Mean Teachers Are Better Role Models: Weight-Averaged Consistency Targets Improve Semi-Supervised Deep Learning Results. 2017 Conference on Neural Information Processing Systems, Long Beach, 4-9 December 2017.
[22] Bellver, M., Ventura, C., Silberer, C., Kazakos, I., Torres, J. and Giro-i-Nieto, X. (2022) A Closer Look at Referring Expressions for Video Object Segmentation. Multimedia Tools and Applications, 82, 4419-4438.
[23] Liu, S., Hui, T., Huang, S., Wei, Y., Li, B. and Li, G. (2021) Cross-Modal Progressive Comprehension for Referring Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44, 4761-4775.
[24] Ding, Z., Hui, T., Huang, J., Wei, X., Han, J. and Liu, S. (2022) Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, 18-24 June 2022, 4954-4963.
[25] Feng, G., Zhang, L., Hu, Z. and Lu, H. (2022) Deeply Interleaved Two-Stream Encoder for Referring Video Segmentation.
[26] Liang, C., Wang, W., Zhou, T., Miao, J., Luo, Y. and Yang, Y. (2023) Local-Global Context Aware Transformer for Language-Guided Video Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45, 10055-10069.