|
[1]
|
Li, S., Xiao, T., Li, H., Zhou, B., Yue, D. and Wang, X. (2017) Person Search with Natural Language Description. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, 21-26 July 2017, 1970-1979. [Google Scholar] [CrossRef]
|
|
[2]
|
Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S. and Sutskever, I. (2021) Learning Transferable Visual Models from Natural Language Supervision. International Conference on Machine Learning, Vienna, 18-24 July 2021, 8748-8763.
|
|
[3]
|
Li, S., Xiao, T., Li, H., Yang, W. and Wang, X. (2017) Identity-Aware Textual-Visual Matching with Latent Co-attention. 2017 IEEE International Conference on Computer Vision (ICCV), Venice, 22-29 October 2017, 1890-1899. [Google Scholar] [CrossRef]
|
|
[4]
|
Chen, T., Xu, C. and Luo, J. (2018) Improving Text-Based Person Search by Spatial Matching and Adaptive Threshold. 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, 12-15 March 2018, 1879-1887. [Google Scholar] [CrossRef]
|
|
[5]
|
Simonyan, K. and Zisserman, A. (2014) Very Deep Convolutional Networks for Large-Scale Image Recognition. https://arxiv.org/pdf/1409.1556
|
|
[6]
|
Hochreiter, S. and Schmidhuber, J. (1997) Long Short-Term Memory. Neural Computation, 9, 1735-1780. [Google Scholar] [CrossRef] [PubMed]
|
|
[7]
|
Han, X., He, S., Zhang, L. and Xiang, T. (2021) Text-Based Person Search with Limited Data. [Google Scholar] [CrossRef]
|
|
[8]
|
Yan, S., Dong, N., Zhang, L. and Tang, J. (2023) Clip-Driven Fine-Grained Text-Image Person Re-Identification. IEEE Transactions on Image Processing, 32, 6032-6046. [Google Scholar] [CrossRef] [PubMed]
|
|
[9]
|
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., et al. (2017) Attention Is All You Need. Advances in Neural Information Processing Systems, 30, 5998-6008.
|
|
[10]
|
Jia, C., Yang, Y., Xia, Y., Chen, Y.T., et al. (2021) Scaling up Visual and Vision-Language Representation Learning with Noisy Text Supervision. International Conference on Machine Learning, Vienna, 18-24 July 2021, 4904-4916.
|
|
[11]
|
Devlin, J., Chang, M.W., Lee, K. and Toutanova, K. (2019) Bert: Pre-Training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1, 4171-4186.
|
|
[12]
|
Zhang, Y. and Lu, H. (2018) Deep Cross-Modal Projection Learning for Image-Text Matching. In: Lecture Notes in Computer Science, Springer, 707-723. [Google Scholar] [CrossRef]
|
|
[13]
|
Sarafianos, N., Xu, X. and Kakadiaris, I. (2019) Adversarial Representation Learning for Text-to-Image Matching. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, 27 October-2 November 2019, 5814-5824. [Google Scholar] [CrossRef]
|
|
[14]
|
Wang, Z., Fang, Z., Wang, J. and Yang, Y. (2020) Vitaa: Visual-Textual Attributes Alignment in Person Search by Natural Language. In: Lecture Notes in Computer Science, Springer, 402-420. [Google Scholar] [CrossRef]
|
|
[15]
|
Gao, C., Cai, G., Jiang, X., Zheng, F., et al. (2021) Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search. https://arxiv.org/pdf/2101.03036
|
|
[16]
|
Zhu, A., Wang, Z., Li, Y., Wan, X., Jin, J., Wang, T., et al. (2021) DSSL: Deep Surroundings-Person Separation Learning for Text-Based Person Retrieval. Proceedings of the 29th ACM International Conference on Multimedia, Chengdu, 20-24 October 2021, 209-217. [Google Scholar] [CrossRef]
|
|
[17]
|
Ding, Z., Ding, C., Shao, Z. and Tao, D. (2021) Semantically Self-Aligned Network for Text-to-Image Part-Aware Person Re-Identification. https://arxiv.org/pdf/2107.12666
|
|
[18]
|
Yan, S., Tang, H., Zhang, L. and Tang, J. (2024) Image-Specific Information Suppression and Implicit Local Alignment for Text-Based Person Search. IEEE Transactions on Neural Networks and Learning Systems, 35, 17973-17986. [Google Scholar] [CrossRef] [PubMed]
|
|
[19]
|
Wang, Z., Zhu, A., Xue, J., Wan, X., Liu, C., Wang, T., et al. (2022) Look before You Leap: Improving Text-Based Person Retrieval by Learning a Consistent Cross-Modal Common Manifold. Proceedings of the 30th ACM International Conference on Multimedia, Lisboa, 10-14 October 2022, 1984-1992. [Google Scholar] [CrossRef]
|
|
[20]
|
Li, S., Cao, M. and Zhang, M. (2022) Learning Semantic-Aligned Feature Representation for Text-Based Person Search. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23-27 May 2022, 2724-2728. [Google Scholar] [CrossRef]
|
|
[21]
|
Chen, Y., Zhang, G., Lu, Y., Wang, Z. and Zheng, Y. (2022) TIPCB: A Simple but Effective Part-Based Convolutional Baseline for Text-Based Person Search. Neurocomputing, 494, 171-181. [Google Scholar] [CrossRef]
|
|
[22]
|
Shu, X., Wen, W., Wu, H., Chen, K., Song, Y., Qiao, R., et al. (2022) See Finer, See More: Implicit Modality Alignment for Text-Based Person Retrieval. In: Lecture Notes in Computer Science, Springer, 624-641. [Google Scholar] [CrossRef]
|
|
[23]
|
Liu, Y., Li, Y., Liu, Z., Yang, W., Wang, Y. and Liao, Q. (2024) Clip-Based Synergistic Knowledge Transfer for Text-Based Person Retrieval. 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, 14-19 April 2024, 7935-7939. [Google Scholar] [CrossRef]
|
|
[24]
|
Huang, Y., Zhang, C., Li, Z., Wang, Z. and Wei, C. (2025) Prototypical Graph Alignment for Text-Based Person Search. 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, 6-11 April 2025, 1-5. [Google Scholar] [CrossRef]
|