Label Consistency Hashing for Cross-Modal Retrieval
DOI: 10.12677/CSA.2021.114114
Author: Zhihu Liu, School of Computers, Guangdong University of Technology, Guangzhou, Guangdong
Keywords: Cross-Modal Retrieval, Hashing, Matrix Factorization
Abstract: Cross-modal retrieval must cope with heterogeneity and a semantic gap between data of different modalities. To address this, this paper proposes a new supervised hashing method. The method uses matrix factorization to learn representations of the training data in a low-dimensional latent semantic space. Label information is treated as a separate modality and is likewise mapped into a low-dimensional latent semantic subspace by matrix factorization. The correlation among these representations is then maximized in the subspace to obtain the corresponding low-dimensional latent semantic representations. Finally, an orthogonal rotation matrix is used to learn better-performing hash functions, from which the hash codes are obtained. Extensive experiments on three widely used datasets, Wiki, MIRFlickr and NUS-WIDE, together with comparisons against several common cross-modal hashing methods, demonstrate the superiority of the proposed algorithm.
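The last step of the abstract, learning hash functions through an orthogonal rotation matrix, can be illustrated with a minimal sketch. The abstract does not give the paper's objective function, so the code below assumes a common formulation: alternate between binarizing the rotated latent representation and solving an orthogonal Procrustes problem for the rotation. The names `learn_rotation`, `hash_codes`, `V` and `R` are hypothetical, and the random matrix `V` only stands in for the latent representation that the matrix factorization step would produce.

```python
import numpy as np


def learn_rotation(V, n_iter=50, seed=0):
    """Learn an orthogonal rotation R that reduces ||sign(V R) - V R||_F.

    V is an (n_samples, n_bits) latent representation (assumed zero-centered).
    This is an illustrative alternating scheme, not the paper's exact method.
    """
    rng = np.random.default_rng(seed)
    k = V.shape[1]
    # Random orthogonal initialization from a QR decomposition.
    R, _ = np.linalg.qr(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        # Step 1: fix R, binarize the rotated representation.
        B = np.where(V @ R >= 0, 1.0, -1.0)
        # Step 2: fix B, solve the orthogonal Procrustes problem
        #   min_R ||B - V R||_F  subject to  R^T R = I,
        # whose solution is R = U W^T with V^T B = U S W^T.
        U, _, Wt = np.linalg.svd(V.T @ B)
        R = U @ Wt
    return R


def hash_codes(V, R):
    """Map latent vectors to {-1, +1} hash codes with the learned rotation."""
    return np.where(V @ R >= 0, 1, -1).astype(np.int8)


# Stand-in data: 200 samples with a 16-dimensional latent representation.
rng = np.random.default_rng(1)
V = rng.standard_normal((200, 16))
V -= V.mean(axis=0)

R = learn_rotation(V)
codes = hash_codes(V, R)
```

Because `R` is orthogonal, the rotation preserves distances in the latent space and only changes how the quantization error is spread across the bits.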
Cite this article: Liu, Z.H. (2021) Label Consistency Hashing for Cross-Modal Retrieval. Computer Science and Application, 11(4), 1104-1112. https://doi.org/10.12677/CSA.2021.114114
