Clothing Image Retrieval Method Based on Convolutional Block Attention Model
DOI: 10.12677/CSA.2022.125132. Supported by national science and technology funding.
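Authors: Yiliang Xuan (宣益亮), Xuance Su (宿轩策): College of Information Science and Technology, Donghua University, Shanghai; Xiaofei Liao* (廖小飞): College of Information Science and Technology, Donghua University, Shanghai; Engineering Research Center of Digitized Textile and Apparel Technology, Ministry of Education, Donghua University, Shanghai
Keywords: Deep Learning, Clothing Retrieval, Attention Mechanism, Residual Network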
Abstract: Clothing images photographed in everyday scenes inevitably contain noise such as background clutter or occlusion, which degrades feature extraction and in turn lowers retrieval accuracy. To address this, a clothing image retrieval method that combines a convolutional neural network with an attention mechanism is proposed: a lightweight, general-purpose attention module is added to a ResNet50 feature extraction network. By refining the feature map along two independent dimensions, channel and spatial, the module increases the attention paid to the clothing region during feature extraction and suppresses the background region, thereby improving the expressive power of the image features. The network is trained with a Triplet Loss function, and image similarity is measured by the Euclidean distance between feature vectors. The proposed method is compared with other retrieval methods on the DeepFashion dataset, and the results show that it effectively excludes image background interference and improves retrieval accuracy.
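Cite this article: Xuan, Y., Liao, X. and Su, X. (2022) Clothing Image Retrieval Method Based on Convolutional Block Attention Model. Computer Science and Application, 12(5), 1331-1340. https://doi.org/10.12677/CSA.2022.125132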
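
The training objective and similarity measure named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors are made-up 2-D toy values standing in for ResNet50 embeddings, the attention module is not modeled, and the margin of 0.3, the use of non-squared distances, and the function names `euclidean`, `triplet_loss` and `retrieve` are all assumptions chosen for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge-style triplet loss: max(d(a, p) - d(a, n) + margin, 0).

    The margin value is an assumption; the abstract does not state it.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)


def retrieve(query, gallery, top_k=3):
    """Return the top_k gallery ids ordered by ascending distance to query."""
    ranked = sorted(gallery.items(), key=lambda item: euclidean(query, item[1]))
    return [name for name, _ in ranked[:top_k]]


# Toy 2-D embeddings (hypothetical values, for illustration only).
gallery = {
    "shirt_a": [0.0, 1.0],
    "shirt_b": [0.1, 0.9],
    "dress_c": [1.0, 0.0],
}
query = [0.0, 0.8]

# Nearest gallery items to the query under Euclidean distance.
top2 = retrieve(query, gallery, top_k=2)

# Positive already much closer than the negative: loss is clipped to zero.
loss_easy = triplet_loss([0.0, 0.0], [0.0, 0.1], [1.0, 0.0])

# Negative closer than the positive: the loss is positive.
loss_hard = triplet_loss([0.0, 0.0], [1.0, 0.0], [0.0, 0.1])

print(top2, loss_easy, loss_hard)
```

In this toy setup the loss is zero once the positive sample is closer to the anchor than the negative by at least the margin, so only triplets that violate the margin contribute to training; at query time the same distance function ranks gallery images.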
