Research on Efficient E-Commerce Commodity Classification Using Multi-Scale Feature Fusion CNN
Abstract: E-commerce product images exhibit large scale variation and heavy background clutter, and classification must meet strict real-time requirements. To address these challenges, this paper proposes MSFF-EComNet, an efficient convolutional neural network built on multi-scale feature fusion. First, a multi-scale parallel convolution architecture (MSFF) uses multi-branch receptive fields to capture a product's local details and global semantic features simultaneously. Second, an improved attention synergy mechanism (IASM) jointly enhances the spatial and channel dimensions, guiding the model to focus precisely on the product subject while effectively suppressing background noise. Experimental results on the DeepFashion dataset show that the model reaches an accuracy of 89.56%, improving on the classic ResNet-50 by 4.44% while reducing parameters by 54%. This work offers a solution that balances accuracy and efficiency for image recognition in large-scale e-commerce scenarios.
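The abstract does not include implementation details for MSFF or IASM, so the following is only a hypothetical NumPy toy illustrating the two ideas it describes: a pooling-pyramid stand-in for multi-branch multi-scale fusion (the actual MSFF uses parallel convolutions with different receptive fields), and an SE-style channel gate followed by a simple spatial gate standing in for the channel–spatial attention of IASM. All names (`msff_block`, `channel_spatial_attention`), shapes, and weights here are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def avg_pool(x, k):
    """Average-pool an (H, W, C) map over non-overlapping k-by-k windows."""
    H, W, C = x.shape
    return x[: H - H % k, : W - W % k].reshape(H // k, k, W // k, k, C).mean(axis=(1, 3))

def upsample(x, k):
    """Nearest-neighbour upsample by factor k along H and W."""
    return x.repeat(k, axis=0).repeat(k, axis=1)

def msff_block(x, scales=(1, 2, 4)):
    """Multi-scale fusion stand-in: pool at several scales, upsample back, concatenate channels."""
    H, W, C = x.shape
    branches = []
    for s in scales:
        b = x if s == 1 else upsample(avg_pool(x, s), s)
        branches.append(b[:H, :W])
    return np.concatenate(branches, axis=-1)  # (H, W, C * len(scales))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_spatial_attention(x, w1, w2):
    """SE-style channel attention followed by a simple spatial gate."""
    # Channel attention: global average pool -> bottleneck MLP -> sigmoid channel weights.
    squeeze = x.mean(axis=(0, 1))                          # (C,)
    excite = sigmoid(np.maximum(squeeze @ w1, 0.0) @ w2)   # (C,) in (0, 1)
    x = x * excite                                         # reweight channels
    # Spatial attention: gate each position by its channel-mean response.
    gate = sigmoid(x.mean(axis=-1, keepdims=True))         # (H, W, 1) in (0, 1)
    return x * gate

# Toy 8x8 feature map with 4 channels.
x = rng.standard_normal((8, 8, 4))
fused = msff_block(x)                                      # (8, 8, 12)
w1 = rng.standard_normal((12, 3))
w2 = rng.standard_normal((3, 12))
out = channel_spatial_attention(fused, w1, w2)
print(fused.shape, out.shape)                              # (8, 8, 12) (8, 8, 12)
```

Because both gates are sigmoids, every output activation is attenuated rather than amplified, which is how the attention suppresses background responses relative to the product subject.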
Article citation: Zhang, T. (2026) Research on Efficient E-Commerce Commodity Classification Using Multi-Scale Feature Fusion CNN. E-Commerce Letters, 15, 84-91. https://doi.org/10.12677/ecl.2026.155491

References

[1] Ji, C., Gao, Z., Qin, J., et al. (2022) A Survey of Image Classification Algorithms Based on Convolutional Neural Networks. Journal of Computer Applications, 42, 1044-1049. (In Chinese)
[2] Xia, W., Xia, X. and Zou, J. (2025) A Survey of Remote Sensing Image Classification Based on Convolutional Neural Networks. Mechanical & Electrical Engineering Technology, 54, 1-8. (In Chinese)
[3] Krizhevsky, A., Sutskever, I. and Hinton, G.E. (2017) ImageNet Classification with Deep Convolutional Neural Networks. Communications of the ACM, 60, 84-90.
[4] He, K., Zhang, X., Ren, S. and Sun, J. (2016) Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, 27-30 June 2016, 770-778.
[5] Liu, Z., Luo, P., Qiu, S., Wang, X. and Tang, X. (2016) DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, 27-30 June 2016, 1096-1104.
[6] Lin, T.Y., et al. (2017) Feature Pyramid Networks for Object Detection. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, 21-26 July 2017, 2117-2125.
[7] Hu, J., Shen, L. and Sun, G. (2018) Squeeze-and-Excitation Networks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 18-23 June 2018, 7132-7141.
[8] Zhu, Z. (2025) Research on Medical Image Classification Based on Multi-Scale Feature Fusion. Master's Thesis, Guangxi University, Nanning. (In Chinese)
[9] Chen, C. and Qi, F. (2019) A Survey of the Development of Convolutional Neural Networks and Their Applications in Computer Vision. Computer Science, 46, 63-73. (In Chinese)
[10] Simonyan, K. and Zisserman, A. (2014) Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv: 1409.1556.
[11] Howard, A.G., et al. (2017) MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv: 1704.04861.
[12] Zhang, X., Zhou, X., Lin, M. and Sun, J. (2018) ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 18-23 June 2018, 6848-6856.
[13] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., et al. (2015) Going Deeper with Convolutions. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, 7-12 June 2015, 1-9.
[14] Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F. and Adam, H. (2018) Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In: Ferrari, V., Hebert, M., Sminchisescu, C. and Weiss, Y., Eds., Computer Vision – ECCV 2018, Springer, 801-818.
[15] Woo, S., Park, J., Lee, J.Y. and Kweon, I.S. (2018) CBAM: Convolutional Block Attention Module. In: Ferrari, V., Hebert, M., Sminchisescu, C. and Weiss, Y., Eds., Computer Vision – ECCV 2018, Springer, 3-19.
[16] Wang, Q., et al. (2020) ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 13-19 June 2020, 11531-11539.
[17] Szegedy, C., Ioffe, S., Vanhoucke, V. and Alemi, A. (2017) Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 31, 4278-4284.
[18] He, K., Zhang, X., Ren, S. and Sun, J. (2016) Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, 27-30 June 2016, 770-778.
[19] Howard, A., Sandler, M., Chu, G., et al. (2019) Searching for MobileNetV3. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, 27 October-2 November 2019, 1314-1324.
[20] Tan, M. and Le, Q.V. (2019) EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. International Conference on Machine Learning (ICML). arXiv: 1905.11946.