Research on Structured Analysis Method of E-Commerce User Reviews Based on Large Language Model
DOI: 10.12677/ecl.2026.153357
Author: Xiong Dan, State Key Laboratory of Public Big Data, Guizhou University, Guiyang, Guizhou
Keywords: E-Commerce Review Analysis, Large Language Model, Semantic Representation, BERT Model
Abstract: Mining e-commerce review data has practical value for improving operational efficiency and user satisfaction. Large volumes of e-commerce reviews are colloquial and fragmented; traditional keyword-matching methods capture too little of their semantics, and topic models produce results that are hard to interpret. To address these problems, this paper proposes a structured analysis method in which a pre-trained language model (PLM) and a large language model (LLM) work in close coordination, forming an automated pipeline from unstructured text to structured decision support. The method extracts semantic representations with BERT, reduces their dimensionality with principal component analysis (PCA), and applies K-Means clustering to obtain representative samples for each topic. These samples are then embedded as few-shot examples in the prompt given to the DeepSeek model, so that the LLM adapts to the way e-commerce reviews are phrased and the accuracy of issue extraction and structured output improves. Experiments on an Amazon product review dataset show that the collaborative method outperforms traditional baseline methods in clustering quality, issue extraction precision, and structured output quality. Its output can directly provide data support for e-commerce operational decisions and has good practical value.
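The clustering and prompt-construction steps summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the review embeddings have already been produced by BERT and reduced by PCA, and it stands in for them with hand-written 2-D vectors. The function names, the evenly spaced centroid seeding, and the prompt wording are all hypothetical choices made for illustration; the call to the DeepSeek model is left out.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans(points: List[Vector], k: int, max_iter: int = 50) -> Tuple[List[int], List[List[float]]]:
    """Plain Lloyd's algorithm.

    Assumption: centroids are seeded from evenly spaced input points so the
    sketch is deterministic; a real pipeline would more likely use k-means++.
    """
    n = len(points)
    centroids = [list(points[i * n // k]) for i in range(k)]
    labels = [0] * n
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c])) for p in points]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append(mean_vector(members) if members else centroids[c])
        if updated == centroids:  # converged: centroids no longer move
            break
        centroids = updated
    return labels, centroids


def representatives(texts: List[str], points: List[Vector], labels: List[int],
                    centroids: List[List[float]], per_cluster: int = 1) -> Dict[int, List[str]]:
    """For each cluster, return the reviews whose vectors lie closest to the centroid."""
    reps: Dict[int, List[str]] = {}
    for c, centroid in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        members.sort(key=lambda i: squared_distance(points[i], centroid))
        reps[c] = [texts[i] for i in members[:per_cluster]]
    return reps


def build_few_shot_prompt(reps: Dict[int, List[str]], new_review: str) -> str:
    """Assemble a prompt that shows the representative reviews before the new one.

    The instruction text and output keys are hypothetical, not taken from the paper.
    """
    lines = [
        'Extract the product issues from the review as JSON with keys "aspect" and "issue".',
        "",
        "Representative reviews by topic:",
    ]
    for c in sorted(reps):
        lines.extend(f"- [topic {c}] {text}" for text in reps[c])
    lines.extend(["", f"Review: {new_review}", "JSON:"])
    return "\n".join(lines)


# Toy data: the vectors stand in for PCA-reduced BERT embeddings.
texts = [
    "Battery dies after two hours", "Battery drains very fast", "Battery life is awful",
    "Package arrived crushed", "Box was damaged on delivery", "Shipping box torn open",
]
points = [(0.9, 0.1), (1.0, 0.2), (1.4, 0.0), (0.1, 1.0), (0.0, 0.8), (0.2, 1.5)]

labels, centroids = kmeans(points, k=2)
reps = representatives(texts, points, labels, centroids)
print(build_few_shot_prompt(reps, "Charger stopped working and the box was dented"))
```

Choosing the reviews nearest each centroid is one simple way to define "representative"; the paper may use a different selection rule, and the resulting prompt would be sent to the LLM by whatever client the deployment uses.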
Citation: Xiong Dan. Research on Structured Analysis Method of E-Commerce User Reviews Based on Large Language Model [J]. E-Commerce Letters, 2026, 15(3): 970-979. https://doi.org/10.12677/ecl.2026.153357
