Analysis of Microblog Topics of Hangzhou Asian Games Based on LDA Topic Model
DOI: 10.12677/SA.2023.124087
Authors: Dong Shaoqi (董韶琦), Zheng Jing (郑静): School of Economics, Hangzhou Dianzi University, Hangzhou, Zhejiang
Keywords: Hangzhou Asian Games, Weibo, LDA Model
Abstract: To explore the structure and content of new-media communication during the warm-up stage of the Hangzhou Asian Games, and to help relevant departments supervise and guide public opinion more efficiently, this paper takes the novel step of building an LDA topic model of Asian Games communication content. Posts related to the Hangzhou Asian Games were crawled from Sina Weibo and used to construct a Latent Dirichlet Allocation (LDA) model of the text; perplexity was used as the evaluation metric to determine the optimal number of topics, and framing theory and context theory were then applied to mine the meaning of the texts in terms of structure and content, respectively. The results show that warm-up communication mainly revolves around six frames: entertainment promotion, participants, infrastructure, opening and closing ceremonies, market cooperation, and competitive events, reflecting public affirmation of and anticipation for Hangzhou hosting the Games. The Asian Games have also given a certain boost to China's economy, particularly Hangzhou's, and have created opportunities for domestic and international economic cooperation.
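The perplexity-based choice of topic number mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common definition of corpus perplexity, exp(-(sum of per-document log-likelihoods) / (total token count)), and picks the candidate topic count with the lowest value. The function names `perplexity` and `pick_num_topics` are hypothetical, and the numbers are toy values invented for illustration, not results from the paper; the authors' actual tooling is not stated here.

```python
import math


def perplexity(doc_log_likelihoods, doc_lengths):
    """Corpus perplexity: exp(-(sum of log p(w_d)) / (total number of tokens))."""
    total_tokens = sum(doc_lengths)
    if total_tokens == 0:
        raise ValueError("corpus has no tokens")
    return math.exp(-sum(doc_log_likelihoods) / total_tokens)


def pick_num_topics(candidates):
    """Return the topic count with the lowest perplexity, plus all scores.

    `candidates` maps a topic count K to a pair
    (per-document log-likelihoods, per-document token counts).
    """
    scores = {k: perplexity(ll, n) for k, (ll, n) in candidates.items()}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Toy, made-up held-out log-likelihoods for two documents of 10 and 20 tokens.
# In practice these would come from a fitted LDA model for each K.
toy_candidates = {
    2: ([-70.0, -140.0], [10, 20]),
    3: ([-60.0, -120.0], [10, 20]),
    4: ([-65.0, -130.0], [10, 20]),
}

best_k, scores = pick_num_topics(toy_candidates)
print(best_k, scores)
```

Lower perplexity means the model assigns higher probability to held-out text, so the minimum over candidate values of K is a natural selection rule; in practice the curve is often inspected for an elbow rather than taken at its strict minimum.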
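Citation: Dong Shaoqi, Zheng Jing. Analysis of Microblog Topics of Hangzhou Asian Games Based on LDA Topic Model [J]. Statistics and Application (统计学与应用), 2023, 12(4): 833-842. https://doi.org/10.12677/SA.2023.124087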
