Automatic Extraction of Patent Effect Terms Based on Large Language Models and Chain-of-Thought Mechanisms
Abstract: Patent effect terms are the basic semantic units for building technology-effect matrices and technology-effect maps, and they are important inputs to technology intelligence analysis and innovation decision-making. Patent texts are highly specialized and structurally complex, and their effect descriptions are scattered throughout the document, so existing automatic extraction methods remain limited in accuracy and controllability. Large language models handle long texts well, but relying only on their implicit reasoning does not reliably identify the key effects tied to the core technology. This paper therefore reframes patent effect term extraction as a generative task with explicit reasoning constraints and focuses on the role of the chain-of-thought mechanism. It compares three reasoning-guidance strategies: no explicit reasoning, generic step-by-step reasoning, and a manually designed chain of thought that encodes the cognitive logic of patent analysis experts. Experimental results indicate that a chain-of-thought design combining expert cognitive logic with causal constraints significantly improves extraction quality and stability, providing a reliable data foundation for technology-effect map construction and innovation decision support.
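To make the three reasoning-guidance strategies concrete, the following is a minimal sketch of how such prompts might be assembled and how a model's answer might be parsed. The prompt wording, the step list in the expert template, the `Effects:` output convention, and the names `build_prompt` and `parse_effects` are all illustrative assumptions; the abstract does not give the prompts actually used in the study, and no particular model API is implied.

```python
from typing import List

# All prompt wording below is hypothetical; the abstract does not
# reproduce the prompts used in the study.

# Strategy 1: no explicit reasoning.
DIRECT = (
    "Extract the effect terms of the patent below.\n"
    "Answer on one line as 'Effects: term1; term2'.\n\n"
    "Patent text:\n{text}"
)

# Strategy 2: generic step-by-step reasoning.
GENERIC_COT = (
    "Extract the effect terms of the patent below.\n"
    "Think step by step, then answer on a final line as "
    "'Effects: term1; term2'.\n\n"
    "Patent text:\n{text}"
)

# Strategy 3: a manually designed chain of thought that imitates an
# analyst's reading order and adds a causal filter (assumed steps).
EXPERT_COT = (
    "You are a patent analyst. Work through these steps:\n"
    "1. State the technical problem the patent addresses.\n"
    "2. Identify the core technical means that solve it.\n"
    "3. List the effects the text attributes to those means.\n"
    "4. Keep only effects that are causally produced by the core "
    "technical means.\n"
    "Finish with one line: 'Effects: term1; term2'.\n\n"
    "Patent text:\n{text}"
)

STRATEGIES = {
    "direct": DIRECT,
    "generic_cot": GENERIC_COT,
    "expert_cot": EXPERT_COT,
}


def build_prompt(strategy: str, patent_text: str) -> str:
    """Fill the chosen template with the patent text."""
    return STRATEGIES[strategy].format(text=patent_text)


def parse_effects(model_output: str) -> List[str]:
    """Return the terms on the last 'Effects:' line, or [] if none exists."""
    for line in reversed(model_output.strip().splitlines()):
        if line.strip().lower().startswith("effects:"):
            terms = line.split(":", 1)[1].split(";")
            return [t.strip() for t in terms if t.strip()]
    return []
```

Keeping the output convention identical across the three templates means any difference in the extracted terms can be attributed to the reasoning guidance rather than to the answer format.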