A Python Code Extraction and Analysis Method: Theoretical and Empirical Research
Abstract: With the rapid development of artificial intelligence and big data technologies, the automated extraction and visual analysis of code knowledge structures has become an important direction for program comprehension, teaching assistance, and intelligent software development. Python is one of the most widely used programming languages in software engineering and education, and its code covers diverse types of knowledge points with complex dependency relationships, so traditional manual analysis is inefficient and unsystematic. To address inaccurate knowledge point identification, code structures that are hard to present intuitively, and limited knowledge reuse, this paper proposes a method for extracting and analyzing knowledge from Python code. The method first constructs an expert knowledge base that maps key Python knowledge points to their corresponding instructions. It then parses the code into an Abstract Syntax Tree (AST) and matches syntactic structures against the knowledge base level by level. A composite weighting model that combines occurrence frequency, call depth, and dependency relations evaluates the importance of each knowledge point, and a "knowledge point, key instruction, relation type" triple structure records the associations among them. On top of this representation, tools such as NetworkX and Pyecharts render network topology diagrams, interactive knowledge graphs, and word clouds, so that the knowledge structure of the code is presented intuitively and systematically. Experimental results show that the method effectively reveals the distribution patterns and logical associations of knowledge points in code, providing support for programming education, textbook analysis, and intelligent code analysis systems. The study verifies the feasibility of building a visualization system for Python code knowledge points and offers new ideas for teaching management, learning path planning, and intelligent programming assistance tools.
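The pipeline described in the abstract (knowledge base lookup, AST matching, weighting, triples) can be illustrated with a minimal sketch. This is not the authors' implementation. The contents of `KNOWLEDGE_BASE`, the relation names `uses` and `contains`, the use of AST nesting depth as a stand-in for call depth, and the linear weights `alpha`, `beta`, `gamma` are all assumptions made for illustration, since the abstract does not specify them.

```python
import ast
from collections import defaultdict

# Hypothetical expert knowledge base: AST node type -> (knowledge point, key instruction).
KNOWLEDGE_BASE = {
    ast.FunctionDef: ("function definition", "def"),
    ast.For: ("loop", "for"),
    ast.While: ("loop", "while"),
    ast.If: ("conditional branch", "if"),
    ast.Import: ("module import", "import"),
    ast.Call: ("function call", "call"),
}


def extract(source):
    """Walk the AST of `source`; return per-knowledge-point stats and triples."""
    stats = defaultdict(lambda: {"freq": 0, "depth": 0, "deps": set()})
    triples = set()  # (knowledge point, key instruction, relation type)

    def visit(node, depth, parent_point):
        point = parent_point
        entry = KNOWLEDGE_BASE.get(type(node))
        if entry is not None:
            point, instruction = entry
            stats[point]["freq"] += 1
            # Assumption: nesting depth in the AST approximates "call depth".
            stats[point]["depth"] = max(stats[point]["depth"], depth)
            triples.add((point, instruction, "uses"))
            if parent_point is not None and parent_point != point:
                # The enclosing knowledge point depends on the nested one.
                stats[parent_point]["deps"].add(point)
                triples.add((parent_point, instruction, "contains"))
        for child in ast.iter_child_nodes(node):
            visit(child, depth + 1, point)

    visit(ast.parse(source), 0, None)
    return stats, triples


def weights(stats, alpha=0.5, beta=0.3, gamma=0.2):
    """Assumed linear mix of frequency, depth, and dependency count."""
    return {
        point: alpha * s["freq"] + beta * s["depth"] + gamma * len(s["deps"])
        for point, s in stats.items()
    }


SAMPLE = """
import math

def area(radii):
    total = 0
    for r in radii:
        if r > 0:
            total += math.pi * r ** 2
    return total
"""

stats, triples = extract(SAMPLE)
# Print knowledge points from most to least important under the assumed weights.
for point, score in sorted(weights(stats).items(), key=lambda kv: -kv[1]):
    print(f"{point}: {score:.2f}")
```

Each triple could then be loaded into a NetworkX graph as an edge between a knowledge point and an instruction, with the relation type stored as an edge attribute, before being passed to a layout or rendering step.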
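Cite this article: Huang, J.L. and Deng, H.S. (2026) A Python Code Extraction and Analysis Method: Theoretical and Empirical Research. Computer Science and Application, 16(1), 268-280. https://doi.org/10.12677/csa.2026.161022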
