Open Source Project Recommendation Method Based on Bipartite Network Representation Learning
DOI: 10.12677/CSA.2022.121007. Supported by the National Natural Science Foundation of China.
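Authors: Haiming Lin (林海铭), Chunqi Tian (田春岐): Department of Computer Science and Technology, Tongji University, Shanghai; Wei Wang (王伟): School of Data Science and Engineering, East China Normal University, Shanghai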
Keywords: Open Source Projects; Project Recommendation; Bipartite Network; Representation Learning
Abstract: As the open source ecosystem grows rapidly, more and more developers are joining open source communities and building projects under an open, shared, and collaborative development model. Code hosting platforms such as GitHub host a vast number of open source projects across many technical fields, and developers can search by keyword for topics or project names of interest in order to explore, join, or reuse code. However, developers often have to spend considerable time and effort before they can describe their own interests and the characteristics of a target project accurately enough in keywords to find a project that meets their needs. To address this problem, this paper proposes an open source project recommendation method based on bipartite network representation learning. The method takes into account the different ways in which developers participate when contributing to projects, constructs an open source contribution bipartite network based on an activity model, and integrates the explicit and implicit relationships between developers and projects to learn developers' interests and technical preferences and to make personalized project recommendations. Experimental results show that the accuracy of the proposed method exceeds 42.37% when recommending 20 candidate projects, indicating that it can effectively recommend projects that developers are interested in.
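The pipeline outlined in the abstract (activity-weighted developer-project edges, then explicit and implicit relations, then a top-20 list) can be illustrated with a small Python sketch. It is not the authors' implementation: the event types and their weights in `EVENT_WEIGHTS` are hypothetical, since the abstract does not specify the activity model, and a simple shared-project heuristic stands in for the representation learning step. The function names `build_bipartite` and `recommend` are likewise assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical weights per participation type (assumption, not from the paper).
EVENT_WEIGHTS = {"star": 1.0, "issue": 2.0, "pull_request": 3.0}


def build_bipartite(events):
    """Build a weighted developer -> project adjacency map.

    events: iterable of (developer, project, event_type) tuples.
    The edge weight is the sum of the weights of all events on that pair.
    """
    weights = defaultdict(lambda: defaultdict(float))
    for dev, proj, kind in events:
        weights[dev][proj] += EVENT_WEIGHTS[kind]
    return {dev: dict(projs) for dev, projs in weights.items()}


def recommend(graph, dev, k=20):
    """Rank projects the developer has not yet contributed to.

    Explicit relation: the developer's own weighted edges.
    Implicit relation (stand-in heuristic): developers who share projects
    with `dev` are treated as similar, and their other projects are scored.
    """
    own = graph.get(dev, {})

    # Similarity to every other developer through shared projects.
    similarity = defaultdict(float)
    for other, projs in graph.items():
        if other == dev:
            continue
        for proj, weight in projs.items():
            if proj in own:
                similarity[other] += own[proj] * weight

    # Score unseen projects by the similarity-weighted activity of neighbours.
    scores = defaultdict(float)
    for other, sim in similarity.items():
        for proj, weight in graph[other].items():
            if proj not in own:
                scores[proj] += sim * weight

    # Highest score first; ties are broken by project name for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [proj for proj, _ in ranked[:k]]


if __name__ == "__main__":
    sample_events = [
        ("alice", "p1", "star"),
        ("alice", "p1", "pull_request"),
        ("bob", "p1", "issue"),
        ("bob", "p2", "pull_request"),
        ("dave", "p1", "star"),
        ("dave", "p3", "issue"),
    ]
    network = build_bipartite(sample_events)
    print(recommend(network, "alice", k=20))
```

A learned embedding would replace the similarity and scoring loops; the weighted adjacency map is the part that corresponds most directly to the contribution network described in the abstract.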
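Citation: Lin, H., Tian, C. and Wang, W. (2022) Open Source Project Recommendation Method Based on Bipartite Network Representation Learning. Computer Science and Application, 12(1), 54-62. https://doi.org/10.12677/CSA.2022.121007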
