Fine-Grained Literature Retrieval Based on Structuring Keywords
DOI: 10.12677/CSA.2019.98167
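Authors: Wei Yu, Wenjun Qi: Jiangsu Frontier Electric Technology Co., Ltd., Nanjing, Jiangsu; Peng Dai*, Jie Zhang: School of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu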
Keywords: Document Retrieval, Knowledge Base, Dependency Modeling, Fine-Grained
Abstract: Intelligent retrieval requires recognizing the intent behind a user's query. In scientific literature retrieval, a user's latent query intent can be divided into two kinds: problem-oriented and method-oriented. Taking problem and method as intent templates, this paper proposes a method that structures query keywords using entity information and then parses and matches the query intent. Specifically, named entity recognition is used to extract entities and their types, which express the query intent, and a Markov random field graphical model is used to model the joint probability of the query, the query entities, and the document for matching. Experimental results show that structuring keywords effectively models the user's query intent from these two perspectives, yielding more accurate, fine-grained retrieval results for different query intents.
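The matching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the hand-written entity annotations (standing in for a trained named entity recognizer), the weights, the smoothing constants, and the function names `smoothed_log_prob`, `score`, and `rank` are all assumptions made for illustration. The score is a weighted sum of log potentials over (term, document) and (typed entity, document) pairs, which is the usual log-linear form of a Markov random field ranking function.

```python
import math

# Toy corpus (hypothetical). Each document carries entity annotations
# (phrase -> type) that a trained NER model would normally supply.
DOCS = {
    "A": {"text": "we apply a neural network to image classification",
          "entities": {"neural network": "METHOD",
                       "image classification": "PROBLEM"}},
    "B": {"text": "we compress a neural network using weight pruning",
          "entities": {"neural network": "PROBLEM",
                       "weight pruning": "METHOD"}},
}

def smoothed_log_prob(count, length, background=0.01, mu=2.0):
    """Log of a Dirichlet-smoothed estimate; background is an assumed constant > 0."""
    return math.log((count + mu * background) / (length + mu))

def score(query_terms, query_entities, doc, w_term=0.5, w_entity=0.5):
    """Weighted sum of log potentials for term and typed-entity evidence."""
    tokens = doc["text"].split()
    term_part = sum(
        smoothed_log_prob(tokens.count(term), len(tokens))
        for term in query_terms)
    entity_part = sum(
        # An entity counts as matched only if the document uses it in the same role.
        smoothed_log_prob(1 if doc["entities"].get(phrase) == etype else 0,
                          len(doc["entities"]))
        for phrase, etype in query_entities)
    return w_term * term_part + w_entity * entity_part

def rank(query_terms, query_entities):
    """Return document ids ordered from highest to lowest score."""
    return sorted(DOCS,
                  key=lambda d: score(query_terms, query_entities, DOCS[d]),
                  reverse=True)

# Same keywords, different intent: method-oriented vs. problem-oriented.
method_ranking = rank(["neural", "network"], [("neural network", "METHOD")])
problem_ranking = rank(["neural", "network"], [("neural network", "PROBLEM")])
```

In this sketch both documents match the plain keywords equally well, so the ordering is decided only by the typed-entity term: the method-oriented query prefers document A, and the problem-oriented query prefers document B.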
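Cite this article: Yu, W., Dai, P., Zhang, J. and Qi, W.J. (2019) Fine-Grained Literature Retrieval Based on Structuring Keywords. Computer Science and Application, 9(8), 1489-1499. https://doi.org/10.12677/CSA.2019.98167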
