IEEE Conference on Applications of Digital Information and Web Technologies

Comparison between document-based, term-based and hybrid partitioning

Authors:
A. Abusukhon, M. P. Oakes, M. Talib and A. M. Abdalla

Keywords:
query processing; document handling; information retrieval systems

Abstract:
Information retrieval (IR) systems for large-scale data collections must build an index in order to provide efficient retrieval that meets the user's needs. In distributed IR systems, query response time is affected by the way in which the data collection is partitioned across nodes. There are three types of collection partitioning: document-based partitioning (called the local index), term-based partitioning (called the global index) and hybrid partitioning. In this paper, we compare the three types of partitioning in terms of average query response time for a system with one broker and six other nodes. Our results showed that, within our distributed IR system, document-based and hybrid partitioning outperformed term-based partitioning. However, unlike Xi et al. [14], we did not find that hybrid partitioning was any better than document-based partitioning in terms of average query response time.
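
The difference between the local and the global index can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the round-robin assignment of documents and terms to nodes, and the toy collection are all assumptions made for illustration. Hybrid partitioning is left out because the abstract does not say how it is constructed.

```python
from collections import defaultdict


def build_index(docs):
    """Build an inverted index mapping each term to a sorted list of doc ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def document_partition(docs, num_nodes):
    """Local index: each node indexes only its own subset of the documents.

    Round-robin assignment of documents is an assumption of this sketch.
    """
    shards = [dict() for _ in range(num_nodes)]
    for i, (doc_id, text) in enumerate(sorted(docs.items())):
        shards[i % num_nodes][doc_id] = text
    return [build_index(shard) for shard in shards]


def term_partition(docs, num_nodes):
    """Global index: one index over all documents, split by term across nodes.

    Round-robin assignment of terms is an assumption of this sketch.
    """
    global_index = build_index(docs)
    nodes = [dict() for _ in range(num_nodes)]
    for i, term in enumerate(sorted(global_index)):
        nodes[i % num_nodes][term] = global_index[term]
    return nodes


def query_document_partitioned(nodes, term):
    """The broker asks every node and merges the partial posting lists."""
    result = []
    for node in nodes:
        result.extend(node.get(term, []))
    return sorted(result)


def query_term_partitioned(nodes, term):
    """The broker only needs the single node that holds the term."""
    for node in nodes:
        if term in node:
            return node[term]
    return []


# Hypothetical toy collection, two nodes instead of the paper's six.
docs = {1: "distributed retrieval", 2: "query retrieval", 3: "index partitioning"}
local_nodes = document_partition(docs, 2)
global_nodes = term_partition(docs, 2)
```

Both layouts return the same posting list for a term. They differ in how many nodes the broker has to contact and how much merging it performs, which is the kind of cost that average query response time reflects.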
